One way to improve classification performance is to combine classifiers. The simplest way to combine multiple classifiers is voting: each classifier predicts a label, and whichever label gets the most votes wins. For this style of voting, it's best to have an odd number of classifiers so that, with two labels, there can be no ties. This means combining at least three classifiers together. The individual classifiers should also use different algorithms; the idea is that multiple algorithms are better than one, and the combination of many can compensate for individual bias. However, combining a poorly performing classifier with better performing classifiers is generally not a good idea, because the poor performance of one classifier can bring the total accuracy down.
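To make the tie problem concrete, here is a small illustration with made-up votes (the labels and vote counts are hypothetical, not from any trained classifier):

from collections import Counter

# Four voters split evenly between two labels -- majority voting cannot
# pick a principled winner.
votes = Counter(['pos', 'neg', 'pos', 'neg'])
print(votes.most_common(1))  # [('pos', 2)], but only because 'pos' was counted first

# With three voters and two labels, one label always gets at least two
# votes, so a tie is impossible.
votes = Counter(['pos', 'neg', 'pos'])
print(votes.most_common(1))  # [('pos', 2)]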
As we need to have at least three trained classifiers to combine, we are going to use a NaiveBayesClassifier class, a DecisionTreeClassifier class, and a MaxentClassifier class, all trained on the highest information words of the movie_reviews corpus. These were all trained in the previous recipe; the example below also includes sk_classifier, the best performing sklearn classifier trained earlier, for a total of four votes (with an even number of voters, any tie is broken arbitrarily by FreqDist.max()). We will combine these classifiers with voting.
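For reference, here is a rough sketch of the setup this recipe assumes. The exact training options come from the previous recipes and are omitted here; the bare train() calls and the choice of LinearSVC for sk_classifier are illustrative assumptions, and train_feats stands for the high information training feature sets:

from nltk.classify import NaiveBayesClassifier, DecisionTreeClassifier, MaxentClassifier
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.svm import LinearSVC

# All four classifiers are trained on the same high information feature sets.
nb_classifier = NaiveBayesClassifier.train(train_feats)
dt_classifier = DecisionTreeClassifier.train(train_feats)
me_classifier = MaxentClassifier.train(train_feats, trace=0)
sk_classifier = SklearnClassifier(LinearSVC()).train(train_feats)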
In the classification.py module, there is a MaxVoteClassifier class:
import itertools
from nltk.classify import ClassifierI
from nltk.probability import FreqDist

class MaxVoteClassifier(ClassifierI):
    def __init__(self, *classifiers):
        self._classifiers = classifiers
        # The label set is the union of every classifier's labels.
        self._labels = sorted(set(itertools.chain(*[c.labels() for c in classifiers])))

    def labels(self):
        return self._labels

    def classify(self, feats):
        # Tally one vote per classifier, then return the majority label.
        counts = FreqDist()
        for classifier in self._classifiers:
            counts[classifier.classify(feats)] += 1
        return counts.max()
To create it, you pass in the classifiers you want to combine as individual arguments. Once created, it works just like any other classifier. Classification takes roughly as long as running every included classifier in sequence, but the result should generally be at least as accurate as any individual classifier.
>>> from classification import MaxVoteClassifier
>>> mv_classifier = MaxVoteClassifier(nb_classifier, dt_classifier, me_classifier, sk_classifier)
>>> mv_classifier.labels()
['neg', 'pos']
>>> accuracy(mv_classifier, test_feats)
0.894
>>> mv_precisions, mv_recalls = precision_recall(mv_classifier, test_feats)
>>> mv_precisions['pos']
0.9156118143459916
>>> mv_precisions['neg']
0.8745247148288974
>>> mv_recalls['pos']
0.868
>>> mv_recalls['neg']
0.92
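Here, accuracy() is nltk.classify.util.accuracy, and precision_recall() is the helper defined in the earlier precision and recall recipe. As a reminder of what such a helper might look like, here is a minimal sketch built on the set-based scoring functions in nltk.metrics (the exact helper defined earlier may differ slightly):

import collections
from nltk.metrics import precision, recall

def precision_recall(classifier, testfeats):
    # Build reference (gold) and test (predicted) index sets per label.
    refsets = collections.defaultdict(set)
    testsets = collections.defaultdict(set)
    for i, (feats, label) in enumerate(testfeats):
        refsets[label].add(i)
        testsets[classifier.classify(feats)].add(i)
    # Compare each label's reference set against its predicted set.
    precisions = {label: precision(refsets[label], testsets[label]) for label in refsets}
    recalls = {label: recall(refsets[label], testsets[label]) for label in refsets}
    return precisions, recalls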
These metrics are about on par with the best sklearn classifiers, as well as the MaxentClassifier and NaiveBayesClassifier classes with high information features. Some numbers are slightly better, some worse. It's likely that a significant improvement to the DecisionTreeClassifier class could produce better numbers.
The MaxVoteClassifier class extends the nltk.classify.ClassifierI interface, which requires the implementation of at least two methods:

- The labels() method must return a list of possible labels. This will be the union of the labels() returned by each classifier passed in at initialization.
- The classify() method takes a single feature set and returns a label. The MaxVoteClassifier class iterates over its classifiers, calls classify() on each of them, and records each result as a vote in a FreqDist variable. The label with the most votes is returned using FreqDist.max() (see the sketch after this list).

(Inheritance diagram: MaxVoteClassifier extends nltk.classify.ClassifierI.)
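To see the vote tally in isolation, here is the same FreqDist pattern used by classify(), run on hypothetical labels standing in for classifier outputs:

from nltk.probability import FreqDist

votes = FreqDist()
# Hypothetical outputs from three classifiers: two vote 'pos', one votes 'neg'.
for predicted_label in ['pos', 'neg', 'pos']:
    votes[predicted_label] += 1
print(votes.max())  # 'pos' -- the majority label wins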
While it doesn't check for this, the MaxVoteClassifier class assumes that all the classifiers passed in at initialization use the same labels. Breaking this assumption may lead to odd behavior.
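If you want to guard against mismatched labels, one option is to verify them at initialization. The following subclass is a hypothetical variant, not part of the book's classification.py:

class CheckedMaxVoteClassifier(MaxVoteClassifier):
    def __init__(self, *classifiers):
        # Hypothetical safeguard: refuse to combine classifiers whose
        # label sets differ, rather than silently skewing the vote.
        # Assumes at least one classifier is passed in.
        label_sets = [set(c.labels()) for c in classifiers]
        if any(labels != label_sets[0] for labels in label_sets):
            raise ValueError('all classifiers must use the same labels')
        super().__init__(*classifiers)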
In the previous recipe, we trained a NaiveBayesClassifier class, a MaxentClassifier class, and a DecisionTreeClassifier class using only the highest information words. In the next recipe, we will use the reuters corpus and combine many binary classifiers in order to create a multi-label classifier.