The BrillTagger class is a transformation-based tagger. It is the first tagger that is not a subclass of SequentialBackoffTagger. Instead, the BrillTagger class uses a series of rules to correct the results of an initial tagger. These rules are scored based on how many errors they correct minus the number of new errors they produce.
Here's a function from tag_util.py that trains a BrillTagger class using BrillTaggerTrainer. It requires an initial_tagger and train_sents.
from nltk.tag import brill, brill_trainer

def train_brill_tagger(initial_tagger, train_sents, **kwargs):
    templates = [
        brill.Template(brill.Pos([-1])),
        brill.Template(brill.Pos([1])),
        brill.Template(brill.Pos([-2])),
        brill.Template(brill.Pos([2])),
        brill.Template(brill.Pos([-2, -1])),
        brill.Template(brill.Pos([1, 2])),
        brill.Template(brill.Pos([-3, -2, -1])),
        brill.Template(brill.Pos([1, 2, 3])),
        brill.Template(brill.Pos([-1]), brill.Pos([1])),
        brill.Template(brill.Word([-1])),
        brill.Template(brill.Word([1])),
        brill.Template(brill.Word([-2])),
        brill.Template(brill.Word([2])),
        brill.Template(brill.Word([-2, -1])),
        brill.Template(brill.Word([1, 2])),
        brill.Template(brill.Word([-3, -2, -1])),
        brill.Template(brill.Word([1, 2, 3])),
        brill.Template(brill.Word([-1]), brill.Word([1])),
    ]

    trainer = brill_trainer.BrillTaggerTrainer(initial_tagger, templates, deterministic=True)
    return trainer.train(train_sents, **kwargs)
To use it, we can create our initial_tagger from a backoff chain of NgramTagger classes, then pass that into the train_brill_tagger() function to get a BrillTagger back.
>>> default_tagger = DefaultTagger('NN')
>>> initial_tagger = backoff_tagger(train_sents, [UnigramTagger, BigramTagger, TrigramTagger], backoff=default_tagger)
>>> initial_tagger.evaluate(test_sents)
0.8806820634578028
>>> from tag_util import train_brill_tagger
>>> brill_tagger = train_brill_tagger(initial_tagger, train_sents)
>>> brill_tagger.evaluate(test_sents)
0.8827541549751781
So, the BrillTagger class has slightly increased accuracy over the initial_tagger.
The BrillTaggerTrainer class takes an initial_tagger argument and a list of templates. These templates must implement the BrillTemplateI interface, which is found in the nltk.tbl.template module. The brill.Template class is such an implementation, and is actually imported from nltk.tbl.template. The brill.Pos and brill.Word classes are subclasses of nltk.tbl.feature.Feature, and they describe what kind of features to use in the template: in this case, one or more part-of-speech tags or words.
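The templates specify how to learn transformation rules. For example, brill.Template(brill.Pos([-1])) means that a rule can be generated using the previous part-of-speech tag. The brill.Template(brill.Pos([1])) statement means that you can look at the next part-of-speech tag to generate a rule. And brill.Template(brill.Word([-2, -1])) means you can look at the previous two words to learn a transformation rule; a rule generated from this template applies when either of those two positions holds the word in question. A template with two features, such as brill.Template(brill.Pos([-1]), brill.Pos([1])), produces rules that require both conditions to hold.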
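To make the relationship between templates and rules more concrete, here is a minimal pure-Python sketch. It is a toy illustration, not NLTK's implementation: the helper names feature_matches and apply_rule are hypothetical, and the sketch assumes that a feature listing several positions matches when any one of those positions holds the value, and that a rule with several features needs all of them to match.

```python
# Toy illustration only; this is not NLTK's implementation.
def feature_matches(tagged, index, feature, offsets, value):
    """True if any token at the given relative offsets has the value.

    feature is 'word' or 'pos'; offsets outside the sentence are skipped.
    """
    column = 0 if feature == 'word' else 1
    for offset in offsets:
        pos = index + offset
        if 0 <= pos < len(tagged) and tagged[pos][column] == value:
            return True
    return False


def apply_rule(tagged, from_tag, to_tag, conditions):
    """Retag from_tag as to_tag wherever every condition matches."""
    result = []
    for i, (word, tag) in enumerate(tagged):
        # Conditions are checked against the original tagging.
        if tag == from_tag and all(
            feature_matches(tagged, i, feature, offsets, value)
            for feature, offsets, value in conditions
        ):
            tag = to_tag
        result.append((word, tag))
    return result


# A sentence as an initial tagger might have tagged it ('run' is wrong).
sent = [('I', 'PRP'), ('want', 'VBP'), ('to', 'TO'), ('run', 'NN')]

# A rule of the shape brill.Template(brill.Pos([-1])) could produce:
# NN -> VB if the previous tag is TO.
by_prev_tag = apply_rule(sent, 'NN', 'VB', [('pos', [-1], 'TO')])

# A rule of the shape brill.Template(brill.Word([-2, -1])) could produce:
# NN -> VB if 'want' is one of the two previous words.
by_prev_words = apply_rule(sent, 'NN', 'VB', [('word', [-2, -1], 'want')])

print(by_prev_tag[-1])    # ('run', 'VB')
print(by_prev_words[-1])  # ('run', 'VB')
```

In both cases only the final token is retagged, because it is the only token tagged NN whose context satisfies the rule.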
The thinking behind a transformation-based tagger is this: given the correct training sentences, the output of the initial tagger, and the templates specifying features, try to generate transformation rules that correct the initial tagger's output to be more in line with the training sentences. The job of BrillTaggerTrainer is to produce these rules, and to do so in a way that increases accuracy. A transformation rule that fixes one problem may cause an error in another condition; thus, every rule must be measured by how many errors it corrects versus how many new errors it introduces.
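The scoring idea can be sketched in a few lines of plain Python. This is a simplified illustration under the assumption that a rule's score is the count of errors it fixes minus the count of correct tags it breaks; score_rule is a hypothetical helper, not part of NLTK.

```python
# Toy illustration only; score_rule is a hypothetical helper.
def score_rule(gold_tags, current_tags, proposed_tags):
    """Return errors fixed minus new errors introduced by a rule."""
    fixed = broken = 0
    for gold, current, proposed in zip(gold_tags, current_tags, proposed_tags):
        if current == proposed:
            continue  # the rule did not touch this token
        if proposed == gold:
            fixed += 1    # a wrong tag became correct
        elif current == gold:
            broken += 1   # a correct tag became wrong
    return fixed - broken


gold     = ['DT', 'NN', 'TO', 'VB', 'DT', 'NN']  # training data
current  = ['DT', 'NN', 'TO', 'NN', 'DT', 'NN']  # initial tagger output
proposed = ['DT', 'NN', 'TO', 'VB', 'DT', 'VB']  # after a candidate rule

print(score_rule(gold, current, proposed))  # 0: one fix, one new error
```

A rule like this one, which repairs one tag but damages another, nets a score of zero and would not be worth keeping.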
You can control the number of rules generated using the max_rules keyword argument to the BrillTaggerTrainer.train() method. The default value is 200. You can also control the quality of rules used with the min_score keyword argument. The default value is 2, though 3 can be a good choice as well. The score is a measure of how well a rule corrects errors compared to how many new errors it introduces.
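How the two limits interact can be sketched with a toy selection function. This is a deliberate simplification: select_rules is a hypothetical helper that scores each candidate once, whereas a real trainer would be expected to rescore candidates as rules are applied.

```python
# Toy illustration only; select_rules is a hypothetical helper.
def select_rules(scored_rules, max_rules=200, min_score=2):
    """Keep the best-scoring rules, honoring both limits.

    scored_rules is a list of (rule_name, score) pairs.
    """
    selected = []
    ranked = sorted(scored_rules, key=lambda pair: pair[1], reverse=True)
    for name, score in ranked:
        # Stop once enough rules are kept or the rest score too low.
        if len(selected) >= max_rules or score < min_score:
            break
        selected.append(name)
    return selected


candidates = [('r1', 7), ('r2', 1), ('r3', 3), ('r4', 2), ('r5', 5)]

print(select_rules(candidates))               # ['r1', 'r5', 'r3', 'r4']
print(select_rules(candidates, min_score=3))  # ['r1', 'r5', 'r3']
print(select_rules(candidates, max_rules=2))  # ['r1', 'r5']
```

Because train_brill_tagger() forwards its keyword arguments to train(), the same limits can be set with a call such as train_brill_tagger(initial_tagger, train_sents, max_rules=100, min_score=3), where the values 100 and 3 are only illustrative.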
You can watch the BrillTaggerTrainer class do its work by passing trace=True into the constructor, for example, trainer = brill_trainer.BrillTaggerTrainer(initial_tagger, templates, deterministic=True, trace=True). This will give you output like the following:
TBL train (fast) (seqs: 3000; tokens: 77511; tpls: 18; min score: 2; min acc: None)
Finding initial useful rules...
Found 9869 useful rules.
Selecting rules...
The first line summarizes the training run: 3000 sentences, 77511 tokens, and 18 templates. The trainer then found 9869 rules with a score of at least min_score, and it selects the best of these, keeping no more than max_rules.
The default is trace=False, which means the trainer will work silently without printing its status.