nltk.PerceptronTagger

class nltk.PerceptronTagger(load=True)[source]

Greedy Averaged Perceptron tagger, as implemented by Matthew Honnibal.

>>> from nltk.tag.perceptron import PerceptronTagger

Train the model

>>> tagger = PerceptronTagger(load=False)
>>> tagger.train([[('today','NN'),('is','VBZ'),('good','JJ'),('day','NN')],
... [('yes','NNS'),('it','PRP'),('beautiful','JJ')]])
>>> tagger.tag(['today','is','a','beautiful','day'])
[('today', 'NN'), ('is', 'PRP'), ('a', 'PRP'), ('beautiful', 'JJ'), ('day', 'NN')]

Use the pretrained model (loaded by the default constructor)

>>> pretrain = PerceptronTagger()
>>> pretrain.tag('The quick brown fox jumps over the lazy dog'.split())
[('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'VBZ'), ('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN')]
>>> pretrain.tag("The red cat".split())
[('The', 'DT'), ('red', 'JJ'), ('cat', 'NN')]
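tag() handles one tokenized sentence at a time; tag_sents() batches the call over several sentences. A minimal sketch, training on the toy corpus from above so that no pretrained pickle has to be downloaded:

```python
from nltk.tag.perceptron import PerceptronTagger

# Train on the toy corpus so no pretrained model download is needed.
tagger = PerceptronTagger(load=False)
tagger.train([[('today', 'NN'), ('is', 'VBZ'), ('good', 'JJ'), ('day', 'NN')],
              [('yes', 'NNS'), ('it', 'PRP'), ('beautiful', 'JJ')]])

# tag_sents applies self.tag() to each tokenized sentence in turn.
tagged = tagger.tag_sents([['today', 'is', 'good'],
                           ['it', 'is', 'beautiful']])
```

Each element of the result is a list of (word, tag) pairs, one list per input sentence; on a corpus this small the predicted tags themselves are not reliable.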

Methods

__init__([load]) Load the pickled model upon instantiation if load is True.
evaluate(gold) Score the accuracy of the tagger against the gold standard.
load(loc) Load a pickled model from location loc.
normalize(word) Normalization used in pre-processing.
tag(tokens) Tag a tokenized sentence.
tag_sents(sentences) Apply self.tag() to each element of sentences.
train(sentences[, save_loc, nr_iter]) Train a model from sentences, and save it at save_loc.
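train()'s save_loc argument and the load() method can round-trip a model through a pickle file. A sketch under the assumption, matching this page's pickle-based version of the tagger, that load() resolves locations through nltk.data.load, so a local file is addressed with a 'file:' prefix:

```python
import os
import tempfile

from nltk.tag.perceptron import PerceptronTagger

corpus = [[('today', 'NN'), ('is', 'VBZ'), ('good', 'JJ'), ('day', 'NN')],
          [('yes', 'NNS'), ('it', 'PRP'), ('beautiful', 'JJ')]]

# train() pickles the (weights, tagdict, classes) triple to save_loc.
path = os.path.join(tempfile.mkdtemp(), 'toy_tagger.pickle')
tagger = PerceptronTagger(load=False)
tagger.train(corpus, save_loc=path)

# A fresh tagger restores the same model from the saved pickle.
restored = PerceptronTagger(load=False)
restored.load('file:' + path)
tagged = restored.tag(['today', 'is', 'good'])
```

The restored tagger behaves identically to the one that was trained, since both hold the same weights, tag dictionary, and class set.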

Attributes

END Sentinel tokens appended to each sentence during feature extraction ('-END-', '-END2-').
START Sentinel tokens prepended to each sentence during feature extraction ('-START-', '-START2-').
unicode_repr() <==> repr(x)