nltk.CRFTagger

class nltk.CRFTagger(feature_func=None, verbose=False, training_opt={})[source]

A part-of-speech (POS) tagger built on CRFSuite, via the python-crfsuite package: https://pypi.python.org/pypi/python-crfsuite

>>> from nltk.tag import CRFTagger
>>> ct = CRFTagger()
>>> train_data = [[('University','Noun'), ('is','Verb'), ('a','Det'), ('good','Adj'), ('place','Noun')],
... [('dog','Noun'),('eat','Verb'),('meat','Noun')]]
>>> ct.train(train_data,'model.crf.tagger')
>>> ct.tag_sents([['dog','is','good'], ['Cat','eat','meat']])
[[('dog', 'Noun'), ('is', 'Verb'), ('good', 'Adj')], [('Cat', 'Noun'), ('eat', 'Verb'), ('meat', 'Noun')]]
>>> gold_sentences = [[('dog','Noun'),('is','Verb'),('good','Adj')], [('Cat','Noun'),('eat','Verb'),('meat','Noun')]]
>>> ct.evaluate(gold_sentences)
1.0
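
To make the score above concrete, here is a minimal sketch of token-level accuracy: the fraction of tokens whose predicted tag equals the gold tag. It illustrates the idea only and is not NLTK's implementation; `token_accuracy` is a hypothetical helper, and the `predicted` data is made up so that one tag out of six is wrong.

```python
def token_accuracy(predicted_sents, gold_sents):
    """Return the fraction of tokens whose predicted tag matches the gold tag.

    Both arguments are lists of sentences, each a list of (word, tag) pairs.
    Hypothetical helper for illustration; not part of NLTK.
    """
    correct = 0
    total = 0
    for pred_sent, gold_sent in zip(predicted_sents, gold_sents):
        for (_, pred_tag), (_, gold_tag) in zip(pred_sent, gold_sent):
            if pred_tag == gold_tag:
                correct += 1
            total += 1
    # Avoid dividing by zero when there are no tokens to score.
    return correct / total if total else 0.0


gold = [[('dog', 'Noun'), ('is', 'Verb'), ('good', 'Adj')],
        [('Cat', 'Noun'), ('eat', 'Verb'), ('meat', 'Noun')]]

# Made-up predictions with one wrong tag ('eat' tagged as Noun).
predicted = [[('dog', 'Noun'), ('is', 'Verb'), ('good', 'Adj')],
             [('Cat', 'Noun'), ('eat', 'Noun'), ('meat', 'Noun')]]

score = token_accuracy(predicted, gold)  # 5 of 6 tags match
```

A score of 1.0, as in the example above, means every predicted tag matched its gold tag.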

Setting a learned model file:

>>> ct = CRFTagger()
>>> ct.set_model_file('model.crf.tagger')
>>> ct.evaluate(gold_sentences)
1.0

Methods

__init__([feature_func, verbose, training_opt]) Initialize the CRFSuite tagger. feature_func is the function that extracts features for each token of a sentence.
evaluate(gold) Score the accuracy of the tagger against the gold standard.
set_model_file(model_file) Set the learned model file used by the tagger.
tag(tokens) Tag a sentence using Python CRFSuite Tagger.
tag_sents(sents) Tag a list of sentences.
train(train_data, model_file) Train the CRF tagger using CRFSuite. train_data is the list of annotated sentences; model_file is the file the trained model is saved to.
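
The feature_func parameter of __init__ can be illustrated with a small sketch. It assumes the function is called with the token list and an index and returns a list of feature strings for that position; check that contract against the NLTK source for your version. The name `simple_features` and the feature labels it emits are hypothetical.

```python
def simple_features(tokens, idx):
    """Return a list of string features for tokens[idx].

    Hypothetical feature set for illustration; the assumed contract is
    (tokens, idx) -> list of feature strings.
    """
    token = tokens[idx]
    features = ['WORD_' + token.lower()]

    # Capitalization is a common cue for proper nouns.
    if token[:1].isupper():
        features.append('CAPITALIZED')

    # Tokens containing digits are often numbers or codes.
    if any(ch.isdigit() for ch in token):
        features.append('HAS_DIGIT')

    # Up to the last three characters as a crude suffix feature.
    features.append('SUF_' + token[-3:])

    # Mark sentence boundaries.
    if idx == 0:
        features.append('BOS')
    if idx == len(tokens) - 1:
        features.append('EOS')

    return features


first = simple_features(['Cat', 'eat', 'meat'], 0)
last = simple_features(['Cat', 'eat', 'meat'], 2)
```

Under the assumption above, such a function would be passed as CRFTagger(feature_func=simple_features) before calling train.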