nltk.BigramTagger

class nltk.BigramTagger(train=None, model=None, backoff=None, cutoff=0, verbose=False)

A tagger that chooses a token’s tag based on its word string and on the preceding word’s tag. In particular, a tuple consisting of the previous tag and the word is looked up in a table, and the corresponding tag is returned.
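
The lookup described above can be pictured as a dictionary keyed on (previous tag, word). The following is a minimal pure-Python sketch of that idea, not NLTK's actual implementation; the names `table`, `lookup_tag` and `tag_sentence`, the toy entries, and the use of `None` for "no previous tag" are all assumptions made for illustration.

```python
# Illustrative only: a toy table mapping (previous tag, word) -> tag.
# None stands in for "no previous tag" at the start of a sentence
# (an assumption of this sketch, not necessarily how NLTK encodes it).
table = {
    (None, "the"): "DT",
    ("DT", "dog"): "NN",
    ("NN", "barks"): "VBZ",
}


def lookup_tag(prev_tag, word):
    """Return the tag stored for (prev_tag, word), or None if the pair is unseen."""
    return table.get((prev_tag, word))


def tag_sentence(words):
    """Tag words left to right, feeding each predicted tag into the next lookup."""
    tagged = []
    prev_tag = None
    for word in words:
        tag = lookup_tag(prev_tag, word)
        tagged.append((word, tag))
        prev_tag = tag
    return tagged


result = tag_sentence(["the", "dog", "barks"])
print(result)
```

Because each lookup depends on the tag chosen for the previous word, a pair that is missing from the table returns `None`, which is the situation a backoff tagger is meant to handle.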

Parameters:
  • train (list(list(tuple(str, str)))) – The corpus of training data, a list of tagged sentences
  • model (dict) – The tagger model
  • backoff (TaggerI) – Another tagger which this tagger will consult when it is unable to tag a word
  • cutoff (int) – The number of instances of training data the tagger must see in order not to use the backoff tagger
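
As a rough illustration of how `train` and `backoff` fit together, the sketch below trains a tagger on a two-sentence toy corpus and falls back to `DefaultTagger`. The corpus is invented for this example, and the snippet assumes NLTK is installed; no corpus download is needed because the training data is given inline.

```python
from nltk.tag import BigramTagger, DefaultTagger

# Toy training data invented for this example: a list of tagged sentences,
# each sentence being a list of (word, tag) tuples.
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

# Consulted whenever the bigram tagger has no entry for the current context;
# it assigns the constant tag "NN".
backoff = DefaultTagger("NN")
tagger = BigramTagger(train_sents, backoff=backoff)

# "bird" never appears in the training data, so its tag comes from the backoff tagger.
tagged = tagger.tag(["the", "bird", "barks"])
print(tagged)
```

In practice the backoff is often a chain (for example a unigram tagger that itself backs off to a default tagger), so that a word seen in training but not in this particular context still gets an informed tag.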

Methods

__init__([train, model, backoff, cutoff, ...])
choose_tag(tokens, index, history)
context(tokens, index, history)
decode_json_obj(obj)
encode_json_obj()
evaluate(gold) Score the accuracy of the tagger against the gold standard.
size() Return the number of entries in the table used by this tagger.
tag(tokens)
tag_one(tokens, index, history) Determine an appropriate tag for the specified token, and return that tag.
tag_sents(sentences) Apply self.tag() to each element of sentences.
unicode_repr()
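
A short sketch of `size()` and `tag_sents()` from the list above, again on an invented toy corpus and assuming NLTK is installed. No backoff tagger is supplied here, to show what happens when a context was not seen during training.

```python
from nltk.tag import BigramTagger

# Toy training data invented for this example.
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

# No backoff tagger: unseen contexts cannot be tagged.
tagger = BigramTagger(train_sents)

# Number of context -> tag entries learned from the training data.
table_size = tagger.size()
print(table_size)

# tag_sents() applies tag() to each sentence in turn.
batch = tagger.tag_sents([
    ["the", "dog", "barks"],
    ["the", "bird", "barks"],
])
for sent in batch:
    print(sent)
```

In the second sentence, "bird" is tagged `None`, and so is "barks" after it, because the context for "barks" now contains a `None` tag that was never seen in training. Supplying a `backoff` tagger avoids this cascade.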

Attributes

backoff The backoff tagger for this tagger.
json_tag