nltk.tag.ClassifierBasedTagger
class nltk.tag.ClassifierBasedTagger(feature_detector=None, train=None, classifier_builder=nltk.classify.naivebayes.NaiveBayesClassifier.train, classifier=None, backoff=None, cutoff_prob=None, verbose=False)
A sequential tagger that uses a classifier to choose the tag for each token in a sentence. The featureset input for the classifier is generated by a feature detector function:
feature_detector(tokens, index, history) -> featureset
Here, tokens is the list of unlabeled tokens in the sentence; index is the index of the token for which feature detection should be performed; and history is the list of tags already assigned to the tokens before index.
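As an illustration, a feature detector with this signature might look like the minimal sketch below. The function name simple_feature_detector and the particular features it extracts are assumptions made for this example; they are not part of NLTK.

```python
def simple_feature_detector(tokens, index, history):
    """Hypothetical feature detector: build a dict featureset for tokens[index]."""
    word = tokens[index]
    return {
        "word": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_capitalized": word[:1].isupper(),
        # history holds the tags already assigned to tokens[0:index]
        "prev_tag": history[-1] if history else "<START>",
        "prev_word": tokens[index - 1].lower() if index > 0 else "<START>",
    }


# Features for "dog", given that "The" was already tagged DT.
featureset = simple_feature_detector(["The", "dog", "barks"], 1, ["DT"])
print(featureset)
```

Because history contains only the tags to the left of index, a detector like this can condition on earlier tagging decisions but not on later ones.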
Construct a new classifier-based sequential tagger.
Parameters:
- feature_detector – A function used to generate the featureset input for the classifier: feature_detector(tokens, index, history) -> featureset
- train – A tagged corpus consisting of a list of tagged sentences, where each sentence is a list of (word, tag) tuples.
- classifier_builder – A function used to train a new classifier based on the data in train. It should take one argument, a list of labeled featuresets (i.e., (featureset, label) tuples).
- classifier – The classifier that should be used by the tagger. This is only useful if you want to construct the classifier manually; normally, you would use train instead.
- backoff – A backoff tagger, used if this tagger is unable to determine a tag for a given token, for example when it encounters an unknown context.
- cutoff_prob – If specified, then this tagger will fall back on its backoff tagger if the probability of the most likely tag is less than cutoff_prob.
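The parameters above can be combined as in the following sketch, which assumes NLTK is installed. The training sentences, the toy_features function, and the cutoff value of 0.3 are invented for illustration; a real tagger would be trained on a proper tagged corpus with a richer feature detector.

```python
from nltk.tag import ClassifierBasedTagger, DefaultTagger

# Toy training data, invented for this example.
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("bird", "NN"), ("sings", "VBZ")],
]


def toy_features(tokens, index, history):
    """Hypothetical feature detector used only for this sketch."""
    return {
        "word": tokens[index].lower(),
        "prev_tag": history[-1] if history else "<START>",
    }


tagger = ClassifierBasedTagger(
    feature_detector=toy_features,
    train=train_sents,            # classifier_builder defaults to Naive Bayes
    backoff=DefaultTagger("NN"),  # consulted when this tagger returns no tag
    cutoff_prob=0.3,              # defer to backoff below this probability
)

tagged = tagger.tag(["the", "bird", "barks"])
print(tagged)
```

Passing both train and classifier is not meaningful, since classifier is intended for the case where the classifier has already been built by hand.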
Methods
- __init__([feature_detector, train, ...])
- choose_tag(tokens, index, history)
- classifier() – Return the classifier that this tagger uses to choose a tag for each word in a sentence.
- evaluate(gold) – Score the accuracy of the tagger against the gold standard.
- feature_detector(tokens, index, history) – Return the feature detector that this tagger uses to generate featuresets for its classifier.
- tag(tokens)
- tag_one(tokens, index, history) – Determine an appropriate tag for the specified token, and return that tag.
- tag_sents(sentences) – Apply self.tag() to each element of sentences.
- unicode_repr()
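To show how tag, tag_one, and choose_tag relate, the sketch below walks a sentence left to right and passes the tags chosen so far as history. It is a simplified pure-Python illustration, not NLTK's implementation: tag_with_backoff, toy_choose_tag, and the "UNK" fallback tag are hypothetical names, and a single fixed tag stands in for a real backoff tagger.

```python
def tag_with_backoff(tokens, choose_tag, backoff_tag=None):
    """Simplified sketch: tag tokens in order, feeding earlier tags in as history."""
    history = []
    for index in range(len(tokens)):
        tag = choose_tag(tokens, index, history)
        if tag is None:
            # Stand-in for consulting a backoff tagger when no tag is chosen.
            tag = backoff_tag
        history.append(tag)
    return list(zip(tokens, history))


def toy_choose_tag(tokens, index, history):
    """Hypothetical rule-based stand-in for a classifier's decision."""
    if tokens[index] in {"the", "a"}:
        return "DT"
    if history and history[-1] == "DT":
        return "NN"
    return None  # no decision: let the caller fall back


result = tag_with_backoff(["the", "dog", "barks"], toy_choose_tag, backoff_tag="UNK")
print(result)  # [('the', 'DT'), ('dog', 'NN'), ('barks', 'UNK')]
```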