nltk.classify.TypedMaxentFeatureEncoding.train
classmethod TypedMaxentFeatureEncoding.train(train_toks, count_cutoff=0, labels=None, **options)

Construct and return a new feature encoding, based on a given training corpus train_toks. See the class description of TypedMaxentFeatureEncoding for a description of the joint-features that will be included in this encoding.

Note: the recognized feature value types are int and float; values of any other type are interpreted as regular binary features.
Parameters:
- train_toks (list(tuple(dict, str))) – Training data, represented as a list of pairs, the first member of which is a feature dictionary, and the second of which is a classification label.
- count_cutoff (int) – A cutoff value that is used to discard rare joint-features. If a joint-feature has the value 1 fewer than count_cutoff times in the training corpus, then that joint-feature is not included in the generated encoding.
- labels (list) – A list of labels that should be used by the classifier. If not specified, then the set of labels attested in train_toks will be used.
- options – Extra parameters for the constructor, such as unseen_features and alwayson_features.
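
The following pure-Python sketch illustrates the shape of train_toks and how a count cutoff can prune rare joint-features. It is not NLTK's implementation: the helper name sketch_train_mapping and the sample data are hypothetical, and the rule that a joint-feature is admitted once its running count reaches count_cutoff is an assumption made for illustration. Numeric (int or float) values are keyed by their type, while all other values are treated as binary features keyed by the value itself.

```python
from collections import defaultdict


def sketch_train_mapping(train_toks, count_cutoff=0, labels=None):
    """Build a joint-feature mapping (hypothetical helper, not NLTK's API)."""
    mapping = {}               # (fname, value_or_type, label) -> feature id
    seen_labels = set()        # labels attested in train_toks
    counts = defaultdict(int)  # (fname, value_or_type) -> occurrences so far

    for featureset, label in train_toks:
        if labels and label not in labels:
            raise ValueError("Unexpected label %s" % label)
        seen_labels.add(label)
        for fname, fval in featureset.items():
            # Numeric values are keyed by their type, so all int (or float)
            # values of one feature share a single joint-feature per label.
            if type(fval) in (int, float):
                fval = type(fval)
            counts[fname, fval] += 1
            # Assumption: admit the joint-feature once its running count
            # reaches count_cutoff.
            if counts[fname, fval] >= count_cutoff:
                mapping.setdefault((fname, fval, label), len(mapping))

    if labels is None:
        labels = sorted(seen_labels)
    return labels, mapping


# Hypothetical training data: (feature dictionary, classification label) pairs.
train_toks = [
    ({"length": 5, "suffix": "ing"}, "VERB"),
    ({"length": 3, "suffix": "ed"}, "VERB"),
    ({"length": 7, "suffix": "ing"}, "NOUN"),
]

labels, mapping = sketch_train_mapping(train_toks, count_cutoff=2)
for joint_feature, fid in mapping.items():
    print(fid, joint_feature)
```

With count_cutoff=2, the suffix value "ed" occurs only once in this sample, so the sketch never creates a joint-feature for it; with the default cutoff of 0, every attested combination is kept.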