nltk.classify.NaiveBayesClassifier

class nltk.classify.NaiveBayesClassifier(label_probdist, feature_probdist)

A Naive Bayes classifier. Naive Bayes classifiers are parameterized by two probability distributions:

  • P(label) gives the probability that an input will receive each label, given no information about the input’s features.
  • P(fname=fval|label) gives the probability that a given feature (fname) will receive a given value (fval), given the label (label).

If the classifier encounters an input with a feature that has never been seen with any label, then rather than assigning a probability of 0 to all labels, it will ignore that feature.
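The scoring rule described above can be illustrated with a minimal pure-Python sketch. It is not NLTK's implementation: the function name posterior, the plain-dict layout of the two distributions, and the toy probabilities are all assumptions made for this example. It multiplies P(label) by P(fname=fval|label) for each feature (in log space), skips any feature that has never been seen with any label, and normalizes the result.

```python
import math


def posterior(label_prob, feature_prob, featureset):
    """Return a dict mapping each label to P(label | featureset).

    Hypothetical data layout (an assumption of this sketch):
      label_prob:   {label: P(label)}
      feature_prob: {(label, fname): {fval: P(fname=fval | label)}}
    The sketch assumes that every feature known for one label has a
    table for every label, and that each table covers every value it
    will be asked about with a non-zero probability.
    """
    # Drop features that were never seen with any label, rather than
    # letting them force a probability of 0 onto every label.
    known = {
        fname: fval
        for fname, fval in featureset.items()
        if any((label, fname) in feature_prob for label in label_prob)
    }

    # Sum log-probabilities to avoid underflow with many features.
    log_scores = {}
    for label, p_label in label_prob.items():
        score = math.log(p_label)
        for fname, fval in known.items():
            score += math.log(feature_prob[label, fname][fval])
        log_scores[label] = score

    # Normalize so that the scores sum to 1 across labels.
    top = max(log_scores.values())
    unnormalized = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {lab: value / total for lab, value in unnormalized.items()}


# Toy distributions, invented for illustration only.
label_prob = {"spam": 0.4, "ham": 0.6}
feature_prob = {
    ("spam", "has_link"): {True: 0.8, False: 0.2},
    ("ham", "has_link"): {True: 0.1, False: 0.9},
}

# "font" was never seen with any label, so it is ignored.
result = posterior(label_prob, feature_prob, {"has_link": True, "font": "comic"})
```

With these toy numbers, the unseen feature "font" has no effect: the result is the same as for {"has_link": True} alone, and an input consisting only of unseen features falls back to P(label).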

The feature value ‘None’ is reserved for unseen feature values; you generally should not use ‘None’ as a feature value for one of your own features.

Methods

__init__(label_probdist, feature_probdist)
param label_probdist:
 P(label), the probability distribution
classify(featureset)
classify_many(featuresets) Apply self.classify() to each element of featuresets.
labels()
most_informative_features([n]) Return a list of the ‘most informative’ features used by this classifier.
prob_classify(featureset)
prob_classify_many(featuresets) Apply self.prob_classify() to each element of featuresets.
show_most_informative_features([n])
train(labeled_featuresets[, estimator])
param labeled_featuresets:
 A list of classified featuresets,
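
The methods listed above might be exercised as in the following sketch. It assumes NLTK is installed and that train() is called with its default estimator. The feature names, labels, and training data are invented for illustration, and each training item is written as a (featureset, label) pair in which the featureset is a dict.

```python
from nltk.classify import NaiveBayesClassifier

# Toy training data (hypothetical): each item is (featureset, label).
train_set = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "sunny", "windy": True}, "play"),
    ({"outlook": "rainy", "windy": True}, "stay"),
    ({"outlook": "rainy", "windy": False}, "stay"),
]

# train() is a class method that estimates both distributions
# from the labeled featuresets and returns a classifier.
classifier = NaiveBayesClassifier.train(train_set)

# classify() returns the single most probable label.
predicted = classifier.classify({"outlook": "sunny", "windy": False})

# prob_classify() returns a probability distribution over labels.
dist = classifier.prob_classify({"outlook": "sunny", "windy": False})
p_play = dist.prob("play")
p_stay = dist.prob("stay")

# classify_many() applies classify() to each featureset in turn.
batch = classifier.classify_many([{"outlook": "sunny"}, {"outlook": "rainy"}])

# "humidity" never occurred in training, so it is ignored.
with_unseen = classifier.classify({"outlook": "sunny", "humidity": "high"})

# Print a short table of the most informative features.
classifier.show_most_informative_features(2)
```

In this toy data "outlook" fully separates the two labels while "windy" is uninformative, so the predictions follow "outlook".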