nltk.NgramAssocMeasures
class nltk.NgramAssocMeasures
An abstract class defining a collection of generic association measures. Each public method returns a score, taking the following arguments:
score_fn(count_of_ngram, (count_of_n-1gram_1, ..., count_of_n-1gram_j), (count_of_n-2gram_1, ..., count_of_n-2gram_k), ..., (count_of_1gram_1, ..., count_of_1gram_n), count_of_total_words)
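To make the argument layout concrete, the sketch below spells it out for the bigram case (n = 2), where the call reduces to score_fn(count_of_bigram, (count_of_word1, count_of_word2), count_of_total_words). The helper pmi_from_marginals and the counts are illustrative assumptions, not part of NLTK; the helper computes pointwise mutual information directly from the marginals in plain Python.

```python
import math

def pmi_from_marginals(n_ii, unigram_counts, n_xx):
    """Hypothetical helper: PMI of a bigram from its marginal counts.

    n_ii            count of the bigram (w1, w2)
    unigram_counts  (count of w1, count of w2)
    n_xx            total number of words
    """
    n_ix, n_xi = unigram_counts
    # log2( P(w1, w2) / (P(w1) * P(w2)) ), written in terms of raw counts
    return math.log2(n_ii * n_xx) - math.log2(n_ix * n_xi)

# Illustrative counts (assumed): the bigram occurs 20 times, its words
# occur 42 and 20 times, in a corpus of 14,307,668 words.
score = pmi_from_marginals(20, (42, 20), 14307668)
print(round(score, 2))
```

With NLTK installed, the corresponding call would presumably be BigramAssocMeasures.pmi(20, (42, 20), 14307668), with the same nesting of arguments.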
See BigramAssocMeasures and TrigramAssocMeasures.
For all association measures defined here to be usable, inheriting classes should define a property _n and a method _contingency that calculates contingency values from marginals.
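As a rough illustration of what "contingency values from marginals" means, the sketch below converts bigram marginals into the four cells of a 2x2 contingency table. The function bigram_contingency is a hypothetical stand-in written for this example, not NLTK's own _contingency; it assumes the bigram argument layout (count of bigram, (count of w1, count of w2), total words).

```python
def bigram_contingency(n_ii, unigram_counts, n_xx):
    """Hypothetical stand-in for a bigram _contingency method.

    Returns (n_ii, n_oi, n_io, n_oo):
      n_ii  w1 followed by w2
      n_oi  w2 preceded by something other than w1
      n_io  w1 followed by something other than w2
      n_oo  neither w1 in first position nor w2 in second
    """
    n_ix, n_xi = unigram_counts
    n_oi = n_xi - n_ii
    n_io = n_ix - n_ii
    n_oo = n_xx - n_ii - n_oi - n_io
    return (n_ii, n_oi, n_io, n_oo)

# Illustrative counts (assumed), same layout as the score_fn arguments.
cells = bigram_contingency(20, (42, 20), 14307668)
print(cells)
```

The four cells always sum to the total word count, which is a convenient sanity check when writing a _contingency method for a new subclass.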
Methods
chi_sq(*marginals) | Scores ngrams using Pearson’s chi-square as in Manning and Schutze 5.3.3.
jaccard(*marginals) | Scores ngrams using the Jaccard index.
likelihood_ratio(*marginals) | Scores ngrams using likelihood ratios as in Manning and Schutze 5.3.4.
mi_like(*marginals, **kwargs) | Scores ngrams using a variant of mutual information.
pmi(*marginals) | Scores ngrams by pointwise mutual information, as in Manning and Schutze 5.4.
poisson_stirling(*marginals) | Scores ngrams using the Poisson-Stirling measure.
raw_freq(*marginals) | Scores ngrams by their frequency.
student_t(*marginals) | Scores ngrams using Student’s t test with independence hypothesis for unigrams, as in Manning and Schutze 5.3.1.
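For orientation, three of the simpler measures can be sketched in plain Python for the bigram case. These are illustrative re-derivations under the bigram argument layout, not NLTK's code; the library's implementations may differ in details such as smoothing constants, and the counts used are assumed for the example.

```python
import math

def raw_freq(n_ii, unigram_counts, n_xx):
    # Relative frequency of the bigram among all words.
    return n_ii / n_xx

def student_t(n_ii, unigram_counts, n_xx):
    # t statistic against the hypothesis that w1 and w2 are independent.
    # Assumes n_ii > 0; this sketch applies no smoothing.
    n_ix, n_xi = unigram_counts
    expected = n_ix * n_xi / n_xx
    return (n_ii - expected) / math.sqrt(n_ii)

def jaccard(n_ii, unigram_counts, n_xx):
    # Bigram count over the count of bigrams that have w1 first
    # or w2 second (n_ii + n_io + n_oi).
    n_ix, n_xi = unigram_counts
    return n_ii / (n_ix + n_xi - n_ii)

# Illustrative marginals (assumed): bigram count, (w1 count, w2 count), total.
marginals = (8, (15828, 4675), 14307668)
scores = {fn.__name__: fn(*marginals) for fn in (raw_freq, student_t, jaccard)}
print(scores)
```

Because every measure accepts the same marginals, a single tuple can be scored by several measures in one pass, as the dictionary comprehension does here.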