`nltk.FreqDist`

class nltk.FreqDist(samples=None)

A frequency distribution for the outcomes of an experiment. A frequency distribution records the number of times each outcome of an experiment has occurred. For example, a frequency distribution could be used to record the frequency of each word type in a document. Formally, a frequency distribution can be defined as a function mapping from each sample to the number of times that sample occurred as an outcome.
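
As a rough illustration of that definition, the sketch below uses only the standard library's `collections.Counter` as a stand-in for the sample-to-count mapping. The sample words and variable names are invented for the example; it shows the idea, not how `FreqDist` has to be used.

```python
from collections import Counter

# Hypothetical outcomes of an "experiment": the word types seen in a tiny text.
outcomes = ["the", "cat", "saw", "the", "dog"]

# The frequency distribution is the mapping sample -> number of occurrences.
counts = Counter(outcomes)

# Total number of recorded outcomes.
total = sum(counts.values())

# Relative frequency of one sample: its count divided by the total.
freq_the = counts["the"] / total

print(counts["the"], total, freq_the)
```

A sample that never occurred simply maps to a count of zero (`counts["bird"]` is `0`).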

Frequency distributions are generally constructed by running a number of experiments, and incrementing the count for a sample every time it is an outcome of an experiment. For example, the following code will produce a frequency distribution that encodes how often each word occurs in a text:

>>> from nltk.tokenize import word_tokenize
>>> from nltk.probability import FreqDist
>>> sent = 'This is an example sentence'
>>> fdist = FreqDist()
>>> for word in word_tokenize(sent):
...    fdist[word.lower()] += 1

An equivalent way to do this is with the initializer:

>>> fdist = FreqDist(word.lower() for word in word_tokenize(sent))
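
Once built, the distribution can be queried through the methods listed under Methods. A minimal sketch, assuming NLTK is installed: `str.split()` stands in for `word_tokenize` here so that no tokenizer data needs to be downloaded, and the sample text is invented.

```python
from nltk.probability import FreqDist

# Invented sample text; split() is a stand-in for word_tokenize.
text = "the cat sat on the mat the end"
fdist = FreqDist(word.lower() for word in text.split())

n_outcomes = fdist.N()            # total number of outcomes recorded
n_bins = fdist.B()                # number of distinct samples seen
the_freq = fdist.freq("the")      # count of "the" divided by N()
top = fdist.max()                 # sample with the greatest count
once = sorted(fdist.hapaxes())    # samples that occur exactly once

print(n_outcomes, n_bins, the_freq, top, once)
```

Individual counts are read with ordinary indexing, as in `fdist["the"]`.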

Methods

`B`()	Return the total number of sample values (or “bins”) that have counts greater than zero.
`N`()	Return the total number of sample outcomes that have been recorded by this FreqDist.
`Nr`(r[, bins])
`__init__`([samples])	Construct a new frequency distribution.
`clear`()	-> None. Remove all items from D.
`copy`()	Create a copy of this frequency distribution.
`elements`()	Iterator over elements repeating each as many times as its count.
`freq`(sample)	Return the frequency of a given sample.
`fromkeys`(iterable[, v])
`get`(k[, d])	-> D[k] if k in D, ...
`hapaxes`()	Return a list of all samples that occur once (hapax legomena).
`has_key`(k)	-> True if D has a key k, else False.
`items`()	-> list of D’s (key, value) pairs, ...
`iteritems`()	-> an iterator over the (key, ...
`iterkeys`()	-> an iterator over the keys of D.
`itervalues`(...)
`keys`()	-> list of D’s keys.
`max`()	Return the sample with the greatest number of outcomes in this frequency distribution.
`most_common`([n])	List the n most common elements and their counts from the most common to the least.
`pformat`([maxlen])	Return a string representation of this FreqDist.
`plot`(*args, **kwargs)	Plot samples from the frequency distribution, displaying the most frequent sample first.
`pop`(k[, d])	-> v, ... If key is not found, d is returned if given; otherwise KeyError is raised.
`popitem`()	-> (k, v), ... 2-tuple; but raise KeyError if D is empty.
`pprint`([maxlen, stream])	Print a string representation of this FreqDist to ‘stream’.
`r_Nr`([bins])	Return the dictionary mapping r to Nr, the number of samples with frequency r, where Nr > 0.
`setdefault`(k[, d])	-> D.get(k, d), ...
`subtract`(*args, **kwds)	Like dict.update() but subtracts counts instead of replacing them.
`tabulate`(*args, **kwargs)	Tabulate the given samples from the frequency distribution (cumulative), displaying the most frequent sample first.
`unicode_repr`()	Return a string representation of this FreqDist.
`update`(*args, **kwds)	Like dict.update() but adds counts instead of replacing them.
`values`()	-> list of D’s values.
`viewitems`(...)
`viewkeys`(...)
`viewvalues`(...)
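
Several of the methods above (`elements`, `most_common`, `subtract`, `update`) carry the same descriptions as their `collections.Counter` counterparts. The sketch below shows that counting behaviour on a plain `Counter` with made-up data. It assumes `FreqDist` behaves the same way for these calls, which is worth checking against the installed NLTK version.

```python
from collections import Counter

counts = Counter(["a", "b", "a"])      # a: 2, b: 1

# update() adds counts instead of replacing them.
counts.update(["b", "c"])              # a: 2, b: 2, c: 1

# subtract() lowers counts; a key is kept even if its count reaches zero.
counts.subtract(["c"])                 # a: 2, b: 2, c: 0

# most_common(n) lists (sample, count) pairs, highest count first.
top_two = counts.most_common(2)

# elements() repeats each sample as many times as its count,
# skipping samples whose count is below one.
expanded = sorted(counts.elements())

print(top_two, expanded)
```

Because `subtract` leaves zero-count keys in place, a membership test such as `"c" in counts` still succeeds afterwards even though the count is `0`.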
