nltk.BigramCollocationFinder

class nltk.BigramCollocationFinder(word_fd, bigram_fd, window_size=2)

A tool for finding bigram collocations and ranking them with association measures. It is usually more convenient to call from_words() than to construct an instance directly.
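As a minimal sketch of the from_words() route, the example below builds a finder from a short, made-up token list and asks for the three highest-scoring bigrams. The token list and variable names are illustrative assumptions; the scoring function is BigramAssocMeasures.raw_freq from nltk.metrics.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Illustrative input: any sequence of tokens works, no corpus download needed.
tokens = "the quick brown fox jumps over the lazy dog the quick brown fox".split()

# from_words() counts the words and the adjacent word pairs for us.
finder = BigramCollocationFinder.from_words(tokens)

# raw_freq ranks bigrams by how often they occur relative to the text length.
top = finder.nbest(BigramAssocMeasures.raw_freq, 3)
print(top)

# The underlying bigram counts are available on the finder as a FreqDist.
print(finder.ngram_fd[("the", "quick")])
```

Three bigrams occur twice in this input and therefore tie for the top score; the order among tied bigrams may differ between NLTK versions.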

Methods

__init__(word_fd, bigram_fd[, window_size]) Construct a BigramCollocationFinder, given FreqDists for appearances of words and (possibly non-contiguous) bigrams.
above_score(score_fn, min_score) Returns a sequence of ngrams, ordered by decreasing score, whose scores each exceed the given minimum score.
apply_freq_filter(min_freq) Removes candidate ngrams which have frequency less than min_freq.
apply_ngram_filter(fn) Removes candidate ngrams (w1, w2, ...) where fn(w1, w2, ...) evaluates to True.
apply_word_filter(fn) Removes candidate ngrams (w1, w2, ...) where any of (fn(w1), fn(w2), ...) evaluates to True.
from_documents(documents) Constructs a collocation finder given a collection of documents, each of which is a list (or iterable) of tokens.
from_words(words[, window_size]) Construct a BigramCollocationFinder for all bigrams in the given sequence.
nbest(score_fn, n) Returns the top n ngrams when scored by the given function.
score_ngram(score_fn, w1, w2) Returns the score for a given bigram using the given scoring function.
score_ngrams(score_fn) Returns a sequence of (ngram, score) pairs ordered from highest to lowest score, as determined by the scoring function provided.
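The filtering and scoring methods listed above are typically combined: filter the candidates first, then score what remains. The following is a sketch under assumed inputs (the token list, the frequency threshold of 2, and the "shorter than two characters" word filter are all illustrative choices), using BigramAssocMeasures.pmi as the scoring function.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Illustrative input text, already tokenized by whitespace.
tokens = (
    "new york is a big city . i love new york . "
    "a big dog ran in new york"
).split()

finder = BigramCollocationFinder.from_words(tokens)

# Remove bigrams that occur fewer than 2 times.
finder.apply_freq_filter(2)

# Remove bigrams in which any word is shorter than 2 characters
# (this drops punctuation and one-letter words such as "a").
finder.apply_word_filter(lambda w: len(w) < 2)

# (ngram, score) pairs for the surviving candidates, highest score first.
scored = finder.score_ngrams(BigramAssocMeasures.pmi)

# Only the ngrams, limited to the best n.
best = finder.nbest(BigramAssocMeasures.pmi, 5)

# Ngrams whose score exceeds a minimum; wrapped in list() in case a
# generator is returned.
above = list(finder.above_score(BigramAssocMeasures.pmi, 1.0))

# Score for one specific bigram.
single = finder.score_ngram(BigramAssocMeasures.pmi, "new", "york")

print(scored, best, above, single)
```

With this input only ("new", "york") survives both filters: ("a", "big") also occurs twice but is removed by the word filter because "a" is a single character.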

Attributes

default_ws The default window size for this finder class (2 for bigrams).