nltk.everygrams()

nltk.everygrams(sequence, min_len=1, max_len=-1, **kwargs)

Returns all possible ngrams generated from a sequence of items, as an iterator.

>>> from nltk.util import everygrams
>>> sent = 'a b c'.split()
>>> list(everygrams(sent))
[('a',), ('b',), ('c',), ('a', 'b'), ('b', 'c'), ('a', 'b', 'c')]
>>> list(everygrams(sent, max_len=2))
[('a',), ('b',), ('c',), ('a', 'b'), ('b', 'c')]
Parameters:
  • sequence (sequence or iter) – the source data to be converted into ngrams
  • min_len (int) – minimum length of the ngrams, i.e. the smallest n-gram order to generate
  • max_len (int) – maximum length of the ngrams; the default of -1 means the length of the sequence
Return type:
  iter(tuple)
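
To make the role of min_len and max_len concrete, here is a minimal pure-Python sketch of the behaviour described above. It is not NLTK's implementation: the name everygrams_sketch is hypothetical, the extra keyword arguments accepted by the real function are ignored, and the output is ordered by ngram length to match the example shown here, which may differ from the order your installed NLTK version produces.

```python
def everygrams_sketch(sequence, min_len=1, max_len=-1):
    """Hypothetical stand-in for everygrams; illustrative only, not NLTK's code."""
    # Materialise the input so that plain iterators can be scanned once per length.
    items = list(sequence)
    # Assumption: -1 means "use the full sequence length", as the parameter list states.
    if max_len == -1:
        max_len = len(items)
    # Emit every contiguous window of each length from min_len to max_len.
    for n in range(min_len, max_len + 1):
        for start in range(len(items) - n + 1):
            yield tuple(items[start:start + n])


sent = 'a b c'.split()
all_grams = list(everygrams_sketch(sent))
only_bigrams = list(everygrams_sketch(sent, min_len=2, max_len=2))
```

Setting min_len and max_len to the same value, as in only_bigrams, restricts the result to ngrams of a single order.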