nltk.MWETokenizer.add_mwe

MWETokenizer.add_mwe(mwe)[source]

Add a multi-word expression to the lexicon (stored as a word trie)

We use util.Trie to represent the trie. Its form is a dict of dicts. The key True marks the end of a valid MWE.

Parameters:mwe (tuple(str) or list(str)) – The multi-word expression we’re adding into the word trie
Example:
>>> tokenizer = MWETokenizer()
>>> tokenizer.add_mwe(('a', 'b'))
>>> tokenizer.add_mwe(('a', 'b', 'c'))
>>> tokenizer.add_mwe(('a', 'x'))
>>> expected = {'a': {'x': {True: None}, 'b': {True: None, 'c': {True: None}}}}
>>> tokenizer._mwes.as_dict() == expected
True