nltk.MWETokenizer.__init__
MWETokenizer.__init__(mwes=None, separator='_')

Initialize the multi-word tokenizer with a list of expressions and a separator.
Parameters:
- mwes (list(list(str))) – A sequence of multi-word expressions to be merged, where each MWE is a sequence of strings.
- separator (str) – String that should be inserted between words in a multi-word expression token. (Default is '_'.)
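
A minimal sketch of how the two constructor parameters affect the merged tokens, assuming NLTK is installed and MWETokenizer is imported from nltk.tokenize. The sample sentence and the variable names are illustrative only.

```python
from nltk.tokenize import MWETokenizer

# Each MWE is a sequence of word strings; tuples are used here for brevity.
mwes = [("a", "little"), ("a", "little", "bit"), ("a", "lot")]

# Default separator: merged words are joined with "_".
default_tok = MWETokenizer(mwes)

# Custom separator: merged words are joined with "+".
plus_tok = MWETokenizer(mwes, separator="+")

# tokenize() expects text that has already been split into words.
words = "In a little or a little bit or a lot".split()

merged_default = default_tok.tokenize(words)
merged_plus = plus_tok.tokenize(words)

print(merged_default)  # ['In', 'a_little', 'or', 'a_little_bit', 'or', 'a_lot']
print(merged_plus)     # ['In', 'a+little', 'or', 'a+little+bit', 'or', 'a+lot']
```

When one expression is a prefix of another, as with "a little" and "a little bit", the longest match is merged.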