nltk.RegexpStemmer
class nltk.RegexpStemmer(regexp, min=0)
A stemmer that uses regular expressions to identify morphological affixes. Any substrings that match the regular expressions will be removed.
>>> from nltk.stem import RegexpStemmer
>>> st = RegexpStemmer('ing$|s$|e$|able$', min=4)
>>> st.stem('cars')
'car'
>>> st.stem('mass')
'mas'
>>> st.stem('was')
'was'
>>> st.stem('bee')
'bee'
>>> st.stem('compute')
'comput'
>>> st.stem('advisable')
'advis'
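Because any matching substring is removed, not only one at the end of the word, the `$` anchors in the pattern above matter. The sketch below uses Python's `re.sub` directly as a stand-in for the removal step. The helper name `strip_matches` is hypothetical and not part of NLTK, and the sketch assumes removal behaves like a plain regular-expression substitution, as the description suggests.

```python
import re

# Hypothetical stand-in for the removal step: delete every match of the pattern.
def strip_matches(pattern, word):
    return re.sub(pattern, "", word)

# Anchored pattern: only the "ing" at the end of the word is removed.
anchored = strip_matches(r"ing$", "singing")

# Unanchored pattern: both occurrences of "ing" are removed.
unanchored = strip_matches(r"ing", "singing")

print(anchored, unanchored)
```

Anchoring suffix patterns with `$` (or prefix patterns with `^`) is usually what is wanted when the goal is to strip affixes rather than arbitrary substrings.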
Parameters:
- regexp (str or regexp) – The regular expression that should be used to identify morphological affixes.
- min (int) – The minimum length a word must have to be stemmed. Shorter words are returned unchanged.
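To make the roles of the two parameters concrete, here is a minimal sketch written with the standard `re` module only. The function `regexp_stem` is a hypothetical stand-in, not NLTK's implementation. It assumes that `regexp` may be either a pattern string or a compiled pattern, and that `min` is compared against the length of the input word.

```python
import re

def regexp_stem(word, regexp, min=0):
    """Hypothetical stand-in for RegexpStemmer.stem (not NLTK code)."""
    # Accept either a pattern string or an already compiled pattern.
    pattern = regexp if hasattr(regexp, "pattern") else re.compile(regexp)
    # Words shorter than `min` are returned unchanged.
    if len(word) < min:
        return word
    # Remove every substring that matches the pattern.
    return pattern.sub("", word)

# A compiled pattern and a plain string are both accepted.
suffixes = re.compile(r"ing$|s$|e$|able$")
results = [regexp_stem(w, suffixes, min=4) for w in ["cars", "was", "advisable"]]
print(results)
```

With `min=4`, the three-letter word `was` is left alone even though it ends in `s`, which matches the behaviour shown in the class example.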
Methods
- __init__(regexp[, min])
- stem(word)
- unicode_repr()