nltk.stem
NLTK Stemmers
Interfaces used to remove morphological affixes from words, leaving
only the word stem. Stemming algorithms aim to remove the affixes
required for, e.g., grammatical role, tense, or derivational morphology,
leaving only the stem of the word. This is a difficult problem due to
irregular words (e.g. common verbs in English), complicated
morphological rules, and part-of-speech and sense ambiguities
(e.g. ceil- is not the stem of ceiling).
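 
To make the difficulty concrete, here is a minimal sketch of naive suffix stripping. The `naive_stem` helper and its suffix list are hypothetical, written only for illustration; they are not part of NLTK and do not reflect how any NLTK stemmer is implemented.

```python
# Hypothetical helper for illustration only; not an NLTK API.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix if enough of the word remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


for w in ("walking", "walked", "walks", "ceiling", "went"):
    print(w, "->", naive_stem(w))
```

Regular forms such as walking, walked and walks all reduce to walk, but ceiling is wrongly cut to ceil, and the irregular form went is left unrelated to go. These are the kinds of cases the paragraph above describes.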
StemmerI defines a standard interface for stemmers.
Classes
ISRIStemmer() | ISRI Arabic stemmer based on the algorithm "Arabic Stemming without a root dictionary".
LancasterStemmer() | Lancaster Stemmer
PorterStemmer() | A word stemmer based on the Porter stemming algorithm.
RSLPStemmer() | A stemmer for Portuguese.
RegexpStemmer(regexp[, min]) | A stemmer that uses regular expressions to identify morphological affixes.
SnowballStemmer(language[, ignore_stopwords]) | Snowball Stemmer
StemmerI | A processing interface for removing morphological affixes from words.
WordNetLemmatizer() | WordNet Lemmatizer
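
A short usage sketch for three of the classes listed above, assuming NLTK is installed. The word list and the regular expression passed to `RegexpStemmer` are illustrative choices, not recommended settings. `WordNetLemmatizer` is omitted because it needs the WordNet corpus to be downloaded separately.

```python
from nltk.stem import PorterStemmer, RegexpStemmer, SnowballStemmer

# Each stemmer exposes the same stem(token) method defined by StemmerI.
porter = PorterStemmer()
snowball = SnowballStemmer("english")
# Illustrative pattern: strip a few common endings from words of 4+ characters.
regexp = RegexpStemmer("ing$|s$|e$|able$", min=4)

words = ["running", "cars", "ceiling"]
for word in words:
    print(
        word,
        "| porter:", porter.stem(word),
        "| snowball:", snowball.stem(word),
        "| regexp:", regexp.stem(word),
    )
```

Because all three objects share the `stem` method, they can be swapped without changing the calling code, which is the purpose of the common interface.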