nltk.stem

NLTK Stemmers

Interfaces used to remove morphological affixes from words, leaving only the word stem. Stemming algorithms aim to remove the affixes required for, e.g., grammatical role, tense, or derivational morphology, leaving only the stem of the word. This is a difficult problem because of irregular words (e.g. common verbs in English), complicated morphological rules, and part-of-speech and sense ambiguities (e.g. ceil- is not the stem of ceiling).
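To make the difficulty concrete, here is a minimal sketch of naive suffix stripping in plain Python. It is not how any NLTK stemmer works; the suffix list and the function name naive_stem are assumptions chosen only for illustration. It handles regular forms but over-stems "ceiling" and cannot relate the irregular form "went" to "go".

```python
import re

# Hypothetical suffix list, chosen only for illustration.
_SUFFIXES = re.compile(r"(ing|ed|s)$")


def naive_stem(word: str) -> str:
    """Strip one common English suffix, using no dictionary or part-of-speech information."""
    return _SUFFIXES.sub("", word)


for word in ["walking", "walked", "walks", "ceiling", "went"]:
    # "ceiling" is wrongly reduced to "ceil"; "went" is left unchanged.
    print(word, "->", naive_stem(word))
```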

StemmerI defines a standard interface for stemmers.
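As a rough illustration of the interface, a custom stemmer can subclass StemmerI and implement its stem(token) method. The class below, SuffixStemmer, and its parameters suffixes and min_length are hypothetical names that are not part of NLTK; treat this as a sketch under those assumptions rather than a recommended stemmer.

```python
from nltk.stem import StemmerI


class SuffixStemmer(StemmerI):
    """Hypothetical stemmer that strips the first matching suffix from a fixed tuple."""

    def __init__(self, suffixes=("ing", "ed", "s"), min_length=3):
        self.suffixes = suffixes
        self.min_length = min_length  # shortest stem we are willing to return

    def stem(self, token):
        for suffix in self.suffixes:
            remaining = len(token) - len(suffix)
            if token.endswith(suffix) and remaining >= self.min_length:
                return token[: -len(suffix)]
        # No suffix matched, or stripping would leave too short a stem.
        return token


stemmer = SuffixStemmer()
print([stemmer.stem(w) for w in ["walking", "jumped", "cats", "is"]])
```

Because the class exposes the same stem() method as the built-in stemmers, code written against StemmerI can use it interchangeably with them.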

Classes

ISRIStemmer() ISRI Arabic stemmer, based on the algorithm described in "Arabic Stemming without a root dictionary".
LancasterStemmer() A word stemmer based on the Lancaster (Paice/Husk) stemming algorithm.
PorterStemmer() A word stemmer based on the Porter stemming algorithm.
RSLPStemmer() A stemmer for Portuguese.
RegexpStemmer(regexp[, min]) A stemmer that uses regular expressions to identify morphological affixes.
SnowballStemmer(language[, ignore_stopwords]) A Snowball stemmer for the given language.
StemmerI A processing interface for removing morphological affixes from words.
WordNetLemmatizer() WordNet Lemmatizer
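
The following sketch shows how several of the classes listed above might be used side by side. The regular expression passed to RegexpStemmer and the sample words are illustrative choices, not recommended settings. Each stemmer exposes the same stem() method, so they can be compared in one loop.

```python
from nltk.stem import (
    LancasterStemmer,
    PorterStemmer,
    RegexpStemmer,
    SnowballStemmer,
)

porter = PorterStemmer()
lancaster = LancasterStemmer()
snowball = SnowballStemmer("english")
# Strip a few illustrative suffixes, but only from words of at least 4 characters.
regexp = RegexpStemmer("ing$|s$|e$|able$", min=4)

stemmers = {
    "porter": porter,
    "lancaster": lancaster,
    "snowball": snowball,
    "regexp": regexp,
}

for word in ["running", "cars", "advisable", "was"]:
    # Different algorithms may produce different stems for the same word.
    stems = {name: stemmer.stem(word) for name, stemmer in stemmers.items()}
    print(word, stems)
```

WordNetLemmatizer is left out of this sketch because it depends on the WordNet corpus, which has to be downloaded separately (for example with nltk.download("wordnet")) before it can be used.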