Package nltk :: Package stem

Package stem


Interfaces for removing morphological affixes from words, leaving only the word stem. Stemming algorithms aim to strip the affixes that mark, e.g., grammatical role, tense, or derivational morphology, so that only the stem of the word remains. This is a difficult problem because of irregular words (e.g., common verbs in English), complicated morphological rules, and part-of-speech and sense ambiguities (e.g., ceil- is not the stem of ceiling).
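The difficulty described above can be seen in a minimal sketch of naive suffix stripping. This is not how any stemmer in this package is implemented; the suffix list, the `min_length` parameter, and the function name `naive_stem` are assumptions chosen only for illustration.

```python
import re

# Hypothetical suffix list, chosen only for illustration.
SUFFIXES = re.compile(r"(ing|ed|s)$")


def naive_stem(word: str, min_length: int = 4) -> str:
    """Strip one trailing suffix from words of at least min_length characters."""
    if len(word) < min_length:
        return word
    return SUFFIXES.sub("", word, count=1)


for word in ["walking", "walked", "walks", "ceiling", "went"]:
    print(word, "->", naive_stem(word))
# walking -> walk
# walked -> walk
# walks -> walk
# ceiling -> ceil   (sense ambiguity: "ceil" is not the stem of "ceiling")
# went -> went      (irregular verb: never reduced to "go")
```

The regular forms collapse to a common stem, while the last two words show the failure modes named in the description: a false affix and an irregular form.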

StemmerI defines a standard interface for stemmers.
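As a rough illustration of what a shared stemmer interface buys, the sketch below uses a stand-in abstract class with a single `stem(token)` method. `StemmerSketch`, `IdentityStemmer`, `PluralStemmer`, and `stem_all` are hypothetical names, not part of NLTK; real code would presumably subclass StemmerI itself rather than this stand-in.

```python
from abc import ABC, abstractmethod


class StemmerSketch(ABC):
    """Hypothetical stand-in for a stemmer interface: one method, stem(token)."""

    @abstractmethod
    def stem(self, token: str) -> str:
        """Return the stem of a single token."""


class IdentityStemmer(StemmerSketch):
    """Returns every token unchanged."""

    def stem(self, token: str) -> str:
        return token


class PluralStemmer(StemmerSketch):
    """Strips a final "s" from tokens longer than three characters."""

    def stem(self, token: str) -> str:
        if len(token) > 3 and token.endswith("s"):
            return token[:-1]
        return token


def stem_all(stemmer: StemmerSketch, tokens: list[str]) -> list[str]:
    """Works with any object that implements the interface."""
    return [stemmer.stem(t) for t in tokens]


tokens = ["cats", "bus", "dogs"]
print(stem_all(IdentityStemmer(), tokens))  # ['cats', 'bus', 'dogs']
print(stem_all(PluralStemmer(), tokens))    # ['cat', 'bus', 'dog']
```

Because callers depend only on `stem`, one stemmer can be swapped for another without changing the calling code.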


Classes
  RegexpStemmer
A stemmer that uses regular expressions to identify morphological affixes.
  PorterStemmer
A word stemmer based on the Porter stemming algorithm.
  LancasterStemmer
A word stemmer based on the Lancaster stemming algorithm.
  WordnetStemmer
A stemmer that uses WordNet's built-in morphy function.
  RSLPStemmer
A stemmer for Portuguese.
  StemmerI
A processing interface for removing morphological affixes from words.
    Deprecated
  StemI
Use nltk.StemmerI instead.
  Regexp
Use nltk.RegexpStemmer instead.
  Porter
Use nltk.PorterStemmer instead.
  Lancaster
Use nltk.LancasterStemmer instead.
  Wordnet
Use nltk.WordnetStemmer instead.
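A short usage sketch for three of the non-deprecated classes listed above, assuming an NLTK installation in which they can be imported from `nltk.stem` and in which `RegexpStemmer` accepts a pattern plus a `min` length argument. The pattern, the dictionary keys, and the helper `stem_with_each` are illustrative choices, not part of the package. WordnetStemmer and RSLPStemmer are left out because they may need additional data files.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, RegexpStemmer

# Each class exposes the same stem(token) method.
stemmers = {
    "porter": PorterStemmer(),
    "lancaster": LancasterStemmer(),
    # Illustrative pattern: strip one of these endings from words
    # that are at least four characters long.
    "regexp": RegexpStemmer("ing$|s$|e$|able$", min=4),
}


def stem_with_each(word):
    """Return the stem produced by each stemmer for a single word."""
    return {name: stemmer.stem(word) for name, stemmer in stemmers.items()}


for word in ["running", "maximum", "cars"]:
    print(word, stem_with_each(word))
```

Running the same words through several stemmers is a quick way to see how aggressively each one strips affixes before choosing one for a task.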