nltk: NLTK -- the Natural Language Toolkit -- is a suite of open source
Python modules, data sets and tutorials supporting research and
development in natural language processing.
nltk.classify: Classes and interfaces for labeling tokens with category labels
(or class labels).
nltk.classify.api: Interfaces for labeling tokens with category labels (or class
labels).
nltk.classify.decisiontree: A classifier model that decides which label to assign to a token on
the basis of a tree structure, where branches correspond to
conditions on feature values, and leaves correspond to label
assignments.
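As an illustration of the idea behind nltk.classify.decisiontree (not its actual API), a decision tree over feature dictionaries can be sketched as nested nodes, where each internal node names a feature, each branch covers one feature value, and each leaf is a label; the features and tree below are made-up toy values.

```python
# Illustrative sketch: internal nodes are (feature_name, {value: subtree})
# pairs, leaves are plain label strings.
def classify(tree, features):
    """Walk from the root to a leaf by following the branch that
    matches each tested feature's value."""
    while not isinstance(tree, str):
        feature_name, branches = tree
        tree = branches[features[feature_name]]
    return tree

# A toy tree labeling words via two hypothetical boolean features.
tree = ("ends_in_s", {
    True:  ("capitalized", {True: "noun", False: "verb"}),
    False: "noun",
})

print(classify(tree, {"ends_in_s": True, "capitalized": False}))  # verb
```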
nltk.classify.mallet: A set of functions used to interface with the external Mallet machine
learning package.
nltk.classify.maxent: A classifier model based on the maximum entropy modeling framework.
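The scoring step of a maximum entropy (log-linear) model can be sketched in a few lines: each label's score is the sum of the weights of the active (feature, label) pairs, normalized with a softmax. This is a minimal sketch of the framework, not NLTK's trained classifier; the features, labels, and weights are toy values.

```python
import math

def maxent_probs(features, weights, labels):
    """P(label | features) proportional to exp(sum of weights
    for the active (feature, label) pairs)."""
    scores = {l: sum(weights.get((f, l), 0.0) for f in features)
              for l in labels}
    total = sum(math.exp(s) for s in scores.values())
    return {l: math.exp(s) / total for l, s in scores.items()}

# Hypothetical weights: words ending in "ed" lean toward "verb".
weights = {("ends_in_ed", "verb"): 2.0, ("capitalized", "noun"): 1.5}
probs = maxent_probs(["ends_in_ed"], weights, ["noun", "verb"])
print(probs)
```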
nltk.classify.megam: A set of functions used to interface with the external megam
maxent optimization package.
nltk.corpus.chat80: Chat-80 was a natural language system which allowed the user to
interrogate a Prolog knowledge base in the domain of world
geography.
nltk.corpus.reader.ycoe: Corpus reader for the York-Toronto-Helsinki Parsed Corpus of Old
English Prose (YCOE), a 1.5 million word syntactically-annotated
corpus of Old English prose texts.
nltk.data: Functions to find and load NLTK resource
files, such as corpora, grammars, and saved processing objects.
nltk.decorators: Decorator module by Michele Simionato <michelesimionato@libero.it>
Copyright Michele Simionato, distributed under the terms of the BSD License (see below).
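The problem this module addresses can be sketched with the standard library alone: a naive wrapper function discards the wrapped function's name and docstring, and `functools.wraps` (used here as a stdlib stand-in for the decorator module's fuller signature preservation) copies that metadata back.

```python
import functools

def traced(func):
    """A decorator that logs each call but, thanks to functools.wraps,
    keeps the wrapped function's name and docstring."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}{args}")
        return func(*args, **kwargs)
    return wrapper

@traced
def add(x, y):
    "Return x + y."
    return x + y

print(add(2, 3))       # prints "calling add(2, 3)", then returns 5
print(add.__name__)    # add, not wrapper
```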
nltk.detect: Functions for detecting a token's features.
nltk.draw: Tools for graphically displaying and interacting with the objects
and processing classes defined by the Toolkit.
nltk.misc.sort: This module provides a variety of list sorting algorithms, to
illustrate the many different algorithms (recipes) for solving a
problem, and how to analyze algorithms experimentally.
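In the spirit of that module, one of the classic algorithms can be sketched with an added comparison counter, so different sorts can be compared experimentally on the same input; this is an illustrative recipe, not the module's own code.

```python
def insertion_sort(items):
    """Sort a copy of `items`; return (sorted_list, comparison_count)
    so the algorithm's cost can be measured experimentally."""
    items = list(items)
    comparisons = 0
    for i in range(1, len(items)):
        j = i
        # Bubble items[j] left until it is in sorted position.
        while j > 0:
            comparisons += 1
            if items[j - 1] <= items[j]:
                break
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return items, comparisons

print(insertion_sort([3, 1, 2]))  # ([1, 2, 3], 3)
```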
nltk.sem: This package contains classes for representing semantic structure
in formulas of first-order logic and for evaluating such formulas
in set-theoretic models.
nltk.sem.evaluate: This module provides data structures for representing first-order
models.
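The core idea of evaluating first-order formulas in a set-theoretic model can be sketched independently of NLTK's own classes: a model pairs a domain with the extension of each predicate, and a recursive evaluator handles connectives and quantifiers. The formula encoding (nested tuples) and the toy model below are illustrative assumptions, not nltk.sem's representation.

```python
def evaluate(formula, model, assignment):
    """Evaluate a tuple-encoded first-order formula in a model,
    under a variable assignment."""
    op = formula[0]
    if op == "pred":                 # ("pred", name, term)
        _, name, term = formula
        # Resolve variables via the assignment; constants denote themselves.
        return assignment.get(term, term) in model[name]
    if op == "not":
        return not evaluate(formula[1], model, assignment)
    if op == "and":
        return (evaluate(formula[1], model, assignment)
                and evaluate(formula[2], model, assignment))
    if op == "exists":               # ("exists", var, body)
        _, var, body = formula
        return any(evaluate(body, model, {**assignment, var: e})
                   for e in model["domain"])
    raise ValueError(f"unknown operator: {op}")

model = {"domain": {"fido", "rex"},
         "dog": {"fido", "rex"}, "barks": {"fido"}}
# "Some dog barks."
f = ("exists", "x", ("and", ("pred", "dog", "x"), ("pred", "barks", "x")))
print(evaluate(f, model, {}))  # True
```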
nltk.sem.logic: A version of first-order predicate logic, built on top of the
untyped lambda calculus.
nltk.sem.relextract: Code for extracting relational triples from the ieer and conll2002
corpora.
nltk.sem.util: Utility functions for batch-processing sentences: parsing and
extraction of the semantic representation of the root node of the
syntax tree, followed by evaluation of the semantic representation
in a first-order model.
nltk.stem: Interfaces used to remove morphological affixes from words, leaving
only the word stem.
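A minimal suffix-stripping sketch conveys what a stemmer does; the real NLTK stemmers (such as the Porter stemmer) apply far more elaborate, ordered rewrite rules. The suffix list and minimum stem length here are toy choices.

```python
# Suffixes checked in order; only strip when a reasonably long stem remains.
SUFFIXES = ["ing", "ly", "ed", "s"]

def stem(word):
    """Remove the first matching suffix, leaving at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(stem("walking"))  # walk
print(stem("cats"))     # cat
print(stem("is"))       # is (too short to strip)
```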
nltk.tag.crf: An interface to Mallet's Linear Chain Conditional Random Field
(LC-CRF) implementation.
nltk.tag.hmm: Hidden Markov Models (HMMs), largely used to assign the correct
label sequence to sequential data or to assess the probability of a
given label and data sequence.
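Decoding the best label sequence from an HMM is typically done with the Viterbi algorithm, which this sketch implements over made-up states and probabilities; it illustrates the technique only, not nltk.tag.hmm's interface.

```python
def viterbi(words, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for `words` under
    the given start, transition, and emission probabilities."""
    # best[state] = (probability, path) for the current position.
    best = {s: (start_p[s] * emit_p[s].get(words[0], 0.0), [s])
            for s in states}
    for word in words[1:]:
        best = {
            s: max(
                ((p * trans_p[prev][s] * emit_p[s].get(word, 0.0),
                  path + [s])
                 for prev, (p, path) in best.items()),
                key=lambda t: t[0],
            )
            for s in states
        }
    return max(best.values(), key=lambda t: t[0])[1]

# Toy two-state tagger: N = noun, V = verb.
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"dogs": 0.6, "bark": 0.1},
          "V": {"dogs": 0.1, "bark": 0.7}}
print(viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p))
```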
nltk.tag.sequential: Classes for tagging sentences sequentially, left to right.
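The left-to-right idea can be sketched with a backoff chain: tag each word in turn, consulting a (previous tag, word) context table first, then a unigram table, then a default tag. The tables below are toy values and the function is illustrative, not NLTK's tagger classes.

```python
def tag(words, context, unigram, default="NN"):
    """Tag words left to right; each decision may depend on the
    previously assigned tag."""
    tagged, prev = [], "<s>"
    for word in words:
        t = context.get((prev, word)) or unigram.get(word) or default
        tagged.append((word, t))
        prev = t
    return tagged

unigram = {"can": "MD", "fish": "NN"}
context = {("MD", "fish"): "VB"}   # "fish" after a modal reads as a verb

print(tag(["can", "fish"], context, unigram))
# [('can', 'MD'), ('fish', 'VB')]
```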
nltk.tokenize.regexp: Tokenizers that divide strings into substrings using regular
expressions that can match either tokens or separators between
tokens.
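Both strategies can be sketched with the standard `re` module: one pattern matches the tokens themselves, the other matches the gaps between them. The patterns below are illustrative, not the module's defaults.

```python
import re

text = "Good muffins cost $3.88 in New York."

# 1. Pattern matches tokens: currency amounts, words, or punctuation.
tokens = re.findall(r"\$\d+(?:\.\d+)?|\w+|[^\w\s]", text)

# 2. Pattern matches separators: split on runs of whitespace.
by_gaps = re.split(r"\s+", text)

print(tokens)   # ['Good', 'muffins', 'cost', '$3.88', 'in', 'New', 'York', '.']
print(by_gaps)  # ['Good', 'muffins', 'cost', '$3.88', 'in', 'New', 'York.']
```

Note how the choice matters: matching tokens splits off the final period, while splitting on gaps leaves it attached to "York.".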
nltk.wordnet.browse: A text-mode browser for the NLTK Wordnet interface. See also
the NLTK Wordnet graphical browser in nltk_contrib.wordnet.
nltk.wordnet.browser.browserver: BrowServer is a server for browsing the NLTK Wordnet database. It
first launches a browser client to be used for browsing, then
serves the requests of that client and possibly others.