Package nltk.corpus.reader

Corpus readers.

Classes
  AlpinoCorpusReader
Reader for the Alpino Dutch Treebank.
  BNCCorpusReader
Corpus reader for the XML version of the British National Corpus.
  BracketParseCorpusReader
Reader for corpora that consist of parenthesis-delineated parse trees.
  CMUDictCorpusReader
  CategorizedCorpusReader
A mixin class used to aid in the implementation of corpus readers for categorized corpora.
  CategorizedPlaintextCorpusReader
A reader for plaintext corpora whose documents are divided into categories based on their file identifiers.
  CategorizedTaggedCorpusReader
A reader for part-of-speech tagged corpora whose documents are divided into categories based on their file identifiers.
  ChunkedCorpusReader
Reader for chunked (and optionally tagged) corpora.
  ConllChunkCorpusReader
A ConllCorpusReader whose data file contains three columns: words, pos, and chunk.
  ConllCorpusReader
A corpus reader for CoNLL-style files.
  CorpusReader
A base class for corpus reader classes, each of which can be used to read a specific corpus format.
  IEERCorpusReader
  IndianCorpusReader
Reader for the Indian Language POS-Tagged Corpus.
  MacMorphoCorpusReader
A corpus reader for the MAC_MORPHO corpus.
  NPSChatCorpusReader
  PPAttachmentCorpusReader
Reader for prepositional phrase attachment data, where each line holds the fields: sentence_id verb noun1 preposition noun2 attachment.
  PlaintextCorpusReader
Reader for corpora that consist of plaintext documents.
  PropbankCorpusReader
Corpus reader for the PropBank corpus, which augments the Penn Treebank with information about the predicate-argument structure of every verb instance.
  RTECorpusReader
Corpus reader for corpora in RTE challenges.
  SensevalCorpusReader
  SinicaTreebankCorpusReader
Reader for the Sinica Treebank.
  StringCategoryCorpusReader
  SyntaxCorpusReader
An abstract base class for reading corpora consisting of syntactically parsed text.
  TaggedCorpusReader
Reader for simple part-of-speech tagged corpora.
  TimitCorpusReader
Reader for the TIMIT corpus (or any other corpus with the same file layout and use of file formats).
  ToolboxCorpusReader
  VerbnetCorpusReader
  WordListCorpusReader
List of words, one per line.
  XMLCorpusReader
Corpus reader for corpora whose documents are XML files.
  YCOECorpusReader
Corpus reader for the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE), a 1.5-million-word, syntactically annotated corpus of Old English prose texts.
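
The PPAttachmentCorpusReader entry above names six fields per record: sentence_id, verb, noun1, preposition, noun2, attachment. The sketch below shows one way such a line could be split into named fields using only the standard library. It is not the reader's own code; PPAttachment, parse_pp_line, and the sample record are hypothetical names and data chosen for illustration, and whitespace separation is an assumption.

```python
from collections import namedtuple

# Field order follows the PPAttachmentCorpusReader summary line.
# The tuple name and helper below are hypothetical, not part of NLTK.
PPAttachment = namedtuple(
    "PPAttachment", "sentence_id verb noun1 preposition noun2 attachment"
)


def parse_pp_line(line):
    """Split one whitespace-separated record into its six named fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d: %r" % (len(fields), line))
    return PPAttachment(*fields)


# Illustrative record only; all values are kept as strings.
record = parse_pp_line("0 join board as director V")
print(record.verb, record.preposition, record.attachment)  # prints: join as V
```

Keeping every field as a string avoids guessing at types; a caller that needs a numeric sentence_id can convert it explicitly.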
Functions
  find_corpus_files(root, regexp)
  tagged_treebank_para_block_reader(stream)
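
The signature find_corpus_files(root, regexp) suggests a search for files under a root directory whose names match a regular expression. The sketch below illustrates that idea with the standard library only. It is an assumption about the general technique, not NLTK's implementation: find_files_sketch is a hypothetical name, and the choices to match against the whole relative path, to use forward slashes, and to return a sorted list are made here for illustration.

```python
import os
import re
import tempfile


def find_files_sketch(root, regexp):
    """Return sorted root-relative paths (with '/' separators) that fully match regexp."""
    matches = []
    for dirpath, _subdirs, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for name in filenames:
            rel = name if rel_dir == "." else os.path.join(rel_dir, name)
            rel = rel.replace(os.sep, "/")  # normalize separators across platforms
            if re.fullmatch(regexp, rel):
                matches.append(rel)
    return sorted(matches)


# Build a small throwaway directory tree to search.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "news"))
    for rel in ("a.txt", "notes.md", "news/b.txt"):
        with open(os.path.join(root, *rel.split("/")), "w") as handle:
            handle.write("sample\n")
    found = find_files_sketch(root, r".*\.txt")

print(found)  # prints: ['a.txt', 'news/b.txt']
```

Matching the full relative path rather than the bare file name lets a single pattern select files by subdirectory as well as by extension.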