Package nltk :: Package corpus :: Package reader :: Module bracket_parse :: Class BracketParseCorpusReader

Class BracketParseCorpusReader

         object --+        
                  |        
   api.CorpusReader --+    
                      |    
util.SyntaxCorpusReader --+
                          |
                         BracketParseCorpusReader

Reader for corpora that consist of parenthesis-delineated parse trees.
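
To make "parenthesis-delineated parse trees" concrete, here is a minimal sketch of reading one such tree into nested lists. It is not this class's implementation: `parse_bracketed` is a hypothetical helper written only for illustration, and it assumes every node has a label (so the label-less outer brackets found in some treebank files are not handled).

```python
import re


def parse_bracketed(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    # Tokens are "(", ")" or any run of characters without spaces/parens.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # leaf word
                pos += 1
        pos += 1  # consume the closing ")"
        return [label] + children

    return read_node()


tree = parse_bracketed("(S (NP (DT the) (NN cat)) (VP (VBD sat)))")
print(tree)
```

Each list starts with a constituent label or part-of-speech tag, followed by its children; leaves are plain strings.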

Instance Methods
 
__init__(self, root, files, comment_char=None, detect_blocks='unindented_paren', encoding=None, tag_mapping_function=None)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
 
_read_block(self, stream)
 
_normalize(self, t)
 
_parse(self, t)
 
_tag(self, t, simplify_tags=False)
 
_word(self, t)

Inherited from util.SyntaxCorpusReader: parsed_sents, raw, sents, tagged_sents, tagged_words, words

Inherited from api.CorpusReader: __repr__, abspath, abspaths, encoding, files, open

Inherited from api.CorpusReader (private): _get_root

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__

    Block Readers
    Deprecated since 0.8

Inherited from util.SyntaxCorpusReader: parsed, read, tagged, tokenized

    Deprecated since 0.9.1

Inherited from api.CorpusReader: filenames

Inherited from api.CorpusReader (private): _get_items

Instance Variables

Inherited from api.CorpusReader (private): _encoding, _files, _root

Properties

Inherited from api.CorpusReader: root

Inherited from object: __class__

    Deprecated since 0.9.1

Inherited from api.CorpusReader: items

Method Details

__init__(self, root, files, comment_char=None, detect_blocks='unindented_paren', encoding=None, tag_mapping_function=None)
(Constructor)

Initializes a reader for the bracketed parse files under root that are selected by files.

Parameters:
  • root - The root directory for this corpus.
  • files - A list or regexp specifying the files in this corpus.
  • comment_char - A character that, when it appears at the start of a line, marks the rest of that line as a comment.
  • detect_blocks - The method used to find blocks in the corpus: either 'unindented_paren' (every unindented parenthesis starts a new parse) or 'sexpr' (brackets are matched).
Overrides: api.CorpusReader.__init__
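
The two detect_blocks strategies and the comment_char filter described above can be sketched in plain Python. This is an illustration of the ideas only, not the reader's own code: `strip_comments`, `blocks_unindented_paren`, and `blocks_sexpr` are hypothetical helper names, and the sample text is invented.

```python
def strip_comments(lines, comment_char=None):
    """Drop lines that begin with comment_char (keep everything if None)."""
    if comment_char is None:
        return list(lines)
    return [ln for ln in lines if not ln.startswith(comment_char)]


def blocks_unindented_paren(lines):
    """Start a new block at every line whose first character is '('."""
    blocks, current = [], []
    for line in lines:
        if line.startswith("(") and current:
            blocks.append("\n".join(current))
            current = []
        if line.strip():  # skip blank lines
            current.append(line.rstrip())
    if current:
        blocks.append("\n".join(current))
    return blocks


def blocks_sexpr(text):
    """Split text into balanced top-level parenthesized expressions."""
    blocks, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced ')' at offset %d" % i)
            depth -= 1
            if depth == 0:
                blocks.append(text[start:i + 1])
    if depth != 0:
        raise ValueError("unbalanced '('")
    return blocks


sample = """\
* a comment line
(S
  (NP (DT the) (NN cat))
  (VP (VBD sat)))
(S (NP (PRP it)) (VP (VBD slept)))
"""
lines = strip_comments(sample.splitlines(), comment_char="*")
unindented = blocks_unindented_paren(lines)
matched = blocks_sexpr("\n".join(lines))
print(len(unindented), len(matched))
```

On well-formed input like this sample both strategies find the same two parses. They differ when a file puts several trees on one line or indents the start of a tree: only bracket matching copes with those layouts, while the unindented-parenthesis rule avoids counting brackets at all.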

_read_block(self, stream)

Overrides: util.SyntaxCorpusReader._read_block

_parse(self, t)

Overrides: util.SyntaxCorpusReader._parse

_tag(self, t, simplify_tags=False)

Overrides: util.SyntaxCorpusReader._tag

_word(self, t)

Overrides: util.SyntaxCorpusReader._word
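
This page does not document what _word and _tag return. As a rough sketch of the kind of extraction a words or tagged-words view needs, the helpers below pull the (tag word) leaves out of one bracketed block with a regular expression. Everything here is an assumption made for illustration: `TAGGED_LEAF`, `words`, `tagged_words`, and the `tag_map` argument are hypothetical names, and `tag_map` is only a stand-in for whatever simplify_tags and tag_mapping_function do in the real class.

```python
import re

# A leaf is "(TAG word)" where neither part contains spaces or parentheses.
TAGGED_LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def words(block):
    """Return the leaf words of a bracketed parse, in order."""
    return [word for _tag, word in TAGGED_LEAF.findall(block)]


def tagged_words(block, tag_map=None):
    """Return (word, tag) pairs; optionally rewrite each tag via tag_map."""
    pairs = [(word, tag) for tag, word in TAGGED_LEAF.findall(block)]
    if tag_map is not None:
        pairs = [(word, tag_map(tag)) for word, tag in pairs]
    return pairs


block = "(S (NP (DT the) (NN cat)) (VP (VBD sat)))"
plain = words(block)
tagged = tagged_words(block)
coarse = tagged_words(block, tag_map=lambda tag: tag[:1])
print(plain, tagged, coarse)
```

Working on the raw block text keeps these views cheap, since no tree has to be built just to list the tokens.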