Package nltk :: Package corpus :: Package reader :: Module chunked :: Class ChunkedCorpusView

Class ChunkedCorpusView

               object --+        
                        |        
util.AbstractLazySequence --+    
                            |    
  util.StreamBackedCorpusView --+
                                |
                               ChunkedCorpusView

Instance Methods

__init__(self, filename, encoding, tagged, group_by_sent, group_by_para, chunked, str2chunktree, sent_tokenizer, para_block_reader)
Create a new corpus view, based on the file filename, and read with block_reader.

read_block(self, stream)
Read a block from the input stream. Returns: list of any

_untag(self, tree)

Inherited from util.StreamBackedCorpusView: __add__, __getitem__, __len__, __mul__, __radd__, __rmul__, close, iterate_from

Inherited from util.StreamBackedCorpusView (private): _open

Inherited from util.AbstractLazySequence: __cmp__, __contains__, __hash__, __iter__, __repr__, count, index

Inherited from object: __delattr__, __getattribute__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__

Class Variables

Inherited from util.AbstractLazySequence (private): _MAX_REPR_SIZE

Instance Variables

Properties

Inherited from util.StreamBackedCorpusView: filename

Inherited from object: __class__

Method Details

__init__(self, filename, encoding, tagged, group_by_sent, group_by_para, chunked, str2chunktree, sent_tokenizer, para_block_reader)
(Constructor)

Create a new corpus view, based on the file filename, and read with block_reader. See the class documentation for more information.

Parameters:
  • filename - The path to the file that is read by this corpus view. filename can either be a string or a PathPointer.
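  • startpos - The file position at which the view will start reading. This can be used to skip over preface sections. (Described in the inherited documentation; startpos does not appear in this constructor's signature.)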
  • encoding - The unicode encoding that should be used to read the file's contents. If no encoding is specified, then the file's contents will be read as a non-unicode string (i.e., a str).
Overrides: util.StreamBackedCorpusView.__init__
(inherited documentation)
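
The inherited documentation above does not describe the group_by_sent and group_by_para flags. As a rough illustration only, the sketch below shows one plausible way such flags could shape the items a view yields: flat words, a list of sentences, or paragraphs of sentences. The helper shape_block and the sample data are hypothetical and are not part of NLTK; the flag semantics are an assumption based on the parameter names.

```python
def shape_block(paragraphs, group_by_sent, group_by_para):
    """Flatten or nest tokens according to two grouping flags.

    `paragraphs` is a list of paragraphs; each paragraph is a list of
    sentences; each sentence is a list of tokens.
    Hypothetical helper for illustration, not an NLTK function.
    """
    block = []
    for para in paragraphs:
        shaped_para = []
        for sent in para:
            if group_by_sent:
                shaped_para.append(sent)   # keep the sentence as a single item
            else:
                shaped_para.extend(sent)   # splice the tokens in directly
        if group_by_para:
            block.append(shaped_para)      # keep the paragraph as a single item
        else:
            block.extend(shaped_para)      # splice the paragraph's items in
    return block


# Assumed sample data: two paragraphs, three sentences in total.
paras = [[["the", "cat"], ["sat"]], [["dogs", "bark"]]]

words = shape_block(paras, group_by_sent=False, group_by_para=False)
sents = shape_block(paras, group_by_sent=True, group_by_para=False)
nested = shape_block(paras, group_by_sent=True, group_by_para=True)
```

With both flags off the result is a flat list of tokens; turning on group_by_sent alone gives a list of sentences; turning on both preserves the paragraph nesting.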

read_block(self, stream)

Read a block from the input stream.

Parameters:
  • stream - an input stream
Returns: list of any - a block of tokens from the input stream
Overrides: util.StreamBackedCorpusView.read_block
(inherited documentation)
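
The read_block contract (take a stream, return a list of tokens) can be sketched without NLTK. The toy function below is an assumption-laden stand-in, not the library's implementation: it treats a blank-line-delimited paragraph as one block and splits it on whitespace, and the driver iterate_tokens is a hypothetical loop showing how a caller might pull blocks until the stream is exhausted.

```python
import io


def read_block(stream):
    """Read one blank-line-delimited paragraph and return its tokens.

    Returns an empty list at end of stream.
    Toy stand-in for illustration, not NLTK's read_block.
    """
    lines = []
    while True:
        line = stream.readline()
        if not line:              # end of stream
            break
        if not line.strip():      # blank line
            if lines:
                break             # the current paragraph is complete
            continue              # skip blank lines before a paragraph
        lines.append(line)
    return " ".join(lines).split()


def iterate_tokens(stream):
    """Yield tokens block by block until read_block returns nothing."""
    while True:
        block = read_block(stream)
        if not block:
            return
        yield from block


stream = io.StringIO("the cat\nsat\n\ndogs bark\n")
first = read_block(stream)            # tokens of the first paragraph
rest = list(iterate_tokens(stream))   # tokens of everything that remains
```

Reading one block at a time is what lets a stream-backed view stay lazy: only the block being consumed needs to be in memory.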