__init__(self, corpus_file, encoding, tagged, group_by_sent, group_by_para,
         sep, word_tokenizer, sent_tokenizer, para_block_reader,
         tag_mapping_function=None)
(Constructor)
Create a new corpus view, based on the file filename, and
read with block_reader. See the class documentation for
more information.
- Parameters:
filename - The path to the file that is read by this corpus view.
filename can either be a string or a PathPointer.
startpos - The file position at which the view will start reading. This can
be used to skip over preface sections.
encoding - The Unicode encoding that should be used to read the file's
contents. If no encoding is specified, then the file's contents
will be read as a non-unicode string (i.e., a str).
- Overrides:
util.StreamBackedCorpusView.__init__
- (inherited documentation)
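
The roles of filename, startpos, encoding, and block_reader described above can be illustrated with a small standalone sketch. This is not NLTK's implementation: read_blocks and line_block_reader are hypothetical names introduced only for illustration, and the sketch assumes Python 3, where reading without an encoding yields bytes rather than a Python 2 str.

```python
import codecs
import os
import tempfile


def read_blocks(filename, block_reader, startpos=0, encoding=None):
    """Hypothetical helper (not part of NLTK): yield the blocks that
    block_reader produces, starting at byte offset startpos."""
    with open(filename, "rb") as raw:
        raw.seek(startpos)  # skip over a preface section
        # With an encoding, wrap the byte stream so reads return decoded
        # text; without one, block_reader sees the raw byte stream.
        stream = codecs.getreader(encoding)(raw) if encoding else raw
        while True:
            block = block_reader(stream)
            if not block:  # in this toy, an empty block ends the read
                break
            yield block


def line_block_reader(stream):
    """Toy block reader: one block is the list of tokens on one line."""
    return stream.readline().split()


# Build a small file with a preface line followed by tagged text.
preface = "HEADER: skip me\n"
body = "The/DT cat/NN\nsat/VBD\n"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write((preface + body).encode("utf-8"))

blocks = list(read_blocks(tmp.name, line_block_reader,
                          startpos=len(preface.encode("utf-8")),
                          encoding="utf-8"))
os.remove(tmp.name)
print(blocks)  # [['The/DT', 'cat/NN'], ['sat/VBD']]
```

Because startpos is a byte offset, the sketch measures the preface after encoding it. A real corpus view also caches file positions for random access, which this sketch leaves out.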