Corpus reader for the PropBank corpus, which augments the Penn
Treebank with information about the predicate-argument structure of every
verb instance. The corpus consists of two parts: the predicate-argument
annotations themselves, and a set of frameset files
which define the argument labels used by the annotations, on a per-verb
basis. Each frameset file contains one or more predicates, such
as 'turn' or 'turn_on', each of which is
divided into coarse-grained word senses called rolesets. For each roleset, the frameset
file provides descriptions of the argument roles, along with
examples.
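The predicate / roleset / role hierarchy described above can be sketched with a small XML sample. This is only an illustration: the element and attribute names (`frameset`, `predicate`, `roleset`, `role`, `n`, `descr`), the sample content, and the helper `summarize_frameset` are assumptions made for this example, not something this reader's documentation specifies.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated frameset content. Real frameset files are
# larger and may use additional elements and attributes.
SAMPLE_FRAMESET = """
<frameset>
  <predicate lemma="turn">
    <roleset id="turn.01" name="rotate">
      <roles>
        <role n="0" descr="turner"/>
        <role n="1" descr="thing turning"/>
      </roles>
      <example><text>He turned the wheel.</text></example>
    </roleset>
  </predicate>
  <predicate lemma="turn_on">
    <roleset id="turn.13" name="activate">
      <roles>
        <role n="0" descr="activator"/>
        <role n="1" descr="thing activated"/>
      </roles>
      <example><text>She turned on the lamp.</text></example>
    </roleset>
  </predicate>
</frameset>
"""


def summarize_frameset(xml_text):
    """Map each predicate lemma to {roleset id: {role number: description}}."""
    root = ET.fromstring(xml_text)
    summary = {}
    for predicate in root.findall("predicate"):
        rolesets = {}
        for roleset in predicate.findall("roleset"):
            # Collect the argument roles defined for this word sense.
            roles = {
                role.get("n"): role.get("descr")
                for role in roleset.findall("roles/role")
            }
            rolesets[roleset.get("id")] = roles
        summary[predicate.get("lemma")] = rolesets
    return summary


summary = summarize_frameset(SAMPLE_FRAMESET)
for lemma, rolesets in summary.items():
    print(lemma, rolesets)
```

Each predicate maps to its rolesets, and each roleset maps to its numbered argument roles, mirroring the per-verb organisation of the frameset files.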
__init__(self, root, propfile, framefiles='', verbsfile=None, parse_filename_xform=None, parse_corpus=None, encoding=None)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
raw(self, files=None)
Returns:
the text contents of the given files, as a single string.
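The "single string" behaviour can be pictured as reading each file in order and joining the results. The helper below, `concat_files`, is a hypothetical stand-in written for this illustration. It does not reproduce the reader's own implementation and it uses temporary files rather than corpus data.

```python
import os
import tempfile


def concat_files(paths, encoding="utf-8"):
    """Return the contents of the files in `paths`, joined in order."""
    parts = []
    for path in paths:
        with open(path, encoding=encoding) as handle:
            parts.append(handle.read())
    return "".join(parts)


# Build two small throwaway files so the sketch runs without any corpus data.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "first\n"), ("b.txt", "second\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        paths.append(path)
    combined = concat_files(paths)

print(repr(combined))
```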
lines(self)
Returns:
a corpus view that acts as a list of strings, one for each line in
the predicate-argument annotation file.
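Each string returned by lines() is one line of the annotation file. The sketch below shows one way such a line might be split into fields. The field layout (parse file, sentence number, word number, tagger, roleset, inflection, then `location-LABEL` argument entries), the sample line, and the helper `parse_annotation_line` are all assumptions made for illustration. Check them against the actual annotation file before relying on them.

```python
def parse_annotation_line(line):
    """Split one whitespace-delimited annotation line into named fields."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError("expected six header fields and at least one argument")
    parse_file, sentence, word, tagger, roleset, inflection = fields[:6]
    arguments = []
    for item in fields[6:]:
        # Split on the first hyphen only, so labels such as ARGM-MOD stay whole.
        location, label = item.split("-", 1)
        arguments.append((location, label))
    return {
        "parse_file": parse_file,
        "sentence": int(sentence),
        "word": int(word),
        "tagger": tagger,
        "roleset": roleset,
        "inflection": inflection,
        "arguments": arguments,
    }


# Illustrative line in the assumed layout.
SAMPLE_LINE = (
    "wsj/00/wsj_0001.mrg 0 8 gold join.01 vf--a "
    "0:2-ARG0 7:0-ARGM-MOD 8:0-rel 9:1-ARG1"
)

record = parse_annotation_line(SAMPLE_LINE)
print(record["roleset"], record["arguments"])
```

The roleset identifier recovered here is what ties an annotation back to the argument-role descriptions in the frameset files.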
verbs(self)
Returns:
a corpus view that acts as a list of all verb lemmas in this corpus
(from the verbs.txt file).
Inherited from api.CorpusReader: __repr__, abspath, abspaths, encoding, files, open
Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__
Inherited from api.CorpusReader:
filenames