Corpus reader for the PropBank corpus, which augments the Penn
Treebank with information about the predicate-argument structure of every
verb instance. The corpus consists of two parts: the predicate-argument
annotations themselves, and a set of frameset files that define the
argument labels used by the annotations on a per-verb basis. Each frameset
file contains one or more predicates, such as 'turn' or 'turn_on', each of
which is divided into coarse-grained word senses called rolesets. For each
roleset, the frameset file provides descriptions of the argument roles,
along with examples.
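
The predicate / roleset / role nesting described above can be sketched with the standard library alone. This is a minimal illustration, not the reader's own implementation: it assumes the usual frameset XML layout (predicate elements containing roleset elements, each with a roles list), and the sample content, roleset identifiers, role descriptions, and the helper name summarize_frameset are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed frameset content (invented for illustration;
# real frameset files carry more attributes, notes, and examples).
SAMPLE_FRAMESET = """
<frameset>
  <predicate lemma="turn">
    <roleset id="turn.01" name="rotate">
      <roles>
        <role n="0" descr="causer of rotation"/>
        <role n="1" descr="thing rotating"/>
      </roles>
      <example><text>The wheel turned.</text></example>
    </roleset>
  </predicate>
  <predicate lemma="turn_on">
    <roleset id="turn.02" name="activate">
      <roles>
        <role n="0" descr="activator"/>
        <role n="1" descr="thing activated"/>
      </roles>
      <example><text>She turned on the lamp.</text></example>
    </roleset>
  </predicate>
</frameset>
"""


def summarize_frameset(xml_text):
    """Map each predicate lemma to {roleset id: [(role number, description), ...]}."""
    root = ET.fromstring(xml_text.strip())
    summary = {}
    for predicate in root.findall("predicate"):
        rolesets = {}
        for roleset in predicate.findall("roleset"):
            # Each role carries its argument number and a short description.
            roles = [
                (role.get("n"), role.get("descr"))
                for role in roleset.findall("roles/role")
            ]
            rolesets[roleset.get("id")] = roles
        summary[predicate.get("lemma")] = rolesets
    return summary


summary = summarize_frameset(SAMPLE_FRAMESET)
for lemma, rolesets in summary.items():
    for roleset_id, roles in rolesets.items():
        print(lemma, roleset_id, roles)
```

The nested dictionary mirrors the hierarchy in the description: one frameset file, several predicates, and one or more rolesets per predicate.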
__init__(self, root, propfile, framefiles='', verbsfile=None, parse_filename_xform=None, parse_corpus=None, encoding=None)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
raw(self, files=None)
Returns: the text contents of the given files, as a single string.
lines(self)
Returns: a corpus view that acts as a list of strings, one for each line in
the predicate-argument annotation file.
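
As a rough illustration of what one string from lines() might hold, the sketch below splits a single annotation line into fields. It assumes the whitespace-separated PropBank layout (file, sentence number, word number, tagger, roleset, inflection, then pointer-label pairs); the sample line and the helper name split_annotation_line are illustrative and are not part of this reader's API.

```python
def split_annotation_line(line):
    """Split one predicate-argument annotation line into named fields."""
    pieces = line.split()
    if len(pieces) < 7:
        raise ValueError("expected at least 7 whitespace-separated fields")
    fileid, sentnum, wordnum, tagger, roleset, inflection = pieces[:6]
    # Remaining fields look like "<pointer>-<label>", e.g. "0:2-ARG0".
    # Split on the first hyphen only, since labels such as ARGM-MOD
    # contain hyphens of their own.
    arguments = [tuple(piece.split("-", 1)) for piece in pieces[6:]]
    return {
        "fileid": fileid,
        "sentnum": int(sentnum),
        "wordnum": int(wordnum),
        "tagger": tagger,
        "roleset": roleset,
        "inflection": inflection,
        "arguments": arguments,
    }


# Illustrative sample line, assumed to follow the layout described above.
SAMPLE_LINE = (
    "wsj/00/wsj_0001.mrg 0 8 gold join.01 vf--a "
    "0:2-ARG0 7:0-ARGM-MOD 8:0-rel 9:1-ARG1"
)

parsed = split_annotation_line(SAMPLE_LINE)
print(parsed["roleset"], parsed["arguments"])
```

The roleset field (here join.01) is what ties an annotation line back to a roleset defined in the frameset files.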
verbs(self)
Returns: a corpus view that acts as a list of all verb lemmas in this corpus
(from the verbs.txt file).
Inherited from api.CorpusReader: __repr__, abspath, abspaths, encoding, files, open
Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__
Inherited from api.CorpusReader: filenames