Class PropbankCorpusReader

      object --+    
               |    
api.CorpusReader --+
                   |
                  PropbankCorpusReader

Corpus reader for the PropBank corpus, which augments the Penn Treebank with information about the predicate-argument structure of every verb instance. The corpus consists of two parts: the predicate-argument annotations themselves, and a set of frameset files that define, on a per-verb basis, the argument labels used by the annotations. Each frameset file contains one or more predicates, such as 'turn' or 'turn_on', each of which is divided into coarse-grained word senses called rolesets. For each roleset, the frameset file provides descriptions of the argument roles, along with examples.

Instance Methods

__init__(self, root, propfile, framefiles='', verbsfile=None, parse_filename_xform=None, parse_corpus=None, encoding=None)
Creates a corpus reader for the PropBank annotations found under root.

raw(self, files=None)
Returns: the text contents of the given files, as a single string.

instances(self)
Returns: a corpus view that acts as a list of PropbankInstance objects, one for each verb in the corpus.

lines(self)
Returns: a corpus view that acts as a list of strings, one for each line in the predicate-argument annotation file.

roleset(self, roleset_id)
Returns: the xml description for the given roleset.

verbs(self)
Returns: a corpus view that acts as a list of all verb lemmas in this corpus (from the verbs.txt file).

_read_instance_block(self, stream)

Inherited from api.CorpusReader: __repr__, abspath, abspaths, encoding, files, open

Inherited from api.CorpusReader (private): _get_root

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__

Deprecated since 0.9.1

Inherited from api.CorpusReader: filenames

Inherited from api.CorpusReader (private): _get_items

Instance Variables

Inherited from api.CorpusReader (private): _encoding, _files, _root

Properties

Inherited from api.CorpusReader: root

Inherited from object: __class__

Deprecated since 0.9.1

Inherited from api.CorpusReader: items

Method Details

__init__(self, root, propfile, framefiles='', verbsfile=None, parse_filename_xform=None, parse_corpus=None, encoding=None)
(Constructor)

Initializes a new PropBank corpus reader.

Parameters:

root - The root directory for this corpus.
propfile - The name of the file containing the predicate-argument annotations (relative to root).
framefiles - A list or regexp specifying the frameset files for this corpus.
parse_filename_xform - A transform that should be applied to the filenames in this corpus. This should be a function of one argument (a filename) that returns a string (the new filename).
parse_corpus - The corpus containing the parse trees corresponding to this corpus. These parse trees are necessary to resolve the tree pointers used by propbank.
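
As a minimal sketch of what a parse_filename_xform callable might look like, the function below strips a leading wsj/NN/ directory prefix so that a filename recorded in the annotation file matches a flatter parse-corpus layout. The name strip_section_prefix and the sample path are hypothetical; whether any prefix needs stripping depends on how the parse corpus is laid out locally.

```python
import re


def strip_section_prefix(filename):
    """Hypothetical transform: drop a leading 'wsj/NN/' directory prefix."""
    # One argument in (a filename), one string out (the new filename),
    # which is the contract described for parse_filename_xform.
    return re.sub(r"^wsj/\d\d/", "", filename)


# The sample path is illustrative only.
new_name = strip_section_prefix("wsj/00/wsj_0001.mrg")
print(new_name)
```

A function like this would be passed as the parse_filename_xform argument; filenames without the prefix pass through unchanged.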

Overrides: api.CorpusReader.__init__

raw(self, files=None)

Returns: the text contents of the given files, as a single string.

instances(self)

Returns: a corpus view that acts as a list of PropbankInstance objects, one for each verb in the corpus.
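
To illustrate the kind of record each instance is built from, the sketch below splits one annotation line into named fields. It does not use NLTK. The field order (file, sentence number, word number, tagger, roleset, inflection, then pointer-label arguments) and the sample line are assumptions about the annotation file format, and the AnnotationLine type and parse_annotation_line function are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class AnnotationLine(NamedTuple):
    """Hypothetical container for one predicate-argument annotation line."""
    fileid: str
    sentnum: int
    wordnum: int
    tagger: str
    roleset: str
    inflection: str
    predicate: str                    # tree pointer of the verb itself
    arguments: List[Tuple[str, str]]  # (tree pointer, argument label) pairs


def parse_annotation_line(line):
    pieces = line.split()
    if len(pieces) < 7:
        raise ValueError("expected at least 7 fields: %r" % line)
    fileid, sentnum, wordnum, tagger, roleset, inflection = pieces[:6]

    # Assumption: exactly one field ends in '-rel' and marks the predicate.
    rel = [p for p in pieces[6:] if p.endswith("-rel")]
    if len(rel) != 1:
        raise ValueError("expected exactly one '-rel' field: %r" % line)
    predicate = rel[0][: -len("-rel")]

    # Every other trailing field is 'pointer-LABEL'; split on the first hyphen.
    arguments = []
    for piece in pieces[6:]:
        if piece.endswith("-rel"):
            continue
        pointer, _, label = piece.partition("-")
        arguments.append((pointer, label))

    return AnnotationLine(fileid, int(sentnum), int(wordnum), tagger,
                          roleset, inflection, predicate, arguments)


# Illustrative sample line (assumed format).
sample = ("wsj/00/wsj_0001.mrg 0 8 gold join.01 vf--a "
          "0:2-ARG0 7:0-ARGM-MOD 8:0-rel 9:1-ARG1 11:1-ARGM-PRD 15:1-ARGM-TMP")
inst = parse_annotation_line(sample)
print(inst.roleset, inst.predicate, inst.arguments)
```

The pointers are left as strings here; resolving them against parse trees is what the parse_corpus argument is for.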

lines(self)

Returns: a corpus view that acts as a list of strings, one for each line in the predicate-argument annotation file.

roleset(self, roleset_id)

Returns: the XML description for the given roleset.
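
A roleset lookup of this kind can be sketched with the standard library alone. The example below searches a tiny, invented frameset document for a roleset by its id. The element and attribute names (frameset, predicate, roleset, roles, role, id, n, descr) are assumptions about the frameset file layout, the role descriptions are made up for illustration, and find_roleset is a hypothetical helper, not the reader's own method.

```python
import xml.etree.ElementTree as ET

# Invented miniature frameset; real frameset files are larger and
# also carry examples for each roleset.
FRAMESET_XML = """
<frameset>
  <predicate lemma="turn">
    <roleset id="turn.01" name="illustrative sense A">
      <roles>
        <role n="0" descr="illustrative agent"/>
        <role n="1" descr="illustrative theme"/>
      </roles>
    </roleset>
  </predicate>
  <predicate lemma="turn_on">
    <roleset id="turn.02" name="illustrative sense B">
      <roles>
        <role n="0" descr="illustrative agent"/>
      </roles>
    </roleset>
  </predicate>
</frameset>
"""


def find_roleset(frameset_root, roleset_id):
    """Return the <roleset> element whose id matches, or raise ValueError."""
    # Rolesets sit one level below predicates in the assumed layout.
    for roleset in frameset_root.findall("predicate/roleset"):
        if roleset.get("id") == roleset_id:
            return roleset
    raise ValueError("roleset %r not found" % roleset_id)


root = ET.fromstring(FRAMESET_XML.strip())
found = find_roleset(root, "turn.02")
role_numbers = [role.get("n") for role in found.findall("roles/role")]
print(found.get("name"), role_numbers)
```

Returning the element itself, rather than a string, lets the caller walk the role descriptions with the usual ElementTree calls.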

verbs(self)

Returns: a corpus view that acts as a list of all verb lemmas in this corpus (from the verbs.txt file).
