Code for extracting relational triples from the ieer and conll2002 corpora.

Relations are stored internally as dictionaries ('reldicts').

The two serialization outputs are rtuple and clause.

An rtuple is a tuple of the form (subj, filler, obj), where subj and obj are pairs of Named Entity mentions, and filler is the string of words occurring between subj and obj (with no intervening NEs). Strings are printed via repr() to circumvent locale variations in rendering utf-8 encoded strings.

A clause is an atom of the form relsym(subjsym, objsym), where the relation, subject and object have been canonicalized to single strings.
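As a rough illustration of the two output forms, the sketch below serializes one hand-built relation both ways. The dictionary keys, the helper names `to_rtuple` and `to_clause`, the bracketed layout, and the sample values are all assumptions made for this example; they are not taken from the module itself.

```python
# Hypothetical reldict: key names and values are illustrative assumptions.
reldict = {
    "subjclass": "ORGANIZATION",
    "subjtext": "WHYY",
    "filler": "in",
    "objclass": "LOCATION",
    "objtext": "Philadelphia",
}


def to_rtuple(rel):
    """Render the relation as subj, filler, obj, passing each text through repr()."""
    return "[%s: %r] %r [%s: %r]" % (
        rel["subjclass"], rel["subjtext"], rel["filler"],
        rel["objclass"], rel["objtext"],
    )


def to_clause(rel, relsym):
    """Render the relation as relsym(subjsym, objsym) using single-string symbols."""
    def sym(text):
        # Assumed canonicalization: lowercase, whitespace replaced by underscores.
        return "_".join(text.lower().split())
    return "%s(%s, %s)" % (relsym, sym(rel["subjtext"]), sym(rel["objtext"]))


print(to_rtuple(reldict))
print(to_clause(reldict, "in"))
```

The rtuple form keeps the surface strings intact, while the clause form trades them for symbols that are easier to feed into a logic-based consumer.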
Expand an NE class name.
Returns: str

Abbreviate an NE class name.
Returns: str

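A minimal sketch of how expansion and abbreviation could be driven by a pair of lookup tables. The table contents (LOC, ORG, PER) and the function names `expand` and `abbreviate` are assumptions for illustration, since the values of short2long and long2short are not shown on this page.

```python
# Assumed mapping; the real table contents are not given in this documentation.
short2long = {"LOC": "LOCATION", "ORG": "ORGANIZATION", "PER": "PERSON"}
# Invert the table so both directions stay in sync.
long2short = {long_name: short for short, long_name in short2long.items()}


def expand(name):
    """Return the long form of an NE class name, or the name unchanged if unknown."""
    return short2long.get(name, name)


def abbreviate(name):
    """Return the short form of an NE class name, or the name unchanged if unknown."""
    return long2short.get(name, name)
```

Falling back to the input for unknown names is a design choice of this sketch; raising an error would be an equally reasonable alternative.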
Join a list into a string, turning tagged tuples into tag strings or just words.
Returns: str

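The joining behaviour can be sketched as follows. The function name `join_items`, the `untag` flag, and the "word/tag" rendering of tagged tuples are assumptions chosen for this example.

```python
def join_items(items, sep=" ", untag=False):
    """Join plain words and (word, tag) tuples into one string.

    A tagged tuple becomes "word/tag", or just "word" when untag is True.
    """
    parts = []
    for item in items:
        if isinstance(item, tuple):
            word, tag = item
            parts.append(word if untag else "%s/%s" % (word, tag))
        else:
            parts.append(item)
    return sep.join(parts)
```

For instance, `join_items([("based", "VBN"), "in"])` keeps the tag, whereas passing `untag=True` yields only the words.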
Translate one entity to its ISO Latin value. Inspired by an example from effbot.org.

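One common way to translate entities is a regular-expression substitution with a callback, sketched below for Python 3. The names `ENTITY` and `descape` are hypothetical, and the sketch returns a Unicode character rather than an ISO Latin byte value, which is an assumption about how the idea carries over to current Python.

```python
import re
from html.entities import name2codepoint

# Matches named entities such as &eacute; and captures the name.
ENTITY = re.compile(r"&(\w+?);")


def descape(match):
    """Replace one matched entity with its character; leave unknown names untouched."""
    name = match.group(1)
    if name in name2codepoint:
        return chr(name2codepoint[name])
    return match.group(0)


text = ENTITY.sub(descape, "caf&eacute; &unknown; &amp; bar")
print(text)
```

Leaving unknown entities in place avoids silently dropping text that merely looks like an entity.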
Convert a list of strings into a canonical symbol.
Returns: unicode

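A small sketch of what such canonicalization might look like. The specific rules used here (lowercasing, dropping periods, joining with underscores) and the name `to_symbol` are assumptions; the page does not spell out the exact normalization.

```python
def to_symbol(words):
    """Lowercase each word, drop periods, and join the results with underscores."""
    cleaned = [word.lower().replace(".", "") for word in words]
    # Skip words that became empty after cleaning, e.g. a bare ".".
    return "_".join(word for word in cleaned if word)
```

Under these assumed rules, a mention such as `["U.S.", "Navy"]` collapses to a single whitespace-free token that is safe to use as a predicate argument.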
Group a chunk structure into a list of pairs of the form (list(str), Tree). To facilitate the construction of (Tree, string, Tree) triples, this identifies pairs whose first member is a list (possibly empty) of terminal strings, and whose second member is a Tree of the form (NE_label, terminals).
Returns: list of tuple

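The grouping described above can be sketched without any external library. The `Tree` class below is a minimal stand-in written for this example, and `make_pairs` is a hypothetical name; in this sketch, words that follow the last entity are discarded because they have no entity to pair with.

```python
class Tree(list):
    """Minimal stand-in for a chunk tree: an NE label plus its terminal strings."""

    def __init__(self, label, children):
        super().__init__(children)
        self.label = label


def make_pairs(chunks):
    """Pair each NE subtree with the list of plain words that precede it."""
    pairs = []
    words = []
    for node in chunks:
        if isinstance(node, Tree):
            # Close the current pair; the word list may be empty.
            pairs.append((words, node))
            words = []
        else:
            words.append(node)
    return pairs


sentence = ["The", Tree("ORG", ["WHYY"]), "in", Tree("LOC", ["Philadelphia"]), "."]
pairs = make_pairs(sentence)
```

Two adjacent pairs then supply everything needed for a triple: the first pair's tree is the subject, and the second pair contributes the filler words and the object tree.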
Convert the pairs generated by mk_pairs into a 'reldict': a dictionary that stores information about the subject and object NEs plus the filler between them. Additionally, left and right contexts of length <= window are captured (within a given input sentence).
Returns: list of defaultdict

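A hedged sketch of building such dictionaries from a list of pairs. To stay self-contained, each entity is represented as a plain `(label, words)` tuple instead of a tree; the key names (`lcon`, `subjclass`, `filler`, and so on) and the function name `make_reldicts` are assumptions for illustration.

```python
from collections import defaultdict


def make_reldicts(pairs, window=5):
    """Build one dict per adjacent entity pair.

    Each input pair is (words_before_entity, (label, entity_words)).
    """
    reldicts = []
    for i in range(len(pairs) - 1):
        left_words, (subj_label, subj_words) = pairs[i]
        filler_words, (obj_label, obj_words) = pairs[i + 1]
        # Words after the object are only known when another pair follows.
        right_words = pairs[i + 2][0] if i + 2 < len(pairs) else []
        rel = defaultdict(str)
        # Guard against window == 0, where [-0:] would select the whole list.
        rel["lcon"] = " ".join(left_words[-window:] if window else [])
        rel["subjclass"] = subj_label
        rel["subjtext"] = " ".join(subj_words)
        rel["filler"] = " ".join(filler_words)
        rel["objclass"] = obj_label
        rel["objtext"] = " ".join(obj_words)
        rel["rcon"] = " ".join(right_words[:window])
        reldicts.append(rel)
    return reldicts
```

Using `defaultdict(str)` means a missing key reads as an empty string, so downstream formatting code does not need to check which fields were populated.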
Filter the output of mk_reldicts according to specified NE classes and a filler pattern. The parameters
Returns: list of defaultdict

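The filtering step can be sketched as a simple loop over candidate relations. The function name `filter_reldicts`, its parameter names, the dictionary keys, and the sample pattern are assumptions for this example rather than the module's documented signature.

```python
import re


def filter_reldicts(reldicts, subjclass, objclass, pattern=None):
    """Keep relations whose NE classes match and whose filler matches the pattern."""
    selected = []
    for rel in reldicts:
        if rel["subjclass"] != subjclass or rel["objclass"] != objclass:
            continue
        if pattern is not None and not pattern.match(rel["filler"]):
            continue
        selected.append(rel)
    return selected


# Hypothetical pattern: the filler must contain the word "in".
IN = re.compile(r".*\bin\b")
candidates = [
    {"subjclass": "ORG", "objclass": "LOC", "filler": "based in"},
    {"subjclass": "ORG", "objclass": "LOC", "filler": "visited"},
    {"subjclass": "PER", "objclass": "LOC", "filler": "lives in"},
]
kept = filter_reldicts(candidates, "ORG", "LOC", IN)
```

Only the first candidate survives: the second fails the filler pattern and the third has the wrong subject class.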
Pretty print the reldict as an rtuple.

Print the relation in clausal form.

NE_CLASSES
short2long
long2short

Generated by Epydoc 3.0beta1 on Wed Aug 27 15:08:50 2008 | http://epydoc.sourceforge.net |