nltk.sem package¶
Submodules¶
nltk.sem.boxer module¶
An interface to Boxer.
This interface relies on the latest development (Subversion) version of C&C and Boxer.
- Usage:
Set the environment variable CANDC to the bin directory of your CandC installation. The models directory should be in the CandC root directory. For example:
- /path/to/candc/
- bin/
- candc boxer
- models/
- boxer/
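The layout above can be checked before a Boxer object is constructed. The following is a minimal sketch, not part of NLTK: `find_candc_binaries` is a hypothetical helper, and it assumes that CANDC names the bin directory and that the binaries are called candc and boxer, as in the listing.

```python
import os

def find_candc_binaries(environ=None):
    """Return {name: path} for the binaries expected under $CANDC.

    Hypothetical helper, not an NLTK function. It assumes CANDC names the
    bin directory and that the binaries are called 'candc' and 'boxer'.
    """
    environ = os.environ if environ is None else environ
    bin_dir = environ.get("CANDC")
    if not bin_dir:
        raise LookupError("CANDC is not set")
    paths = {name: os.path.join(bin_dir, name) for name in ("candc", "boxer")}
    missing = [name for name, path in paths.items() if not os.path.isfile(path)]
    if missing:
        raise LookupError(
            "missing binaries in %s: %s" % (bin_dir, ", ".join(missing))
        )
    return paths
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real environment.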
-
class
nltk.sem.boxer.
Boxer
(boxer_drs_interpreter=None, elimeq=False, bin_dir=None, verbose=False, resolve=True)[source]¶ Bases:
object
This class is an interface to Johan Bos’s program Boxer, a wide-coverage semantic parser that produces Discourse Representation Structures (DRSs).
-
interpret
(input, discourse_id=None, question=False, verbose=False)[source]¶ Use Boxer to give a first order representation.
Parameters: - input – str Input sentence to parse
- occur_index – bool Should predicates be occurrence indexed?
- discourse_id – str An identifier to be inserted to each occurrence-indexed predicate.
Returns: drt.DrtExpression
-
interpret_multi
(input, discourse_id=None, question=False, verbose=False)[source]¶ Use Boxer to give a first order representation.
Parameters: - input – list of str Input sentences to parse as a single discourse
- occur_index – bool Should predicates be occurrence indexed?
- discourse_id – str An identifier to be inserted to each occurrence-indexed predicate.
Returns: drt.DrtExpression
-
interpret_multi_sents
(inputs, discourse_ids=None, question=False, verbose=False)[source]¶ Use Boxer to give a first order representation.
Parameters: - inputs – list of list of str Input discourses to parse
- occur_index – bool Should predicates be occurrence indexed?
- discourse_ids – list of str Identifiers to be inserted to each occurrence-indexed predicate.
Returns: drt.DrtExpression
-
interpret_sents
(inputs, discourse_ids=None, question=False, verbose=False)[source]¶ Use Boxer to give a first order representation.
Parameters: - inputs – list of str Input sentences to parse as individual discourses
- occur_index – bool Should predicates be occurrence indexed?
- discourse_ids – list of str Identifiers to be inserted to each occurrence-indexed predicate.
Returns: list of
drt.DrtExpression
-
-
class
nltk.sem.boxer.
BoxerCard
(discourse_id, sent_index, word_indices, var, value, type)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerDrs
(refs, conds, consequent=None)[source]¶ Bases:
nltk.sem.boxer.AbstractBoxerDrs
-
unicode_repr
()¶
-
-
class
nltk.sem.boxer.
BoxerDrsParser
(discourse_id=None)[source]¶ Bases:
nltk.sem.drt.DrtParser
Reparse the str form of subclasses of
AbstractBoxerDrs
-
class
nltk.sem.boxer.
BoxerEq
(discourse_id, sent_index, word_indices, var1, var2)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerIndexed
(discourse_id, sent_index, word_indices)[source]¶ Bases:
nltk.sem.boxer.AbstractBoxerDrs
-
unicode_repr
()¶
-
-
class
nltk.sem.boxer.
BoxerNamed
(discourse_id, sent_index, word_indices, var, name, type, sense)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerNot
(drs)[source]¶ Bases:
nltk.sem.boxer.AbstractBoxerDrs
-
unicode_repr
()¶
-
-
class
nltk.sem.boxer.
BoxerOr
(discourse_id, sent_index, word_indices, drs1, drs2)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerOutputDrsParser
(discourse_id=None)[source]¶ Bases:
nltk.sem.drt.DrtParser
-
class
nltk.sem.boxer.
BoxerPred
(discourse_id, sent_index, word_indices, var, name, pos, sense)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerProp
(discourse_id, sent_index, word_indices, var, drs)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerRel
(discourse_id, sent_index, word_indices, var1, var2, rel, sense)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
-
class
nltk.sem.boxer.
BoxerWhq
(discourse_id, sent_index, word_indices, ans_types, drs1, variable, drs2)[source]¶ Bases:
nltk.sem.boxer.BoxerIndexed
nltk.sem.chat80 module¶
Overview¶
Chat-80 was a natural language system which allowed the user to
interrogate a Prolog knowledge base in the domain of world
geography. It was developed in the early ‘80s by Warren and Pereira; see
http://www.aclweb.org/anthology/J82-3002.pdf
for a description and
http://www.cis.upenn.edu/~pereira/oldies.html
for the source
files.
This module contains functions to extract data from the Chat-80
relation files (‘the world database’), and convert them into a format
that can be incorporated in the FOL models of
nltk.sem.evaluate
. The code assumes that the Prolog
input files are available in the NLTK corpora directory.
The Chat-80 World Database consists of the following files:
world0.pl
rivers.pl
cities.pl
countries.pl
contain.pl
borders.pl
This module uses a slightly modified version of world0.pl
, in which
a set of Prolog rules have been omitted. The modified file is named
world1.pl
. Currently, the file rivers.pl
is not read in, since
it uses a list rather than a string in the second field.
Reading Chat-80 Files¶
Chat-80 relations are like tables in a relational database. The
relation acts as the name of the table; the first argument acts as the
‘primary key’; and subsequent arguments are further fields in the
table. In general, the name of the table provides a label for a unary
predicate whose extension is all the primary keys. For example,
relations in cities.pl
are of the following form:
'city(athens,greece,1368).'
Here, 'athens'
is the key, and will be mapped to a member of the
unary predicate city.
The fields in the table are mapped to binary predicates. The first
argument of the predicate is the primary key, while the second
argument is the data in the relevant field. Thus, in the above
example, the third field is mapped to the binary predicate
population_of, whose extension is a set of pairs such as
'(athens, 1368)'
.
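The mapping from records to unary and binary predicates can be sketched in plain Python. This is an illustration under stated assumptions, not the module's implementation: `parse_fact` and `extract` are hypothetical names, the regular expression only handles simple comma-separated facts, and the second fact in the sample data is illustrative.

```python
import re

# Matches a simple Prolog fact such as "city(athens,greece,1368)."
FACT = re.compile(r"^(\w+)\((.*)\)\.$")

def parse_fact(line):
    """Split one Prolog fact into (relation_name, [arguments])."""
    match = FACT.match(line.strip())
    if match is None:
        raise ValueError("not a simple Prolog fact: %r" % line)
    return match.group(1), [arg.strip() for arg in match.group(2).split(",")]

def extract(lines, schema):
    """Build a unary extension and one binary extension per extra field.

    schema[0] labels the unary predicate; each later field f becomes the
    binary predicate '<f>_of', pairing the primary key with that field.
    """
    unary = set()
    binary = {field + "_of": set() for field in schema[1:]}
    for line in lines:
        _, record = parse_fact(line)
        key = record[0]
        unary.add(key)
        for field, value in zip(schema[1:], record[1:]):
            binary[field + "_of"].add((key, value))
    return {schema[0]: unary, **binary}

# The second fact is illustrative sample data.
facts = ["city(athens,greece,1368).", "city(bangkok,thailand,1178)."]
concepts = extract(facts, ["city", "country", "population"])
print(concepts["population_of"])
```

All values stay strings here, so the population pair is `('athens', '1368')`.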
An exception to this general framework is required by the relations in
the files borders.pl
and contain.pl
. These contain facts of the
following form:
'borders(albania,greece).'
'contains0(africa,central_africa).'
We do not want to form a unary concept out of the element in
the first field of these records, and we want the label of the binary
relation just to be 'border'
/'contain'
respectively.
In order to drive the extraction process, we use ‘relation metadata bundles’ which are Python dictionaries such as the following:
city = {'label': 'city',
'closures': [],
'schema': ['city', 'country', 'population'],
'filename': 'cities.pl'}
According to this, the file city['filename']
contains a list of
relational tuples (or more accurately, the corresponding strings in
Prolog form) whose predicate symbol is city['label']
and whose
relational schema is city['schema']
. The notion of a closure
is
discussed in the next section.
Concepts¶
In order to encapsulate the results of the extraction, a class of
Concept
objects is introduced. A Concept
object has a number of
attributes, in particular a prefLabel
and extension
, which make
it easier to inspect the output of the extraction. In addition, the
extension
can be further processed: in the case of the 'border'
relation, we check that the relation is symmetric, and in the case
of the 'contain'
relation, we carry out the transitive
closure. The closure properties associated with a concept are
indicated in the relation metadata, as indicated earlier.
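The two closure properties mentioned above can be sketched on plain sets of pairs. This is a minimal illustration, not the code behind `Concept.close`: the function names are hypothetical, the sketch adds missing symmetric pairs rather than only checking for symmetry, and the ('central_africa', 'chad') pair is illustrative data.

```python
def symmetric_closure(pairs):
    """Add (y, x) for every (x, y) in the relation."""
    return set(pairs) | {(y, x) for (x, y) in pairs}

def transitive_closure(pairs):
    """Add (x, z) whenever (x, y) and (y, z) are present, until stable."""
    closed = set(pairs)
    while True:
        new = {(x, z) for (x, y) in closed for (y2, z) in closed if y == y2}
        if new <= closed:
            return closed
        closed |= new

border = symmetric_closure({("albania", "greece")})
contain = transitive_closure(
    {("africa", "central_africa"), ("central_africa", "chad")}
)
print(sorted(border))
print(sorted(contain))
```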
The extension
of a Concept
object is then incorporated into a
Valuation
object.
Persistence¶
The functions val_dump
and val_load
are provided to allow a
valuation to be stored in a persistent database and re-loaded, rather
than having to be re-computed each time.
Individuals and Lexical Items¶
As well as deriving relations from the Chat-80 data, we also create a
set of individual constants, one for each entity in the domain. The
individual constants are string-identical to the entities. For
example, given a data item such as 'zloty'
, we add to the valuation
a pair ('zloty', 'zloty')
. In order to parse English sentences that
refer to these entities, we also create a lexical item such as the
following for each individual constant:
PropN[num=sg, sem=<\P.(P zloty)>] -> 'Zloty'
The set of rules is written to the file chat_pnames.cfg
in the
current directory.
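Generating one lexical rule per individual constant can be sketched as follows. `make_lex_sketch` is a hypothetical stand-in for illustration only; it assumes that the proper name is simply the capitalised constant and it returns the rules rather than writing them to a file.

```python
def make_lex_sketch(symbols):
    """Return one PropN rule per individual constant, as strings."""
    template = r"PropN[num=sg, sem=<\P.(P %s)>] -> '%s'"
    return [template % (s, s.capitalize()) for s in sorted(symbols)]

rules = make_lex_sketch({"zloty"})
print(rules[0])  # PropN[num=sg, sem=<\P.(P zloty)>] -> 'Zloty'
```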
-
class
nltk.sem.chat80.
Concept
(prefLabel, arity, altLabels=[], closures=[], extension=set())[source]¶ Bases:
object
A Concept class, loosely based on SKOS (http://www.w3.org/TR/swbp-skos-core-guide/).
-
augment
(data)[source]¶ Add more data to the Concept’s extension set.
Parameters: data (string or pair of strings) – a new semantic value
Return type: set
-
close
()[source]¶ Close a binary relation in the Concept’s extension set.
Returns: a new extension for the Concept in which the relation is closed under a given property
-
unicode_repr
()¶
-
-
nltk.sem.chat80.
binary_concept
(label, closures, subj, obj, records)[source]¶ Make a binary concept out of the primary key and another field in a record.
A record is a list of entities in some relation, such as ['france', 'paris'], where 'france' is acting as the primary key, and 'paris' stands in the 'capital_of' relation to 'france'.
More generally, given a record such as ['a', 'b', 'c'], where label is bound to 'B', and obj is bound to 1, the derived binary concept will have label 'B_of', and its extension will be a set of pairs such as ('a', 'b').
Parameters: - label (str) – the base part of the preferred label for the concept
- closures (list) – closure properties for the extension of the concept
- subj (int) – position in the record of the subject of the predicate
- obj (int) – position in the record of the object of the predicate
- records (list of lists) – a list of records
Returns: Concept of arity 2
-
nltk.sem.chat80.
cities2table
(filename, rel_name, dbname, verbose=False, setup=False)[source]¶ Convert a file of Prolog clauses into a database table.
This is not generic, since it doesn’t allow arbitrary schemas to be set as a parameter.
Intended usage:
cities2table('cities.pl', 'city', 'city.db', verbose=True, setup=True)
Parameters:
-
nltk.sem.chat80.
clause2concepts
(filename, rel_name, schema, closures=[])[source]¶ Convert a file of Prolog clauses into a list of Concept objects.
Returns: a list of Concept objects
-
nltk.sem.chat80.
concepts
(items=('borders', 'circle_of_lat', 'circle_of_long', 'city', 'contains', 'continent', 'country', 'ocean', 'region', 'sea'))[source]¶ Build a list of concepts corresponding to the relation names in items.
Parameters: items (list(str)) – names of the Chat-80 relations to extract
Returns: the Concept objects which are extracted from the relations
Return type: list(Concept)
-
nltk.sem.chat80.
label_indivs
(valuation, lexicon=False)[source]¶ Assign individual constants to the individuals in the domain of a Valuation.
Given a valuation with an entry of the form {'rel': {'a': True}}, add a new entry {'a': 'a'}.
Return type: Valuation
-
nltk.sem.chat80.
make_lex
(symbols)[source]¶ Create lexical CFG rules for each individual symbol.
Given a valuation with an entry of the form {'zloty': 'zloty'}, create a lexical rule for the proper name ‘Zloty’.
Parameters: symbols (sequence -- set(str)) – a list of individual constants in the semantic representation
Return type: list(str)
-
nltk.sem.chat80.
make_valuation
(concepts, read=False, lexicon=False)[source]¶ Convert a list of Concept objects into a list of (label, extension) pairs; optionally create a Valuation object.
-
nltk.sem.chat80.
process_bundle
(rels)[source]¶ Given a list of relation metadata bundles, make a corresponding dictionary of concepts, indexed by the relation name.
Parameters: rels (list(dict)) – bundle of metadata needed for constructing a concept Returns: a dictionary of concepts, indexed by the relation name. Return type: dict(str): Concept
-
nltk.sem.chat80.
sql_query
(dbname, query)[source]¶ Execute an SQL query over a database.
Parameters: - dbname (str) – filename of persistent store
- query (str) – SQL query
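Loading city records into a table and querying it can be sketched with the standard sqlite3 module. This is an illustration under assumptions, not the module's own code: the table name `city_table`, the column names, and both helper functions are hypothetical, and the sketch passes an open connection rather than a database filename so that it can run in memory.

```python
import sqlite3

def records_to_table(records, dbname=":memory:"):
    """Create a three-column city table and return the open connection."""
    connection = sqlite3.connect(dbname)
    connection.execute(
        "CREATE TABLE city_table (City TEXT, Country TEXT, Population INT)"
    )
    connection.executemany("INSERT INTO city_table VALUES (?, ?, ?)", records)
    connection.commit()
    return connection

def run_query(connection, query):
    """Execute a query and return all result rows."""
    return connection.execute(query).fetchall()

conn = records_to_table([("athens", "greece", 1368)])
rows = run_query(conn, "SELECT City FROM city_table WHERE Population > 1000")
print(rows)  # [('athens',)]
```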
-
nltk.sem.chat80.
unary_concept
(label, subj, records)[source]¶ Make a unary concept out of the primary key in a record.
A record is a list of entities in some relation, such as ['france', 'paris'], where 'france' is acting as the primary key.
Parameters: - label (string) – the preferred label for the concept
- subj (int) – position in the record of the subject of the predicate
- records (list of lists) – a list of records
Returns: Concept of arity 1
-
nltk.sem.chat80.
val_dump
(rels, db)[source]¶ Make a Valuation from a list of relation metadata bundles and dump to persistent database.
Parameters: - rels (list of dict) – bundle of metadata needed for constructing a concept
- db (str) – name of file to which data is written. The suffix ‘.db’ will be automatically appended.
nltk.sem.cooper_storage module¶
-
class
nltk.sem.cooper_storage.
CooperStore
(featstruct)[source]¶ Bases:
object
A container for handling quantifier ambiguity via Cooper storage.
-
s_retrieve
(trace=False)[source]¶ Carry out S-Retrieval of binding operators in store. If hack=True, serialize the bindop and core as strings and reparse. Ugh.
Each permutation of the store (i.e. list of binding operators) is taken to be a possible scoping of quantifiers. We iterate through the binding operators in each permutation, and successively apply them to the current term, starting with the core semantic representation, working from the inside out.
Binding operators are of the form:
bo(\P.all x.(man(x) -> P(x)),z1)
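The permutation step described above can be sketched with strings standing in for logical terms. This is a simplified illustration, not the S-Retrieval code itself: `s_retrieve_sketch` is a hypothetical name, and each binding operator is modelled as a format template with a `{body}` slot instead of a lambda term that is applied and reduced.

```python
from itertools import permutations

def s_retrieve_sketch(core, store):
    """Yield one reading per ordering of the binding operators.

    Each operator is a (quantifier_template, variable) pair, where the
    template has a '{body}' slot for the term it takes scope over.
    """
    for ordering in permutations(store):
        term = core
        for template, _variable in ordering:
            # Work from the inside out: each operator wraps the term so far.
            term = template.format(body=term)
        yield term

store = [
    ("all z1.(man(z1) -> {body})", "z1"),
    ("exists z2.(dog(z2) & {body})", "z2"),
]
readings = list(s_retrieve_sketch("chase(z1,z2)", store))
for reading in readings:
    print(reading)
```

With two binding operators there are two orderings, so the sketch yields two scopings of the same core.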
-
nltk.sem.drt module¶
-
class
nltk.sem.drt.
DRS
(refs, conds, consequent=None)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.Expression
A Discourse Representation Structure.
-
replace
(variable, expression, replace_bound=False, alpha_convert=True)[source]¶ Replace all instances of variable v with expression E in self, where v is free in self.
-
unicode_repr
()¶
-
-
class
nltk.sem.drt.
DrsDrawer
(drs, size_canvas=True, canvas=None)[source]¶ Bases:
object
-
BUFFER
= 3¶
-
OUTERSPACE
= 6¶
-
TOPSPACE
= 10¶
-
-
class
nltk.sem.drt.
DrtAbstractVariableExpression
(variable)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.AbstractVariableExpression
-
class
nltk.sem.drt.
DrtApplicationExpression
(function, argument)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.ApplicationExpression
-
class
nltk.sem.drt.
DrtBinaryExpression
(first, second)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.BinaryExpression
-
class
nltk.sem.drt.
DrtBooleanExpression
(first, second)[source]¶ Bases:
nltk.sem.drt.DrtBinaryExpression
,nltk.sem.logic.BooleanExpression
-
class
nltk.sem.drt.
DrtConcatenation
(first, second, consequent=None)[source]¶ Bases:
nltk.sem.drt.DrtBooleanExpression
DRS of the form ‘(DRS + DRS)’
-
replace
(variable, expression, replace_bound=False, alpha_convert=True)[source]¶ Replace all instances of variable v with expression E in self, where v is free in self.
-
unicode_repr
()¶
-
-
class
nltk.sem.drt.
DrtConstantExpression
(variable)[source]¶ Bases:
nltk.sem.drt.DrtAbstractVariableExpression
,nltk.sem.logic.ConstantExpression
-
class
nltk.sem.drt.
DrtEqualityExpression
(first, second)[source]¶ Bases:
nltk.sem.drt.DrtBinaryExpression
,nltk.sem.logic.EqualityExpression
-
class
nltk.sem.drt.
DrtEventVariableExpression
(variable)[source]¶ Bases:
nltk.sem.drt.DrtIndividualVariableExpression
,nltk.sem.logic.EventVariableExpression
-
class
nltk.sem.drt.
DrtExpression
[source]¶ Bases:
object
This is the base abstract DRT Expression from which every DRT Expression extends.
-
equiv
(other, prover=None)[source]¶ Check for logical equivalence. Pass the expression (self <-> other) to the theorem prover. If the prover says it is valid, then the self and other are equal.
Parameters: - other – a DrtExpression to check equality against
- prover – a nltk.inference.api.Prover
-
get_refs
(recursive=False)[source]¶ Return the set of discourse referents in this DRS.
Parameters: recursive (bool) – Also find discourse referents in subterms?
Returns: list of Variable objects
-
type
¶
-
-
class
nltk.sem.drt.
DrtFunctionVariableExpression
(variable)[source]¶ Bases:
nltk.sem.drt.DrtAbstractVariableExpression
,nltk.sem.logic.FunctionVariableExpression
-
class
nltk.sem.drt.
DrtIndividualVariableExpression
(variable)[source]¶ Bases:
nltk.sem.drt.DrtAbstractVariableExpression
,nltk.sem.logic.IndividualVariableExpression
-
class
nltk.sem.drt.
DrtLambdaExpression
(variable, term)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.LambdaExpression
-
class
nltk.sem.drt.
DrtNegatedExpression
(term)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.NegatedExpression
-
class
nltk.sem.drt.
DrtOrExpression
(first, second)[source]¶ Bases:
nltk.sem.drt.DrtBooleanExpression
,nltk.sem.logic.OrExpression
-
class
nltk.sem.drt.
DrtParser
[source]¶ Bases:
nltk.sem.logic.LogicParser
A lambda calculus expression parser.
-
get_BooleanExpression_factory
(tok)[source]¶ This method serves as a hook for other logic parsers that have different boolean operators
-
handle
(tok, context)[source]¶ This method is intended to be overridden for logics that use different operators or expressions
-
-
class
nltk.sem.drt.
DrtProposition
(variable, drs)[source]¶ Bases:
nltk.sem.drt.DrtExpression
,nltk.sem.logic.Expression
-
unicode_repr
()¶
-
-
class
nltk.sem.drt.
DrtTokens
[source]¶ Bases:
nltk.sem.logic.Tokens
-
CLOSE_BRACKET
= ']'¶
-
COLON
= ':'¶
-
DRS
= 'DRS'¶
-
DRS_CONC
= '+'¶
-
OPEN_BRACKET
= '['¶
-
PRONOUN
= 'PRO'¶
-
PUNCT
= ['+', '[', ']', ':']¶
-
SYMBOLS
= ['&', '^', '|', '->', '=>', '<->', '<=>', '=', '==', '!=', '\\', '.', '(', ')', ',', '-', '!', '+', '[', ']', ':']¶
-
TOKENS
= ['and', '&', '^', 'or', '|', 'implies', '->', '=>', 'iff', '<->', '<=>', '=', '==', '!=', 'some', 'exists', 'exist', 'all', 'forall', '\\', '.', '(', ')', ',', 'not', '-', '!', 'DRS', '+', '[', ']', ':']¶
-
-
nltk.sem.drt.
DrtVariableExpression
(variable)[source]¶ This is a factory method that instantiates and returns a subtype of
DrtAbstractVariableExpression
appropriate for the given variable.
-
class
nltk.sem.drt.
PossibleAntecedents
[source]¶ Bases:
list
,nltk.sem.drt.DrtExpression
,nltk.sem.logic.Expression
-
replace
(variable, expression, replace_bound=False, alpha_convert=True)[source]¶ Replace all instances of variable v with expression E in self, where v is free in self.
-
unicode_repr
¶ Return repr(self).
-
nltk.sem.drt_glue_demo module¶
-
class
nltk.sem.drt_glue_demo.
DrtGlueDemo
(examples)[source]¶ Bases:
object
nltk.sem.evaluate module¶
This module provides data structures for representing first-order models.
-
class
nltk.sem.evaluate.
Assignment
(domain, assign=None)[source]¶ Bases:
dict
A dictionary which represents an assignment of values to variables.
An assignment can only assign values from its domain.
If an unknown expression a is passed to a model M’s interpretation function i, i will first check whether M’s valuation assigns an interpretation to a as a constant, and if this fails, i will delegate the interpretation of a to g. g only assigns values to individual variables (i.e., members of the class IndividualVariableExpression in the logic module). If a variable is not assigned a value by g, it will raise an Undefined exception.
A variable Assignment is a mapping from individual variables to entities in the domain. Individual variables are usually indicated with the letters 'x', 'y', 'w' and 'z', optionally followed by an integer (e.g., 'x0', 'y332'). Assignments are created using the Assignment constructor, which also takes the domain as a parameter.
>>> from nltk.sem.evaluate import Assignment
>>> dom = set(['u1', 'u2', 'u3', 'u4'])
>>> g3 = Assignment(dom, [('x', 'u1'), ('y', 'u2')])
>>> g3 == {'x': 'u1', 'y': 'u2'}
True
There is also a print format for assignments which uses a notation closer to that in logic textbooks:
>>> print(g3)
g[u1/x][u2/y]
It is also possible to update an assignment using the add method:
>>> dom = set(['u1', 'u2', 'u3', 'u4'])
>>> g4 = Assignment(dom)
>>> g4.add('x', 'u1')
{'x': 'u1'}
With no arguments, purge() is equivalent to clear() on a dictionary:
>>> g4.purge()
>>> g4
{}
-
purge
(var=None)[source]¶ Remove one or all keys (i.e. logic variables) from an assignment, and update self.variant.
Parameters: var – a Variable acting as a key for the assignment.
-
unicode_repr
¶ Return repr(self).
-
-
class
nltk.sem.evaluate.
Model
(domain, valuation)[source]¶ Bases:
object
A first order model is a domain D of discourse and a valuation V.
A domain D is a set, and a valuation V is a map that associates expressions with values in the model. The domain of V should be a subset of D.
Construct a new Model.
-
evaluate
(expr, g, trace=None)[source]¶ Read input expressions, and provide a handler for satisfy that blocks further propagation of the Undefined error.
Parameters: - expr – an Expression of logic.
- g (Assignment) – an assignment to individual variables.
Return type: bool or ‘Undefined’
-
i
(parsed, g, trace=False)[source]¶ An interpretation function.
Assuming that parsed is atomic:
- if parsed is a non-logical constant, calls the valuation V
- else if parsed is an individual variable, calls assignment g
- else returns Undefined.
Parameters: - parsed – an Expression of logic.
- g (Assignment) – an assignment to individual variables.
Returns: a semantic value
-
satisfiers
(parsed, varex, g, trace=None, nesting=0)[source]¶ Generate the entities from the model’s domain that satisfy an open formula.
Parameters: - parsed (Expression) – an open formula
- varex (VariableExpression or str) – the relevant free individual variable in
parsed
. - g (Assignment) – a variable assignment
Returns: a set of the entities that satisfy
parsed
.
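Finding the satisfiers of an open formula amounts to trying each entity in the domain as the value of the free variable. The following is a minimal sketch of that idea, not the Model implementation: `satisfiers_sketch` is a hypothetical name, the formula is a Python callable over an assignment dictionary instead of a parsed logic expression, and the valuation data is illustrative.

```python
def satisfiers_sketch(formula, var, domain, g):
    """Return the entities u in domain for which formula holds under g[u/var].

    formula is a Python callable taking an assignment dict; in NLTK it would
    be a parsed logic expression evaluated against the model.
    """
    result = set()
    for entity in domain:
        extended = dict(g)      # copy so the caller's assignment is untouched
        extended[var] = entity  # bind the free variable to this candidate
        if formula(extended):
            result.add(entity)
    return result

domain = {"b1", "b2", "g1"}
valuation = {"boy": {("b1",), ("b2",)}, "run": {("b1",), ("g1",)}}

# The open formula "boy(x) & run(x)", written directly against the valuation.
def boy_and_run(g):
    return (g["x"],) in valuation["boy"] and (g["x"],) in valuation["run"]

print(satisfiers_sketch(boy_and_run, "x", domain, {}))  # {'b1'}
```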
-
satisfy
(parsed, g, trace=None)[source]¶ Recursive interpretation function for a formula of first-order logic.
Raises an Undefined error when parsed is an atomic string but is not a symbol or an individual variable.
Parameters: - parsed – An expression of logic.
- g (Assignment) – an assignment to individual variables.
Returns: a truth value or Undefined if parsed is complex, and calls the interpretation function i if parsed is atomic.
-
unicode_repr
()¶
-
-
exception
nltk.sem.evaluate.
Undefined
[source]¶ Bases:
nltk.sem.evaluate.Error
-
class
nltk.sem.evaluate.
Valuation
(xs)[source]¶ Bases:
dict
A dictionary which represents a model-theoretic Valuation of non-logical constants. Keys are strings representing the constants to be interpreted, and values correspond to individuals (represented as strings) and n-ary relations (represented as sets of tuples of strings).
An instance of Valuation will raise a KeyError exception (i.e., just behave like a standard dictionary) if indexed with an expression that is not in its list of symbols.
-
domain
¶ Set-theoretic domain of the value-space of a Valuation.
-
symbols
¶ The non-logical constants which the Valuation recognizes.
-
unicode_repr
¶ Return repr(self).
-
-
nltk.sem.evaluate.
arity
(rel)[source]¶ Check the arity of a relation.
Parameters: rel (set of tuples)
Return type: int
-
nltk.sem.evaluate.
demo
(num=0, trace=None)[source]¶ Run existing demos.
- num = 1: propositional logic demo
- num = 2: first order model demo (only if trace is set)
- num = 3: first order sentences demo
- num = 4: satisfaction of open formulas demo
- any other value: run all the demos
Parameters: trace – trace = 1, or trace = 2 for more verbose tracing
-
nltk.sem.evaluate.
foldemo
(trace=None)[source]¶ Interpretation of closed expressions in a first-order model.
-
nltk.sem.evaluate.
is_rel
(s)[source]¶ Check whether a set represents a relation (of any arity).
Parameters: s (set) – a set containing tuples of str elements Return type: bool
-
nltk.sem.evaluate.
read_valuation
(s, encoding=None)[source]¶ Convert a valuation string into a valuation.
Returns: a nltk.sem valuation
-
nltk.sem.evaluate.
satdemo
(trace=None)[source]¶ Satisfiers of an open formula in a first order model.
-
nltk.sem.evaluate.
set2rel
(s)[source]¶ Convert a set containing individuals (strings or numbers) into a set of unary tuples. Any tuples of strings already in the set are passed through unchanged.
- For example:
- set(['a', 'b']) => set([('a',), ('b',)])
- set([3, 27]) => set([('3',), ('27',)])
Return type: set of tuple of str
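The conversion described above can be sketched in a few lines. `set2rel_sketch` is a hypothetical re-implementation for illustration, written only from the behaviour stated in this entry.

```python
def set2rel_sketch(s):
    """Wrap each individual in a 1-tuple of str; keep existing tuples."""
    rel = set()
    for element in s:
        if isinstance(element, tuple):
            rel.add(element)            # already a tuple: pass through
        else:
            rel.add((str(element),))    # individual: make a unary tuple
    return rel

print(sorted(set2rel_sketch({"a", "b"})))  # [('a',), ('b',)]
print(sorted(set2rel_sketch({3, 27})))     # [('27',), ('3',)]
```

Numbers are converted with `str`, so the resulting relation contains only tuples of strings.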
nltk.sem.glue module¶
-
class
nltk.sem.glue.
DrtGlue
(semtype_file=None, remove_duplicates=False, depparser=None, verbose=False)[source]¶ Bases:
nltk.sem.glue.Glue
-
class
nltk.sem.glue.
DrtGlueDict
(filename, encoding=None)[source]¶ Bases:
nltk.sem.glue.GlueDict
-
class
nltk.sem.glue.
DrtGlueFormula
(meaning, glue, indices=None)[source]¶ Bases:
nltk.sem.glue.GlueFormula
-
class
nltk.sem.glue.
Glue
(semtype_file=None, remove_duplicates=False, depparser=None, verbose=False)[source]¶ Bases:
object
-
dep_parse
(sentence)[source]¶ Return a dependency graph for the sentence.
Parameters: sentence (list(str)) – the sentence to be parsed Return type: DependencyGraph
-
-
class
nltk.sem.glue.
GlueDict
(filename, encoding=None)[source]¶ Bases:
dict
-
get_label
(node)[source]¶ Pick an alphabetic character as identifier for an entity in the model.
Parameters: value (int) – where to index into the list of characters
-
get_meaning_formula
(generic, word)[source]¶
Parameters: - generic – A meaning formula string containing the parameter “<word>”
- word – The actual word to replace “<word>”
-
get_semtypes
(node)[source]¶ Based on the node, return a list of plausible semtypes in order of plausibility.
-
lookup_unique
(rel, node, depgraph)[source]¶ Lookup ‘key’. There should be exactly one item in the associated relation.
-
unicode_repr
¶ Return repr(self).
-
nltk.sem.hole module¶
An implementation of the Hole Semantics model, following Blackburn and Bos, Representation and Inference for Natural Language (CSLI, 2005).
The semantic representations are built by the grammar hole.fcfg. This module contains driver code to read in sentences and parse them according to a hole semantics grammar.
After parsing, the semantic representation is in the form of an underspecified representation that is not easy to read. We use a “plugging” algorithm to convert that representation into first-order logic formulas.
-
class
nltk.sem.hole.
Constants
[source]¶ Bases:
object
-
ALL
= 'ALL'¶
-
AND
= 'AND'¶
-
EXISTS
= 'EXISTS'¶
-
HOLE
= 'HOLE'¶
-
IFF
= 'IFF'¶
-
IMP
= 'IMP'¶
-
LABEL
= 'LABEL'¶
-
LEQ
= 'LEQ'¶
-
MAP
= {'ALL': <function Constants.<lambda>>, 'OR': <class 'nltk.sem.logic.OrExpression'>, 'PRED': <class 'nltk.sem.logic.ApplicationExpression'>, 'NOT': <class 'nltk.sem.logic.NegatedExpression'>, 'AND': <class 'nltk.sem.logic.AndExpression'>, 'IFF': <class 'nltk.sem.logic.IffExpression'>, 'EXISTS': <function Constants.<lambda>>, 'IMP': <class 'nltk.sem.logic.ImpExpression'>}¶
-
NOT
= 'NOT'¶
-
OR
= 'OR'¶
-
PRED
= 'PRED'¶
-
-
class
nltk.sem.hole.
Constraint
(lhs, rhs)[source]¶ Bases:
object
This class represents a constraint of the form (L =< N), where L is a label and N is a node (a label or a hole).
-
unicode_repr
()¶
-
-
class
nltk.sem.hole.
HoleSemantics
(usr)[source]¶ Bases:
object
This class holds the broken-down components of a hole semantics, i.e. it extracts the holes, labels, logic formula fragments and constraints out of a big conjunction such as the one produced by the hole semantics grammar. It then provides some operations on the semantics dealing with holes, labels and finding legal ways to plug holes with labels.
nltk.sem.lfg module¶
nltk.sem.linearlogic module¶
-
class
nltk.sem.linearlogic.
ApplicationExpression
(function, argument, argument_indices=None)[source]¶ Bases:
nltk.sem.linearlogic.Expression
-
simplify
(bindings=None)[source]¶ Since function is an implication, return its consequent. There should be no need to check that the application is valid since the checking is done by the constructor.
Parameters: bindings – BindingDict A dictionary of bindings used to simplify
Returns: Expression
-
unicode_repr
()¶
-
-
class
nltk.sem.linearlogic.
AtomicExpression
(name, dependencies=None)[source]¶ Bases:
nltk.sem.linearlogic.Expression
-
compile_neg
(index_counter, glueFormulaFactory)[source]¶ From Iddo Lev’s PhD Dissertation p108-109
Parameters: - index_counter – Counter for unique indices
- glueFormulaFactory – GlueFormula for creating new glue formulas
Returns: (Expression, set) for the compiled linear logic and any newly created glue formulas
-
compile_pos
(index_counter, glueFormulaFactory)[source]¶ From Iddo Lev’s PhD Dissertation p108-109
Parameters: - index_counter – Counter for unique indices
- glueFormulaFactory – GlueFormula for creating new glue formulas
Returns: (Expression, set) for the compiled linear logic and any newly created glue formulas
-
simplify
(bindings=None)[source]¶ If ‘self’ is bound by ‘bindings’, return the atomic to which it is bound. Otherwise, return self.
Parameters: bindings – BindingDict A dictionary of bindings used to simplify
Returns: AtomicExpression
-
unicode_repr
()¶
-
-
class
nltk.sem.linearlogic.
ConstantExpression
(name, dependencies=None)[source]¶ Bases:
nltk.sem.linearlogic.AtomicExpression
-
unify
(other, bindings)[source]¶ If ‘other’ is a constant, then it must be equal to ‘self’. If ‘other’ is a variable, then it must not be bound to anything other than ‘self’.
Parameters: - other – Expression
- bindings – BindingDict A dictionary of all current bindings
Returns: BindingDict A new combined dictionary of ‘bindings’ and any new binding
Raises: UnificationException – If ‘self’ and ‘other’ cannot be unified in the context of ‘bindings’
-
-
class
nltk.sem.linearlogic.
ImpExpression
(antecedent, consequent)[source]¶ Bases:
nltk.sem.linearlogic.Expression
-
compile_neg
(index_counter, glueFormulaFactory)[source]¶ From Iddo Lev’s PhD Dissertation p108-109
Parameters: - index_counter – Counter for unique indices
- glueFormulaFactory – GlueFormula for creating new glue formulas
Returns: (Expression, list of GlueFormula) for the compiled linear logic and any newly created glue formulas
-
compile_pos
(index_counter, glueFormulaFactory)[source]¶ From Iddo Lev’s PhD Dissertation p108-109
Parameters: - index_counter – Counter for unique indices
- glueFormulaFactory – GlueFormula for creating new glue formulas
Returns: (Expression, set) for the compiled linear logic and any newly created glue formulas
-
unicode_repr
()¶
-
unify
(other, bindings)[source]¶ Both the antecedent and consequent of ‘self’ and ‘other’ must unify.
Parameters: - other – ImpExpression
- bindings – BindingDict A dictionary of all current bindings
Returns: BindingDict A new combined dictionary of ‘bindings’ and any new bindings
Raises: UnificationException – If ‘self’ and ‘other’ cannot be unified in the context of ‘bindings’
-
-
class
nltk.sem.linearlogic.
LinearLogicParser
[source]¶ Bases:
nltk.sem.logic.LogicParser
A linear logic expression parser.
-
class
nltk.sem.linearlogic.
Tokens
[source]¶ Bases:
object
-
CLOSE
= ')'¶
-
IMP
= '-o'¶
-
OPEN
= '('¶
-
PUNCT
= ['(', ')']¶
-
TOKENS
= ['(', ')', '-o']¶
-
-
class
nltk.sem.linearlogic.
VariableExpression
(name, dependencies=None)[source]¶ Bases:
nltk.sem.linearlogic.AtomicExpression
-
unify
(other, bindings)[source]¶ ‘self’ must not be bound to anything other than ‘other’.
Parameters: - other – Expression
- bindings – BindingDict A dictionary of all current bindings
Returns: BindingDict A new combined dictionary of ‘bindings’ and the new binding
Raises: UnificationException – If ‘self’ and ‘other’ cannot be unified in the context of ‘bindings’
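The unification rules for constants and variables in this module can be sketched over plain strings. This is a simplified illustration, not the module's code: atoms are strings, bindings are an ordinary dict, `UnificationError` and `is_variable` are hypothetical names, treating capitalised names as variables is an assumption of the sketch, and chains of variable-to-variable bindings are followed only one step.

```python
class UnificationError(Exception):
    """Raised when two atoms cannot be unified under the given bindings."""

def is_variable(name):
    # Assumption for this sketch: variables are capitalised, constants are not.
    return name[:1].isupper()

def unify(a, b, bindings):
    """Return a new bindings dict extending `bindings` so that a and b agree."""
    a = bindings.get(a, a)  # follow an existing binding, if any
    b = bindings.get(b, b)
    if a == b:
        return dict(bindings)
    if is_variable(a):
        return {**bindings, a: b}
    if is_variable(b):
        return {**bindings, b: a}
    raise UnificationError("cannot unify %s with %s" % (a, b))

b1 = unify("G", "f", {})   # binds variable G to constant f
b2 = unify("G", "f", b1)   # consistent with the existing binding
print(b1, b2)
```

Calling `unify("G", "h", b1)` raises `UnificationError`, because G is already bound to a different constant.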
-
nltk.sem.logic module¶
A version of first order predicate logic, built on top of the typed lambda calculus.
-
class
nltk.sem.logic.
AbstractVariableExpression
(variable)[source]¶ Bases:
nltk.sem.logic.Expression
This class represents a variable to be used as a predicate or entity
-
replace
(variable, expression, replace_bound=False, alpha_convert=True)[source]¶ See: Expression.replace()
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
AndExpression
(first, second)[source]¶ Bases:
nltk.sem.logic.BooleanExpression
This class represents conjunctions
-
class
nltk.sem.logic.
AnyType
[source]¶ Bases:
nltk.sem.logic.BasicType
,nltk.sem.logic.ComplexType
-
first
¶
-
second
¶
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
ApplicationExpression
(function, argument)[source]¶ Bases:
nltk.sem.logic.Expression
This class is used to represent two related types of logical expressions.
The first is a Predicate Expression, such as “P(x,y)”. A predicate expression consists of a FunctionVariableExpression or ConstantExpression as the predicate and a list of Expressions as the arguments.
The second is an application of one expression to another, such as “(\x.dog(x))(fido)”.
The reason Predicate Expressions are treated as Application Expressions is that the Variable Expression predicate of the expression may be replaced with another Expression, such as a LambdaExpression, which would mean that the Predicate should be thought of as being applied to the arguments.
The logical expression reader will always curry arguments in an application expression. So, “\x y.see(x,y)(john,mary)” will be represented internally as “((\x y.(see(x))(y))(john))(mary)”. This simplifies the internals since there will always be exactly one argument in an application.
The str() method will usually print the curried forms of application expressions. The one exception is when the application expression is really a predicate expression (i.e., the underlying function is an AbstractVariableExpression). This means that the example from above will be returned as “(\x y.see(x,y)(john))(mary)”.
-
args
¶ Return uncurried arg-list
-
is_atom
()[source]¶ Is this expression an atom (as opposed to a lambda expression applied to a term)?
-
pred
¶ Return uncurried base-function. If this is an atom, then the result will be a variable expression. Otherwise, it will be a lambda expression.
-
type
¶
-
unicode_repr
()¶
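The currying described for ApplicationExpression, and the uncurried views exposed by pred and args, can be sketched without NLTK. Here nested ("app", function, argument) tuples stand in for application expressions; the helper names curry and uncurry are hypothetical and this is not NLTK's implementation.

```python
def curry(function, args):
    """Build nested one-argument applications: f(a, b) -> ((f a) b)."""
    expr = function
    for arg in args:
        expr = ("app", expr, arg)
    return expr


def uncurry(expr):
    """Return (base_function, [args]) for a curried application."""
    args = []
    while isinstance(expr, tuple) and expr[0] == "app":
        _, expr, arg = expr
        args.append(arg)  # arguments are met outermost-first
    args.reverse()
    return expr, args


curried = curry("see", ["john", "mary"])
print(curried)           # ('app', ('app', 'see', 'john'), 'mary')
print(uncurry(curried))  # ('see', ['john', 'mary'])
```

Each application node holds exactly one argument, which is the simplification the description above refers to; uncurry recovers the base function and the flat argument list.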
-
-
class
nltk.sem.logic.
BasicType
[source]¶ Bases:
nltk.sem.logic.Type
-
class
nltk.sem.logic.
BinaryExpression
(first, second)[source]¶ Bases:
nltk.sem.logic.Expression
-
type
¶
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
ComplexType
(first, second)[source]¶ Bases:
nltk.sem.logic.Type
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
ConstantExpression
(variable)[source]¶ Bases:
nltk.sem.logic.AbstractVariableExpression
This class represents variables that do not take the form of a single character followed by zero or more digits.
-
type
= e¶
-
-
class
nltk.sem.logic.
EntityType
[source]¶ Bases:
nltk.sem.logic.BasicType
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
EqualityExpression
(first, second)[source]¶ Bases:
nltk.sem.logic.BinaryExpression
This class represents equality expressions like “(x = y)”.
-
class
nltk.sem.logic.
EventType
[source]¶ Bases:
nltk.sem.logic.BasicType
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
EventVariableExpression
(variable)[source]¶ Bases:
nltk.sem.logic.IndividualVariableExpression
This class represents variables that take the form of a single lowercase ‘e’ character followed by zero or more digits.
-
type
= v¶
-
-
class
nltk.sem.logic.
Expression
[source]¶ Bases:
nltk.sem.logic.SubstituteBindingsI
This is the base abstract object for all logical expressions
-
constants
()[source]¶ Return a set of individual constants (non-predicates).
Returns: set of Variable objects
-
equiv
(other, prover=None)[source]¶ Check for logical equivalence. Pass the expression (self <-> other) to the theorem prover. If the prover says it is valid, then self and other are logically equivalent.
Parameters: - other – an Expression to check equality against
- prover – a nltk.inference.api.Prover
-
findtype
(variable)[source]¶ Find the type of the given variable as it is used in this expression. For example, finding the type of “P” in “P(x) & Q(x,y)” yields “<e,t>”
Parameters: variable – Variable
-
free
()[source]¶ Return a set of all the free (non-bound) variables. This includes both individual and predicate variables, but not constants.
Returns: set of Variable objects
-
predicates
()[source]¶ Return a set of predicates (constants, not variables).
Returns: set of Variable objects
-
replace
(variable, expression, replace_bound=False, alpha_convert=True)[source]¶ Replace every instance of ‘variable’ with ‘expression’.
Parameters: - variable – Variable The variable to replace
- expression – Expression The expression with which to replace it
- replace_bound – bool Should bound variables be replaced?
- alpha_convert – bool Alpha convert automatically to avoid name clashes?
-
typecheck
(signature=None)[source]¶ Infer and check types. Raise exceptions if necessary.
Parameters: signature – dict that maps variable names to types (or string representations of types) Returns: the signature, plus any additional type mappings
-
unicode_repr
()¶
-
variables
()[source]¶ Return a set of all the variables for binding substitution. The variables returned include all free (non-bound) individual variables and any variable starting with ‘?’ or ‘@’.
Returns: set of Variable objects
-
visit
(function, combinator)[source]¶ Recursively visit subexpressions. Apply ‘function’ to each subexpression and pass the result of each function application to the ‘combinator’ for aggregation:
return combinator(map(function, self.subexpressions))
The function is not applied to bound variables, and they are not passed to the combinator.
Parameters: - function – Function<Expression,T> to call on each subexpression
- combinator – Function<list<T>,R> to combine the results of the function calls
Returns: result of combination R
-
visit_structured
(function, combinator)[source]¶ Recursively visit subexpressions. Apply ‘function’ to each subexpression and pass the result of each function application to the ‘combinator’ for aggregation. The combinator must have the same signature as the constructor. The function is not applied to bound variables, but they are passed to the combinator.
Parameters: - function – Function to call on each subexpression
- combinator – Function with the same signature as the constructor, to combine the results of the function calls
Returns: result of combination
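The visit(function, combinator) pattern described above can be illustrated with a small standalone sketch. The classes Var and And and the helper names are hypothetical stand-ins, not NLTK classes; the point is only to show how a recursive traversal is written as a function applied to each subexpression plus a combinator that aggregates the results.

```python
class Var:
    def __init__(self, name):
        self.name = name

    def visit(self, function, combinator):
        # A leaf has no subexpressions, so the combinator sees an empty list.
        return combinator([])


class And:
    def __init__(self, first, second):
        self.first, self.second = first, second

    def visit(self, function, combinator):
        # Apply the function to each subexpression, then aggregate.
        return combinator([function(self.first), function(self.second)])


def names(expr):
    """Collect all variable names by recursing through visit()."""
    if isinstance(expr, Var):
        return {expr.name}
    return expr.visit(names, lambda parts: set().union(*parts))


expr = And(Var("x"), And(Var("y"), Var("x")))
print(sorted(names(expr)))  # ['x', 'y']
```

Passing the traversal function itself as ‘function’ is what makes the visit recursive; the combinator decides how the per-subexpression results are merged (here, set union).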
-
-
class
nltk.sem.logic.
FunctionVariableExpression
(variable)[source]¶ Bases:
nltk.sem.logic.AbstractVariableExpression
This class represents variables that take the form of a single uppercase character followed by zero or more digits.
-
type
= ?¶
-
-
class
nltk.sem.logic.
IffExpression
(first, second)[source]¶ Bases:
nltk.sem.logic.BooleanExpression
This class represents biconditionals
-
exception
nltk.sem.logic.
IllegalTypeException
(expression, other_type, allowed_type)[source]¶ Bases:
nltk.sem.logic.TypeException
-
class
nltk.sem.logic.
ImpExpression
(first, second)[source]¶ Bases:
nltk.sem.logic.BooleanExpression
This class represents implications
-
exception
nltk.sem.logic.
InconsistentTypeHierarchyException
(variable, expression=None)[source]¶ Bases:
nltk.sem.logic.TypeException
-
class
nltk.sem.logic.
IndividualVariableExpression
(variable)[source]¶ Bases:
nltk.sem.logic.AbstractVariableExpression
This class represents variables that take the form of a single lowercase character (other than ‘e’) followed by zero or more digits.
-
type
¶
-
-
class
nltk.sem.logic.
LambdaExpression
(variable, term)[source]¶ Bases:
nltk.sem.logic.VariableBinderExpression
-
type
¶
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
LogicParser
(type_check=False)[source]¶ Bases:
object
A lambda calculus expression parser.
-
attempt_ApplicationExpression
(expression, context)[source]¶ Attempt to make an application expression. If the next tokens are a list of arguments in parentheses, then the argument expression is a function being applied to the arguments. Otherwise, return the argument expression.
-
attempt_BooleanExpression
(expression, context)[source]¶ Attempt to make a boolean expression. If the next token is a boolean operator, then a BooleanExpression will be returned. Otherwise, the parameter will be returned.
-
attempt_EqualityExpression
(expression, context)[source]¶ Attempt to make an equality expression. If the next token is an equality operator, then an EqualityExpression will be returned. Otherwise, the parameter will be returned.
-
get_BooleanExpression_factory
(tok)[source]¶ This method serves as a hook for other logic parsers that have different boolean operators
-
get_QuantifiedExpression_factory
(tok)[source]¶ This method serves as a hook for other logic parsers that have different quantifiers
-
handle
(tok, context)[source]¶ This method is intended to be overridden for logics that use different operators or expressions
-
make_EqualityExpression
(first, second)[source]¶ This method serves as a hook for other logic parsers that have different equality expression classes
-
parse
(data, signature=None)[source]¶ Parse the expression.
Parameters: - data – str for the input to be parsed
- signature – dict<str, str> that maps variable names to type strings
Returns: a parsed Expression
-
process_next_expression
(context)[source]¶ Parse the next complete expression from the stream and return it.
-
token
(location=None)[source]¶ Get the next waiting token. If a location is given, then return the token at currentIndex+location without advancing currentIndex; setting it gives lookahead/lookback capability.
-
type_check
= None¶ A list of tuples of quote characters. The 4-tuple is comprised of the start character, the end character, the escape character, and a boolean indicating whether the quotes should be included in the result. Quotes are used to signify that a token should be treated as atomic, ignoring any special characters within the token. The escape character allows the quote end character to be used within the quote. If True, the boolean indicates that the final token should contain the quote and escape characters. This method exists to be overridden
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
NegatedExpression
(term)[source]¶ Bases:
nltk.sem.logic.Expression
-
type
¶
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
OrExpression
(first, second)[source]¶ Bases:
nltk.sem.logic.BooleanExpression
This class represents disjunctions
-
class
nltk.sem.logic.
QuantifiedExpression
(variable, term)[source]¶ Bases:
nltk.sem.logic.VariableBinderExpression
-
type
¶
-
unicode_repr
()¶
-
-
class
nltk.sem.logic.
SubstituteBindingsI
[source]¶ Bases:
object
An interface for classes that can perform substitutions for variables.
-
class
nltk.sem.logic.
Tokens
[source]¶ Bases:
object
-
ALL
= 'all'¶
-
ALL_LIST
= ['all', 'forall']¶
-
AND
= '&'¶
-
AND_LIST
= ['and', '&', '^']¶
-
BINOPS
= ['and', '&', '^', 'or', '|', 'implies', '->', '=>', 'iff', '<->', '<=>']¶
-
CLOSE
= ')'¶
-
COMMA
= ','¶
-
DOT
= '.'¶
-
EQ
= '='¶
-
EQ_LIST
= ['=', '==']¶
-
EXISTS
= 'exists'¶
-
EXISTS_LIST
= ['some', 'exists', 'exist']¶
-
IFF
= '<->'¶
-
IFF_LIST
= ['iff', '<->', '<=>']¶
-
IMP
= '->'¶
-
IMP_LIST
= ['implies', '->', '=>']¶
-
LAMBDA
= '\\'¶
-
LAMBDA_LIST
= ['\\']¶
-
NEQ
= '!='¶
-
NEQ_LIST
= ['!=']¶
-
NOT
= '-'¶
-
NOT_LIST
= ['not', '-', '!']¶
-
OPEN
= '('¶
-
OR
= '|'¶
-
OR_LIST
= ['or', '|']¶
-
PUNCT
= ['.', '(', ')', ',']¶
-
QUANTS
= ['some', 'exists', 'exist', 'all', 'forall']¶
-
SYMBOLS
= ['&', '^', '|', '->', '=>', '<->', '<=>', '=', '==', '!=', '\\', '.', '(', ')', ',', '-', '!']¶
-
TOKENS
= ['and', '&', '^', 'or', '|', 'implies', '->', '=>', 'iff', '<->', '<=>', '=', '==', '!=', 'some', 'exists', 'exist', 'all', 'forall', '\\', '.', '(', ')', ',', 'not', '-', '!']¶
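To show how a symbol table like SYMBOLS can drive tokenization, here is a minimal standalone sketch. It is not NLTK's tokenizer: the function tokenize is hypothetical, and the longest-match-first strategy is an assumption chosen so that ‘<->’ is not split into ‘<’, ‘-’ and ‘>’. Alphabetic operators such as ‘all’ or ‘and’ simply come out as word tokens.

```python
SYMBOLS = ['&', '^', '|', '->', '=>', '<->', '<=>', '=', '==', '!=',
           '\\', '.', '(', ')', ',', '-', '!']


def tokenize(text):
    """Split a formula into symbol tokens and word tokens."""
    by_length = sorted(SYMBOLS, key=len, reverse=True)  # try '<->' before '-'
    tokens, word, i = [], "", 0
    while i < len(text):
        symbol = next((s for s in by_length if text.startswith(s, i)), None)
        if symbol is not None or text[i].isspace():
            if word:
                tokens.append(word)  # a symbol or space ends the current word
                word = ""
            if symbol is not None:
                tokens.append(symbol)
                i += len(symbol)
            else:
                i += 1
        else:
            word += text[i]
            i += 1
    if word:
        tokens.append(word)
    return tokens


print(tokenize("all x.(dog(x) -> bark(x))"))
```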
-
-
class
nltk.sem.logic.
TruthValueType
[source]¶ Bases:
nltk.sem.logic.BasicType
-
unicode_repr
()¶
-
-
exception
nltk.sem.logic.
TypeResolutionException
(expression, other_type)[source]¶ Bases:
nltk.sem.logic.TypeException
-
exception
nltk.sem.logic.
UnexpectedTokenException
(index, unexpected=None, expected=None, message=None)[source]¶
-
class
nltk.sem.logic.
VariableBinderExpression
(variable, term)[source]¶ Bases:
nltk.sem.logic.Expression
This is an abstract class for any Expression that binds a variable in an Expression. This includes LambdaExpressions and QuantifiedExpressions.
-
alpha_convert
(newvar)[source]¶ Rename all occurrences of the variable introduced by this variable binder in the expression to newvar.
Parameters: newvar – Variable, for the new variable
-
-
nltk.sem.logic.
VariableExpression
(variable)[source]¶ This is a factory method that instantiates and returns a subtype of
AbstractVariableExpression
appropriate for the given variable.
-
nltk.sem.logic.
is_eventvar
(expr)[source]¶ An event variable must be a single lowercase ‘e’ character followed by zero or more digits.
Parameters: expr – str Returns: bool True if expr is of the correct form
-
nltk.sem.logic.
is_funcvar
(expr)[source]¶ A function variable must be a single uppercase character followed by zero or more digits.
Parameters: expr – str Returns: bool True if expr is of the correct form
-
nltk.sem.logic.
is_indvar
(expr)[source]¶ An individual variable must be a single lowercase character other than ‘e’, followed by zero or more digits.
Parameters: expr – str Returns: bool True if expr is of the correct form
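The naming rules stated for is_eventvar, is_funcvar and is_indvar, together with the ConstantExpression rule (anything that is not a single character followed by digits), can be summarized in one standalone sketch. The function classify_name and its regular expressions are hypothetical; they mirror the rules as documented here and are not taken from the NLTK source.

```python
import re


def classify_name(name):
    """Classify a name according to the documented variable-naming rules."""
    if re.fullmatch(r"e\d*", name):
        return "event variable"       # lowercase 'e' + optional digits
    if re.fullmatch(r"[a-df-z]\d*", name):
        return "individual variable"  # other lowercase letter + optional digits
    if re.fullmatch(r"[A-Z]\d*", name):
        return "function variable"    # uppercase letter + optional digits
    return "constant"                 # everything else


for name in ["e1", "x", "P2", "john"]:
    print(name, "->", classify_name(name))
```

This is the kind of dispatch the VariableExpression factory description implies: the name alone determines which AbstractVariableExpression subtype is appropriate.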
-
nltk.sem.logic.
read_logic
(s, logic_parser=None, encoding=None)[source]¶ Convert a file of First Order Formulas into a list of {Expression}s.
Parameters: - s (str) – the contents of the file
- logic_parser (LogicParser) – The parser to be used to parse the logical expression
- encoding (str) – the encoding of the input string, if it is binary
Returns: a list of parsed formulas.
Return type:
-
nltk.sem.logic.
skolem_function
(univ_scope=None)[source]¶ Return a skolem function over the variables in univ_scope.
nltk.sem.relextract module¶
Code for extracting relational triples from the ieer and conll2002 corpora.
Relations are stored internally as dictionaries (‘reldicts’).
The two serialization outputs are “rtuple” and “clause”.
- An rtuple is a tuple of the form (subj, filler, obj), where subj and obj are pairs of Named Entity mentions, and filler is the string of words occurring between subj and obj (with no intervening NEs). Strings are printed via repr() to circumvent locale variations in rendering utf-8 encoded strings.
- A clause is an atom of the form relsym(subjsym, objsym), where the relation, subject and object have been canonicalized to single strings.
-
nltk.sem.relextract.
class_abbrev
(type)[source]¶ Abbreviate an NE class name.
Parameters: type (str) – the NE class name
Return type: str
-
nltk.sem.relextract.
clause
(reldict, relsym)[source]¶ Print the relation in clausal form.
Parameters: - reldict (defaultdict) – a relation dictionary
- relsym (str) – a label for the relation
-
nltk.sem.relextract.
conllned
(trace=1)[source]¶ Find the copula+’van’ relation (‘of’) in the Dutch tagged training corpus from CoNLL 2002.
-
nltk.sem.relextract.
descape_entity
(m, defs={'bull': '•', 'spades': '♠', 'xi': 'ξ', 'AElig': 'Æ', 'ge': '≥', 'rlm': '\u200f', 'beta': 'β', 'Eta': 'Η', 'Eacute': 'É', 'Rho': 'Ρ', 'Aring': 'Å', 'Nu': 'Ν', 'Xi': 'Ξ', 'ndash': '–', 'lsquo': '‘', 'ang': '∠', 'Oacute': 'Ó', 'atilde': 'ã', 'larr': '←', 'part': '∂', 'zwj': '\u200d', 'prop': '∝', 'ograve': 'ò', 'sdot': '⋅', 'aelig': 'æ', 'egrave': 'è', 'Pi': 'Π', 'Iacute': 'Í', 'diams': '♦', 'delta': 'δ', 'ccedil': 'ç', 'gt': '>', 'iuml': 'ï', 'darr': '↓', 'sup3': '³', 'sigmaf': 'ς', 'Uuml': 'Ü', 'Ntilde': 'Ñ', 'permil': '‰', 'Ugrave': 'Ù', 'bdquo': '„', 'cedil': '¸', 'Acirc': 'Â', 'iquest': '¿', 'image': 'ℑ', 'OElig': 'Œ', 'rfloor': '⌋', 'iexcl': '¡', 'and': '∧', 'hellip': '…', 'uml': '¨', 'ni': '∋', 'plusmn': '±', 'nabla': '∇', 'amp': '&', 'ne': '≠', 'minus': '−', 'lang': '〈', 'rdquo': '”', 'Omicron': 'Ο', 'Aacute': 'Á', 'shy': '\xad', 'ETH': 'Ð', 'otimes': '⊗', 'scaron': 'š', 'there4': '∴', 'pound': '£', 'Ouml': 'Ö', 'rsaquo': '›', 'raquo': '»', 'lArr': '⇐', 'lowast': '∗', 'ldquo': '“', 'Prime': '″', 'Theta': 'Θ', 'lsaquo': '‹', 'yacute': 'ý', 'Yuml': 'Ÿ', 'Ecirc': 'Ê', 'Lambda': 'Λ', 'Gamma': 'Γ', 'mdash': '—', 'Oslash': 'Ø', 'Igrave': 'Ì', 'fnof': 'ƒ', 'uuml': 'ü', 'Scaron': 'Š', 'supe': '⊇', 'Yacute': 'Ý', 'laquo': '«', 'micro': 'µ', 'epsilon': 'ε', 'rceil': '⌉', 'circ': 'ˆ', 'icirc': 'î', 'exist': '∃', 'ocirc': 'ô', 'Upsilon': 'Υ', 'prod': '∏', 'lfloor': '⌊', 'uarr': '↑', 'ntilde': 'ñ', 'oelig': 'œ', 'Auml': 'Ä', 'acute': '´', 'hearts': '♥', 'euro': '€', 'piv': 'ϖ', 'iacute': 'í', 'infin': '∞', 'cong': '≅', 'asymp': '≈', 'lt': '<', 'int': '∫', 'times': '×', 'nsub': '⊄', 'Icirc': 'Î', 'cap': '∩', 'sup': '⊃', 'prime': '′', 'Uacute': 'Ú', 'Epsilon': 'Ε', 'weierp': '℘', 'Phi': 'Φ', 'Ograve': 'Ò', 'kappa': 'κ', 'Tau': 'Τ', 'pi': 'π', 'szlig': 'ß', 'tau': 'τ', 'mu': 'μ', 'ecirc': 'ê', 'agrave': 'à', 'eacute': 'é', 'quot': '"', 'le': '≤', 'nbsp': '\xa0', 'forall': '∀', 'Chi': 'Χ', 'yuml': 'ÿ', 'emsp': '\u2003', 'perp': '⊥', 'Kappa': 'Κ', 'lrm': '\u200e', 
'cup': '∪', 'upsilon': 'υ', 'dArr': '⇓', 'Dagger': '‡', 'chi': 'χ', 'Ccedil': 'Ç', 'rho': 'ρ', 'igrave': 'ì', 'auml': 'ä', 'phi': 'φ', 'deg': '°', 'Mu': 'Μ', 'reg': '®', 'THORN': 'Þ', 'frasl': '⁄', 'Iota': 'Ι', 'sum': '∑', 'frac12': '½', 'zwnj': '\u200c', 'zeta': 'ζ', 'oplus': '⊕', 'ensp': '\u2002', 'rang': '〉', 'hArr': '⇔', 'sigma': 'σ', 'Sigma': 'Σ', 'Otilde': 'Õ', 'Atilde': 'Ã', 'para': '¶', 'trade': '™', 'rarr': '→', 'frac14': '¼', 'sbquo': '‚', 'Alpha': 'Α', 'sim': '∼', 'not': '¬', 'eth': 'ð', 'ordf': 'ª', 'ordm': 'º', 'sup2': '²', 'rArr': '⇒', 'Agrave': 'À', 'aring': 'å', 'macr': '¯', 'empty': '∅', 'oline': '‾', 'sect': '§', 'lceil': '⌈', 'aacute': 'á', 'acirc': 'â', 'tilde': '˜', 'rsquo': '’', 'sub': '⊂', 'Delta': 'Δ', 'cent': '¢', 'divide': '÷', 'middot': '·', 'ucirc': 'û', 'equiv': '≡', 'upsih': 'ϒ', 'ouml': 'ö', 'or': '∨', 'yen': '¥', 'crarr': '↵', 'nu': 'ν', 'euml': 'ë', 'psi': 'ψ', 'omicron': 'ο', 'Psi': 'Ψ', 'real': 'ℜ', 'dagger': '†', 'copy': '©', 'omega': 'ω', 'gamma': 'γ', 'oslash': 'ø', 'oacute': 'ó', 'sube': '⊆', 'alpha': 'α', 'Egrave': 'È', 'thetasym': 'ϑ', 'ugrave': 'ù', 'Zeta': 'Ζ', 'thinsp': '\u2009', 'Iuml': 'Ï', 'Beta': 'Β', 'uacute': 'ú', 'eta': 'η', 'curren': '¤', 'frac34': '¾', 'Ocirc': 'Ô', 'brvbar': '¦', 'Omega': 'Ω', 'clubs': '♣', 'loz': '◊', 'theta': 'θ', 'Ucirc': 'Û', 'alefsym': 'ℵ', 'sup1': '¹', 'thorn': 'þ', 'radic': '√', 'iota': 'ι', 'uArr': '⇑', 'harr': '↔', 'isin': '∈', 'Euml': 'Ë', 'otilde': 'õ', 'lambda': 'λ', 'notin': '∉'})[source]¶ Translate one entity to its ISO Latin value. Inspired by example from effbot.org
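The way a function like descape_entity is typically used, as a replacement callback for a regular-expression substitution, can be sketched with the standard library alone. This sketch uses html.entities.name2codepoint in place of the defs table shown in the signature; the names ENT and descape_entity_sketch are hypothetical, and leaving unknown entities untouched is an assumption of the sketch.

```python
import re
from html.entities import name2codepoint

# Matches named entities such as "&amp;" and captures the name.
ENT = re.compile(r"&(\w+?);")


def descape_entity_sketch(m, defs=name2codepoint):
    """Translate one matched entity to its character, if the name is known."""
    try:
        return chr(defs[m.group(1)])
    except KeyError:
        return m.group(0)  # unknown entity: leave the text unchanged


print(ENT.sub(descape_entity_sketch, "AT&amp;T &bogus;"))  # AT&T &bogus;
```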
-
nltk.sem.relextract.
extract_rels
(subjclass, objclass, doc, corpus='ace', pattern=None, window=10)[source]¶ Filter the output of semi_rel2reldict according to specified NE classes and a filler pattern.
The parameters subjclass and objclass can be used to restrict the Named Entities to particular types (any of ‘LOCATION’, ‘ORGANIZATION’, ‘PERSON’, ‘DURATION’, ‘DATE’, ‘CARDINAL’, ‘PERCENT’, ‘MONEY’, ‘MEASURE’).
Parameters: - subjclass (str) – the class of the subject Named Entity.
- objclass (str) – the class of the object Named Entity.
- doc (ieer document or a list of chunk trees) – input document
- corpus (str) – name of the corpus to take as input; possible values are ‘ieer’ and ‘conll2002’
- pattern (SRE_Pattern) – a regular expression for filtering the fillers of retrieved triples.
- window (int) – filters out fillers which exceed this threshold
Returns: see mk_reldicts
Return type: list(defaultdict)
-
nltk.sem.relextract.
in_demo
(trace=0, sql=True)[source]¶ Select pairs of organizations and locations whose mentions occur with an intervening occurrence of the preposition “in”.
If the sql parameter is set to True, then the entity pairs are loaded into an in-memory database, and subsequently pulled out using an SQL “SELECT” query.
-
nltk.sem.relextract.
list2sym
(lst)[source]¶ Convert a list of strings into a canonical symbol.
Parameters: lst (list) – the list of strings to convert
Returns: a Unicode string without whitespace
Return type: unicode
-
nltk.sem.relextract.
rtuple
(reldict, lcon=False, rcon=False)[source]¶ Pretty print the reldict as an rtuple.
Parameters: reldict (defaultdict) – a relation dictionary
-
nltk.sem.relextract.
semi_rel2reldict
(pairs, window=5, trace=False)[source]¶ Converts the pairs generated by tree2semi_rel into a ‘reldict’: a dictionary which stores information about the subject and object NEs plus the filler between them. Additionally, a left and right context of length <= window are captured (within a given input sentence).
Parameters: - pairs – a pair of list(str) and Tree, as generated by
- window (int) – a threshold for the number of items to include in the left and right context
Returns: ‘relation’ dictionaries whose keys are ‘lcon’, ‘subjclass’, ‘subjtext’, ‘subjsym’, ‘filler’, ‘objclass’, ‘objtext’, ‘objsym’ and ‘rcon’
Return type: list(defaultdict)
-
nltk.sem.relextract.
tree2semi_rel
(tree)[source]¶ Group a chunk structure into a list of ‘semi-relations’ of the form (list(str), Tree).
In order to facilitate the construction of (Tree, string, Tree) triples, this identifies pairs whose first member is a list (possibly empty) of terminal strings, and whose second member is a Tree of the form (NE_label, terminals).
Parameters: tree – a chunk tree
Returns: a list of pairs (list(str), Tree)
Return type: list of tuple
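The grouping into semi-relations, and how adjacent pairs yield (NE, filler, NE) triples, can be sketched without NLTK. In this sketch a chunk structure is a flat list in which plain strings are ordinary tokens and (label, words) tuples stand in for Named Entity subtrees; the function semi_relations and that data layout are hypothetical. Words after the last entity are dropped in this sketch.

```python
def semi_relations(chunks):
    """Group a flat chunk sequence into (words_before, named_entity) pairs."""
    pairs, words = [], []
    for item in chunks:
        if isinstance(item, tuple):   # a named-entity chunk: (label, words)
            pairs.append((words, item))
            words = []                # start collecting for the next entity
        else:
            words.append(item)
    return pairs


chunks = [("ORG", ["WHYY"]), "in", ("LOC", ["Philadelphia"]), "said"]
pairs = semi_relations(chunks)

# Adjacent pairs give (subject NE, filler words, object NE) triples.
triples = [(subj, " ".join(filler), obj)
           for (_, subj), (filler, obj) in zip(pairs, pairs[1:])]
print(triples)
```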
nltk.sem.skolemize module¶
nltk.sem.util module¶
Utility functions for batch-processing sentences: parsing and extraction of the semantic representation of the root node of the syntax tree, followed by evaluation of the semantic representation in a first-order model.
-
nltk.sem.util.
demo_legacy_grammar
()[source]¶ Check that interpret_sents() is compatible with legacy grammars that use a lowercase ‘sem’ feature.
Define ‘test.fcfg’ to be the following
-
nltk.sem.util.
evaluate_sents
(inputs, grammar, model, assignment, trace=0)[source]¶ Add the truth-in-a-model value to each semantic representation for each syntactic parse of each input sentence.
Parameters: Returns: a mapping from sentences to lists of triples (parse-tree, semantic-representations, evaluation-in-model)
Return type: list(list(tuple(nltk.tree.Tree, nltk.sem.logic.ConstantExpression, bool or dict(str): bool)))
-
nltk.sem.util.
interpret_sents
(inputs, grammar, semkey='SEM', trace=0)[source]¶ Add the semantic representation to each syntactic parse tree of each input sentence.
Parameters: Returns: a mapping from sentences to lists of pairs (parse-tree, semantic-representations)
Return type: list(list(tuple(nltk.tree.Tree, nltk.sem.logic.ConstantExpression)))
-
nltk.sem.util.
parse_sents
(inputs, grammar, trace=0)[source]¶ Convert input sentences into syntactic trees.
Parameters: Return type: Returns: a mapping from input sentences to a list of ``Tree``s
-
nltk.sem.util.
root_semrep
(syntree, semkey='SEM')[source]¶ Find the semantic representation at the root of a tree.
Parameters: - syntree – a parse Tree
- semkey – the feature label to use for the root semantics in the tree
Returns: the semantic representation at the root of a Tree
Return type: sem.Expression
Module contents¶
NLTK Semantic Interpretation Package
This package contains classes for representing semantic structure in formulas of first-order logic and for evaluating such formulas in set-theoretic models.
>>> from nltk.sem import logic
>>> logic._counter._value = 0
The package has two main components:
- logic provides support for analyzing expressions of First Order Logic (FOL).
- evaluate allows users to recursively determine truth in a model for formulas of FOL.
A model consists of a domain of discourse and a valuation function, which assigns values to non-logical constants. We assume that entities in the domain are represented as strings such as 'b1', 'g1', etc. A Valuation is initialized with a list of (symbol, value) pairs, where values are entities, sets of entities or sets of tuples of entities.
The domain of discourse can be inferred from the valuation, and a model is then created with domain and valuation as parameters.
>>> from nltk.sem import Valuation, Model
>>> v = [('adam', 'b1'), ('betty', 'g1'), ('fido', 'd1'),
... ('girl', set(['g1', 'g2'])), ('boy', set(['b1', 'b2'])),
... ('dog', set(['d1'])),
... ('love', set([('b1', 'g1'), ('b2', 'g2'), ('g1', 'b1'), ('g2', 'b1')]))]
>>> val = Valuation(v)
>>> dom = val.domain
>>> m = Model(dom, val)
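What it means for the domain to be inferred from the valuation can be shown with a small standalone sketch that does not use NLTK. The function infer_domain is hypothetical; it simply collects every entity that appears as a value, inside a set, or inside a tuple, using the same (symbol, value) pairs as above.

```python
def infer_domain(pairs):
    """Collect every entity mentioned in a list of (symbol, value) pairs."""
    domain = set()
    for _symbol, value in pairs:
        if isinstance(value, str):        # an individual entity
            domain.add(value)
            continue
        for member in value:              # a set of entities or of tuples
            if isinstance(member, str):
                domain.add(member)
            else:
                domain.update(member)
    return domain


v = [('adam', 'b1'), ('betty', 'g1'), ('fido', 'd1'),
     ('girl', {'g1', 'g2'}), ('boy', {'b1', 'b2'}),
     ('dog', {'d1'}),
     ('love', {('b1', 'g1'), ('b2', 'g2'), ('g1', 'b1'), ('g2', 'b1')})]

print(sorted(infer_domain(v)))  # ['b1', 'b2', 'd1', 'g1', 'g2']
```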