nltk.sem package

Submodules

nltk.sem.boxer module

An interface to Boxer.

This interface relies on the latest development (subversion) version of C&C and Boxer.

Usage:

Set the environment variable CANDC to the bin directory of your CandC installation. The models directory should be in the CandC root directory. For example:

/path/to/candc/
    bin/
        candc
        boxer
    models/
        boxer/
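A minimal usage sketch, assuming a local C&C/Boxer installation laid out as above (the path and the input sentence are placeholders):

import os
from nltk.sem.boxer import Boxer

# Point CANDC at the bin/ directory of your own installation (placeholder path).
os.environ['CANDC'] = '/path/to/candc/bin'

boxer = Boxer()
drs = boxer.interpret('John walks.')   # a drt.DrtExpression
print(drs)
print(drs.fol())                       # its first-order translation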
class nltk.sem.boxer.AbstractBoxerDrs[source]

Bases: object

atoms()[source]
clean()[source]
renumber_sentences(f)[source]
variable_types()[source]
variables()[source]
Returns:(set<variables>, set<events>, set<propositions>)
class nltk.sem.boxer.Boxer(boxer_drs_interpreter=None, elimeq=False, bin_dir=None, verbose=False, resolve=True)[source]

Bases: object

This class is an interface to Johan Bos’s program Boxer, a wide-coverage semantic parser that produces Discourse Representation Structures (DRSs).

interpret(input, discourse_id=None, question=False, verbose=False)[source]

Use Boxer to give a first order representation.

Parameters:
  • input – str Input sentence to parse
  • occur_index – bool Should predicates be occurrence indexed?
  • discourse_id – str An identifier to be inserted to each occurrence-indexed predicate.
Returns:

drt.DrtExpression

interpret_multi(input, discourse_id=None, question=False, verbose=False)[source]

Use Boxer to give a first order representation.

Parameters:
  • input – list of str Input sentences to parse as a single discourse
  • occur_index – bool Should predicates be occurrence indexed?
  • discourse_id – str An identifier to be inserted to each occurrence-indexed predicate.
Returns:

drt.DrtExpression

interpret_multi_sents(inputs, discourse_ids=None, question=False, verbose=False)[source]

Use Boxer to give a first order representation.

Parameters:
  • inputs – list of list of str Input discourses to parse
  • occur_index – bool Should predicates be occurrence indexed?
  • discourse_ids – list of str Identifiers to be inserted to each occurrence-indexed predicate.
Returns:

drt.DrtExpression

interpret_sents(inputs, discourse_ids=None, question=False, verbose=False)[source]

Use Boxer to give a first order representation.

Parameters:
  • inputs – list of str Input sentences to parse as individual discourses
  • occur_index – bool Should predicates be occurrence indexed?
  • discourse_ids – list of str Identifiers to be inserted to each occurrence-indexed predicate.
Returns:

list of drt.DrtExpression

set_bin_dir(bin_dir, verbose=False)[source]
class nltk.sem.boxer.BoxerCard(discourse_id, sent_index, word_indices, var, value, type)[source]

Bases: nltk.sem.boxer.BoxerIndexed

renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerDrs(refs, conds, consequent=None)[source]

Bases: nltk.sem.boxer.AbstractBoxerDrs

atoms()[source]
clean()[source]
renumber_sentences(f)[source]
unicode_repr()
class nltk.sem.boxer.BoxerDrsParser(discourse_id=None)[source]

Bases: nltk.sem.drt.DrtParser

Reparse the str form of subclasses of AbstractBoxerDrs

attempt_adjuncts(expression, context)[source]
get_all_symbols()[source]
get_next_token_variable(description)[source]
handle(tok, context)[source]
nullableIntToken()[source]
class nltk.sem.boxer.BoxerEq(discourse_id, sent_index, word_indices, var1, var2)[source]

Bases: nltk.sem.boxer.BoxerIndexed

atoms()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerIndexed(discourse_id, sent_index, word_indices)[source]

Bases: nltk.sem.boxer.AbstractBoxerDrs

atoms()[source]
unicode_repr()
class nltk.sem.boxer.BoxerNamed(discourse_id, sent_index, word_indices, var, name, type, sense)[source]

Bases: nltk.sem.boxer.BoxerIndexed

change_var(var)[source]
clean()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerNot(drs)[source]

Bases: nltk.sem.boxer.AbstractBoxerDrs

atoms()[source]
clean()[source]
renumber_sentences(f)[source]
unicode_repr()
class nltk.sem.boxer.BoxerOr(discourse_id, sent_index, word_indices, drs1, drs2)[source]

Bases: nltk.sem.boxer.BoxerIndexed

atoms()[source]
clean()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerOutputDrsParser(discourse_id=None)[source]

Bases: nltk.sem.drt.DrtParser

attempt_adjuncts(expression, context)[source]
get_all_symbols()[source]
handle(tok, context)[source]
handle_condition(tok, indices)[source]

Handle a DRS condition

Parameters:indices – list of int
Returns:list of DrtExpression
handle_drs(tok)[source]
parse(data, signature=None)[source]
parse_condition(indices)[source]

Parse a DRS condition

Returns:list of DrtExpression
parse_drs()[source]
parse_index()[source]
parse_variable()[source]
class nltk.sem.boxer.BoxerPred(discourse_id, sent_index, word_indices, var, name, pos, sense)[source]

Bases: nltk.sem.boxer.BoxerIndexed

change_var(var)[source]
clean()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerProp(discourse_id, sent_index, word_indices, var, drs)[source]

Bases: nltk.sem.boxer.BoxerIndexed

atoms()[source]
clean()[source]
referenced_labels()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerRel(discourse_id, sent_index, word_indices, var1, var2, rel, sense)[source]

Bases: nltk.sem.boxer.BoxerIndexed

clean()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.BoxerWhq(discourse_id, sent_index, word_indices, ans_types, drs1, variable, drs2)[source]

Bases: nltk.sem.boxer.BoxerIndexed

atoms()[source]
clean()[source]
renumber_sentences(f)[source]
class nltk.sem.boxer.NltkDrtBoxerDrsInterpreter(occur_index=False)[source]

Bases: object

interpret(ex)[source]
Parameters:ex – AbstractBoxerDrs
Returns:DrtExpression
class nltk.sem.boxer.PassthroughBoxerDrsInterpreter[source]

Bases: object

interpret(ex)[source]
exception nltk.sem.boxer.UnparseableInputException[source]

Bases: Exception

nltk.sem.chat80 module

Overview

Chat-80 was a natural language system which allowed the user to interrogate a Prolog knowledge base in the domain of world geography. It was developed in the early ‘80s by Warren and Pereira; see http://www.aclweb.org/anthology/J82-3002.pdf for a description and http://www.cis.upenn.edu/~pereira/oldies.html for the source files.

This module contains functions to extract data from the Chat-80 relation files (‘the world database’), and convert them into a format that can be incorporated in the FOL models of nltk.sem.evaluate. The code assumes that the Prolog input files are available in the NLTK corpora directory.

The Chat-80 World Database consists of the following files:

world0.pl
rivers.pl
cities.pl
countries.pl
contain.pl
borders.pl

This module uses a slightly modified version of world0.pl, in which a set of Prolog rules have been omitted. The modified file is named world1.pl. Currently, the file rivers.pl is not read in, since it uses a list rather than a string in the second field.

Reading Chat-80 Files

Chat-80 relations are like tables in a relational database. The relation acts as the name of the table; the first argument acts as the ‘primary key’; and subsequent arguments are further fields in the table. In general, the name of the table provides a label for a unary predicate whose extension is all the primary keys. For example, relations in cities.pl are of the following form:

'city(athens,greece,1368).'

Here, 'athens' is the key, and will be mapped to a member of the unary predicate city.

The fields in the table are mapped to binary predicates. The first argument of the predicate is the primary key, while the second argument is the data in the relevant field. Thus, in the above example, the third field is mapped to the binary predicate population_of, whose extension is a set of pairs such as '(athens, 1368)'.

An exception to this general framework is required by the relations in the files borders.pl and contains.pl. These contain facts of the following form:

'borders(albania,greece).'

'contains0(africa,central_africa).'

We do not want to form a unary concept out of the element in the first field of these records, and we want the label of the binary relation just to be 'border'/'contain' respectively.

In order to drive the extraction process, we use ‘relation metadata bundles’ which are Python dictionaries such as the following:

city = {'label': 'city',
        'closures': [],
        'schema': ['city', 'country', 'population'],
        'filename': 'cities.pl'}

According to this, the file city['filename'] contains a list of relational tuples (or more accurately, the corresponding strings in Prolog form) whose predicate symbol is city['label'] and whose relational schema is city['schema']. The notion of a closure is discussed in the next section.
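As a hedged sketch of how such a bundle drives extraction, the example below builds Concept objects directly from one relation file with clause2concepts (using the same label, schema and filename as the bundle above), and then folds the default relation set into a Valuation. It assumes the chat80 Prolog files are available through nltk.data.

from nltk.sem import chat80

# Build Concept objects from cities.pl, as described by the metadata bundle above.
concepts = chat80.clause2concepts('cities.pl', 'city',
                                  ['city', 'country', 'population'])
for c in concepts:
    print(c.prefLabel, c.arity)

# Or extract the default relation set and fold it into a Valuation.
val = chat80.make_valuation(chat80.concepts(), read=True)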

Concepts

In order to encapsulate the results of the extraction, a class of Concept objects is introduced. A Concept object has a number of attributes, in particular a prefLabel and extension, which make it easier to inspect the output of the extraction. In addition, the extension can be further processed: in the case of the 'border' relation, we check that the relation is symmetric, and in the case of the 'contain' relation, we carry out the transitive closure. The closure properties associated with a concept are indicated in the relation metadata, as noted earlier.

The extension of a Concept object is then incorporated into a Valuation object.

Persistence

The functions val_dump and val_load are provided to allow a valuation to be stored in a persistent database and re-loaded, rather than having to be re-computed each time.

Individuals and Lexical Items

As well as deriving relations from the Chat-80 data, we also create a set of individual constants, one for each entity in the domain. The individual constants are string-identical to the entities. For example, given a data item such as 'zloty', we add to the valuation a pair ('zloty', 'zloty'). In order to parse English sentences that refer to these entities, we also create a lexical item such as the following for each individual constant:

PropN[num=sg, sem=<\P.(P zloty)>] -> 'Zloty'

The set of rules is written to the file chat_pnames.cfg in the current directory.

class nltk.sem.chat80.Concept(prefLabel, arity, altLabels=[], closures=[], extension=set())[source]

Bases: object

A Concept class, loosely based on SKOS (http://www.w3.org/TR/swbp-skos-core-guide/).

augment(data)[source]

Add more data to the Concept’s extension set.

Parameters:data (string or pair of strings) – a new semantic value
Return type:set
close()[source]

Close a binary relation in the Concept’s extension set.

Returns:a new extension for the Concept in which the relation is closed under a given property
unicode_repr()
nltk.sem.chat80.binary_concept(label, closures, subj, obj, records)[source]

Make a binary concept out of the primary key and another field in a record.

A record is a list of entities in some relation, such as ['france', 'paris'], where 'france' is acting as the primary key, and 'paris' stands in the 'capital_of' relation to 'france'.

More generally, given a record such as ['a', 'b', 'c'], where label is bound to 'B', and obj bound to 1, the derived binary concept will have label 'B_of', and its extension will be a set of pairs such as ('a', 'b').

Parameters:
  • label (str) – the base part of the preferred label for the concept
  • closures (list) – closure properties for the extension of the concept
  • subj (int) – position in the record of the subject of the predicate
  • obj (int) – position in the record of the object of the predicate
  • records (list of lists) – a list of records
Returns:

Concept of arity 2

Return type:

Concept

nltk.sem.chat80.cities2table(filename, rel_name, dbname, verbose=False, setup=False)[source]

Convert a file of Prolog clauses into a database table.

This is not generic, since it doesn’t allow arbitrary schemas to be set as a parameter.

Intended usage:

cities2table('cities.pl', 'city', 'city.db', verbose=True, setup=True)
Parameters:
  • filename (str) – filename containing the relations
  • rel_name (str) – name of the relation
  • dbname – filename of persistent store
nltk.sem.chat80.clause2concepts(filename, rel_name, schema, closures=[])[source]

Convert a file of Prolog clauses into a list of Concept objects.

Parameters:
  • filename (str) – filename containing the relations
  • rel_name (str) – name of the relation
  • schema (list) – the schema used in a set of relational tuples
  • closures (list) – closure properties for the extension of the concept
Returns:

a list of Concept objects

Return type:

list

nltk.sem.chat80.concepts(items=('borders', 'circle_of_lat', 'circle_of_long', 'city', 'contains', 'continent', 'country', 'ocean', 'region', 'sea'))[source]

Build a list of concepts corresponding to the relation names in items.

Parameters:items (list(str)) – names of the Chat-80 relations to extract
Returns:the Concept objects which are extracted from the relations
Return type:list(Concept)
nltk.sem.chat80.label_indivs(valuation, lexicon=False)[source]

Assign individual constants to the individuals in the domain of a Valuation.

Given a valuation with an entry of the form {'rel': {'a': True}}, add a new entry {'a': 'a'}.

Return type:Valuation
nltk.sem.chat80.main()[source]
nltk.sem.chat80.make_lex(symbols)[source]

Create lexical CFG rules for each individual symbol.

Given a valuation with an entry of the form {'zloty': 'zloty'}, create a lexical rule for the proper name ‘Zloty’.

Parameters:symbols (sequence -- set(str)) – a list of individual constants in the semantic representation
Return type:list(str)
nltk.sem.chat80.make_valuation(concepts, read=False, lexicon=False)[source]

Convert a list of Concept objects into a list of (label, extension) pairs; optionally create a Valuation object.

Parameters:
  • concepts (list(Concept)) – concepts
  • read (bool) – if True, (symbol, set) pairs are read into a Valuation
Return type:

list or Valuation

nltk.sem.chat80.process_bundle(rels)[source]

Given a list of relation metadata bundles, make a corresponding dictionary of concepts, indexed by the relation name.

Parameters:rels (list(dict)) – bundle of metadata needed for constructing a concept
Returns:a dictionary of concepts, indexed by the relation name.
Return type:dict(str): Concept
nltk.sem.chat80.sql_demo()[source]

Print out every row from the ‘city.db’ database.

nltk.sem.chat80.sql_query(dbname, query)[source]

Execute an SQL query over a database.

Parameters:
  • dbname (str) – filename of persistent store
  • query (str) – SQL query
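A brief sketch of querying the persistent store; the database path is the one used by sql_demo, and the table layout (a city_table with City, Country and Population columns, as built by cities2table in the NLTK distribution) is assumed to be present in your NLTK data.

from nltk.sem.chat80 import sql_query

q = "SELECT City, Population FROM city_table WHERE Country = 'china'"
for row in sql_query('corpora/city_database/city.db', q):
    print(row)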

nltk.sem.chat80.unary_concept(label, subj, records)[source]

Make a unary concept out of the primary key in a record.

A record is a list of entities in some relation, such as ['france', 'paris'], where 'france' is acting as the primary key.

Parameters:
  • label (string) – the preferred label for the concept
  • subj (int) – position in the record of the subject of the predicate
  • records (list of lists) – a list of records
Returns:

Concept of arity 1

Return type:

Concept

nltk.sem.chat80.val_dump(rels, db)[source]

Make a Valuation from a list of relation metadata bundles and dump to persistent database.

Parameters:
  • rels (list of dict) – bundle of metadata needed for constructing a concept
  • db (str) – name of file to which data is written. The suffix ‘.db’ will be automatically appended.
nltk.sem.chat80.val_load(db)[source]

Load a Valuation from a persistent database.

Parameters:db (str) – name of file from which data is read. The suffix ‘.db’ should be omitted from the name.

nltk.sem.cooper_storage module

class nltk.sem.cooper_storage.CooperStore(featstruct)[source]

Bases: object

A container for handling quantifier ambiguity via Cooper storage.

s_retrieve(trace=False)[source]

Carry out S-Retrieval of binding operators in store. If hack=True, serialize the bindop and core as strings and reparse. Ugh.

Each permutation of the store (i.e. list of binding operators) is taken to be a possible scoping of quantifiers. We iterate through the binding operators in each permutation, and successively apply them to the current term, starting with the core semantic representation, working from the inside out.

Binding operators are of the form:

bo(\P.all x.(man(x) -> P(x)),z1)
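A short sketch of the retrieval workflow, assuming the bundled grammar grammars/book_grammars/storage.fcfg (the grammar used by demo()); the sentence is illustrative.

from nltk.sem import cooper_storage as cs

trees = cs.parse_with_bindops('every girl chases a dog',
                              grammar='grammars/book_grammars/storage.fcfg')
semrep = trees[0].label()['SEM']
store = cs.CooperStore(semrep)
store.s_retrieve(trace=False)
for reading in store.readings:   # one formula per quantifier scoping
    print(reading)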
nltk.sem.cooper_storage.demo()[source]
nltk.sem.cooper_storage.parse_with_bindops(sentence, grammar=None, trace=0)[source]

Use a grammar with Binding Operators to parse a sentence.

nltk.sem.drt module

exception nltk.sem.drt.AnaphoraResolutionException[source]

Bases: Exception

class nltk.sem.drt.DRS(refs, conds, consequent=None)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.Expression

A Discourse Representation Structure.

eliminate_equality()[source]
fol()[source]
free()[source]
See:Expression.free()
get_refs(recursive=False)[source]
See:AbstractExpression.get_refs()
replace(variable, expression, replace_bound=False, alpha_convert=True)[source]

Replace all instances of variable v with expression E in self, where v is free in self.

unicode_repr()
visit(function, combinator)[source]
See:Expression.visit()
visit_structured(function, combinator)[source]
See:Expression.visit_structured()
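A small sketch of building and converting DRSs with the string syntax accepted by DrtExpression.fromstring (see DrtTokens below); the predicates are illustrative.

from nltk.sem.drt import DrtExpression

d1 = DrtExpression.fromstring('([x],[man(x), walks(x)])')
d2 = DrtExpression.fromstring('([y],[woman(y), runs(y)])')
merged = (d1 + d2).simplify()   # '+' builds a DrtConcatenation; simplify() merges the boxes
print(merged)
print(merged.fol())             # the first-order translation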
class nltk.sem.drt.DrsDrawer(drs, size_canvas=True, canvas=None)[source]

Bases: object

BUFFER = 3
OUTERSPACE = 6
TOPSPACE = 10
draw(x=6, y=10)[source]

Draw the DRS

class nltk.sem.drt.DrtAbstractVariableExpression(variable)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.AbstractVariableExpression

eliminate_equality()[source]
fol()[source]
get_refs(recursive=False)[source]
See:AbstractExpression.get_refs()
class nltk.sem.drt.DrtApplicationExpression(function, argument)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.ApplicationExpression

fol()[source]
get_refs(recursive=False)[source]
See:AbstractExpression.get_refs()
class nltk.sem.drt.DrtBinaryExpression(first, second)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.BinaryExpression

get_refs(recursive=False)[source]
See:AbstractExpression.get_refs()
class nltk.sem.drt.DrtBooleanExpression(first, second)[source]

Bases: nltk.sem.drt.DrtBinaryExpression, nltk.sem.logic.BooleanExpression

class nltk.sem.drt.DrtConcatenation(first, second, consequent=None)[source]

Bases: nltk.sem.drt.DrtBooleanExpression

DRS of the form ‘(DRS + DRS)’

eliminate_equality()[source]
fol()[source]
getOp()[source]
get_refs(recursive=False)[source]
See:AbstractExpression.get_refs()
replace(variable, expression, replace_bound=False, alpha_convert=True)[source]

Replace all instances of variable v with expression E in self, where v is free in self.

simplify()[source]
unicode_repr()
visit(function, combinator)[source]
See:Expression.visit()
class nltk.sem.drt.DrtConstantExpression(variable)[source]

Bases: nltk.sem.drt.DrtAbstractVariableExpression, nltk.sem.logic.ConstantExpression

class nltk.sem.drt.DrtEqualityExpression(first, second)[source]

Bases: nltk.sem.drt.DrtBinaryExpression, nltk.sem.logic.EqualityExpression

fol()[source]
class nltk.sem.drt.DrtEventVariableExpression(variable)[source]

Bases: nltk.sem.drt.DrtIndividualVariableExpression, nltk.sem.logic.EventVariableExpression

class nltk.sem.drt.DrtExpression[source]

Bases: object

This is the base abstract DRT Expression from which every DRT Expression extends.

applyto(other)[source]
draw()[source]
eliminate_equality()[source]
equiv(other, prover=None)[source]

Check for logical equivalence. Pass the expression (self <-> other) to the theorem prover. If the prover says it is valid, then the self and other are equal.

Parameters:
  • other – an DrtExpression to check equality against
  • prover – a nltk.inference.api.Prover
classmethod fromstring(s)[source]
get_refs(recursive=False)[source]

Return the set of discourse referents in this DRS.

Parameters:recursive – bool Also find discourse referents in subterms?
Returns:list of Variable objects

is_pronoun_function()[source]

Is self of the form “PRO(x)”?

make_EqualityExpression(first, second)[source]
make_VariableExpression(variable)[source]
pretty_format()[source]

Draw the DRS.

Returns:the pretty print string

pretty_print()[source]
resolve_anaphora()[source]
type
typecheck(signature=None)[source]
class nltk.sem.drt.DrtFunctionVariableExpression(variable)[source]

Bases: nltk.sem.drt.DrtAbstractVariableExpression, nltk.sem.logic.FunctionVariableExpression

class nltk.sem.drt.DrtIndividualVariableExpression(variable)[source]

Bases: nltk.sem.drt.DrtAbstractVariableExpression, nltk.sem.logic.IndividualVariableExpression

class nltk.sem.drt.DrtLambdaExpression(variable, term)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.LambdaExpression

alpha_convert(newvar)[source]

Rename all occurrences of the variable introduced by this variable binder in the expression to newvar.

Parameters:newvar – Variable, for the new variable

fol()[source]
class nltk.sem.drt.DrtNegatedExpression(term)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.NegatedExpression

fol()[source]
get_refs(recursive=False)[source]
See:AbstractExpression.get_refs()
class nltk.sem.drt.DrtOrExpression(first, second)[source]

Bases: nltk.sem.drt.DrtBooleanExpression, nltk.sem.logic.OrExpression

fol()[source]
class nltk.sem.drt.DrtParser[source]

Bases: nltk.sem.logic.LogicParser

A lambda calculus expression parser.

get_BooleanExpression_factory(tok)[source]

This method serves as a hook for other logic parsers that have different boolean operators

get_all_symbols()[source]

This method exists to be overridden

handle(tok, context)[source]

This method is intended to be overridden for logics that use different operators or expressions

handle_DRS(tok, context)[source]
handle_conds(context)[source]
handle_prop(tok, context)[source]
handle_refs()[source]
isvariable(tok)[source]
make_ApplicationExpression(function, argument)[source]
make_BooleanExpression(factory, first, second)[source]
make_EqualityExpression(first, second)[source]

This method serves as a hook for other logic parsers that have different equality expression classes

make_LambdaExpression(variables, term)[source]
make_NegatedExpression(expression)[source]
make_VariableExpression(name)[source]
class nltk.sem.drt.DrtProposition(variable, drs)[source]

Bases: nltk.sem.drt.DrtExpression, nltk.sem.logic.Expression

eliminate_equality()[source]
fol()[source]
get_refs(recursive=False)[source]
replace(variable, expression, replace_bound=False, alpha_convert=True)[source]
unicode_repr()
visit(function, combinator)[source]
See:Expression.visit()
visit_structured(function, combinator)[source]
See:Expression.visit_structured()
class nltk.sem.drt.DrtTokens[source]

Bases: nltk.sem.logic.Tokens

CLOSE_BRACKET = ']'
COLON = ':'
DRS = 'DRS'
DRS_CONC = '+'
OPEN_BRACKET = '['
PRONOUN = 'PRO'
PUNCT = ['+', '[', ']', ':']
SYMBOLS = ['&', '^', '|', '->', '=>', '<->', '<=>', '=', '==', '!=', '\\', '.', '(', ')', ',', '-', '!', '+', '[', ']', ':']
TOKENS = ['and', '&', '^', 'or', '|', 'implies', '->', '=>', 'iff', '<->', '<=>', '=', '==', '!=', 'some', 'exists', 'exist', 'all', 'forall', '\\', '.', '(', ')', ',', 'not', '-', '!', 'DRS', '+', '[', ']', ':']
nltk.sem.drt.DrtVariableExpression(variable)[source]

This is a factory method that instantiates and returns a subtype of DrtAbstractVariableExpression appropriate for the given variable.

class nltk.sem.drt.PossibleAntecedents[source]

Bases: list, nltk.sem.drt.DrtExpression, nltk.sem.logic.Expression

free()[source]

Set of free variables.

replace(variable, expression, replace_bound=False, alpha_convert=True)[source]

Replace all instances of variable v with expression E in self, where v is free in self.

unicode_repr

Return repr(self).

nltk.sem.drt.demo()[source]
nltk.sem.drt.resolve_anaphora(expression, trail=[])[source]
nltk.sem.drt.test_draw()[source]

nltk.sem.drt_glue_demo module

class nltk.sem.drt_glue_demo.DrsWidget(canvas, drs, **attribs)[source]

Bases: object

clear()[source]
draw()[source]
class nltk.sem.drt_glue_demo.DrtGlueDemo(examples)[source]

Bases: object

about(*e)[source]
destroy(*e)[source]
mainloop(*args, **kwargs)[source]

Enter the Tkinter mainloop. This function must be called if this demo is created from a non-interactive program (e.g. from a script); otherwise, the demo will close as soon as the script completes.

next(*e)[source]
postscript(*e)[source]
prev(*e)[source]
resize(size=None)[source]
nltk.sem.drt_glue_demo.demo()[source]

nltk.sem.evaluate module

This module provides data structures for representing first-order models.

class nltk.sem.evaluate.Assignment(domain, assign=None)[source]

Bases: dict

A dictionary which represents an assignment of values to variables.

An assignment can only assign values from its domain.

If an unknown expression a is passed to a model M’s interpretation function i, i will first check whether M’s valuation assigns an interpretation to a as a constant, and if this fails, i will delegate the interpretation of a to g. g only assigns values to individual variables (i.e., members of the class IndividualVariableExpression in the logic module). If a variable is not assigned a value by g, it will raise an Undefined exception.

A variable Assignment is a mapping from individual variables to entities in the domain. Individual variables are usually indicated with the letters 'x', 'y', 'w' and 'z', optionally followed by an integer (e.g., 'x0', 'y332'). Assignments are created using the Assignment constructor, which also takes the domain as a parameter.

>>> from nltk.sem.evaluate import Assignment
>>> dom = set(['u1', 'u2', 'u3', 'u4'])
>>> g3 = Assignment(dom, [('x', 'u1'), ('y', 'u2')])
>>> g3 == {'x': 'u1', 'y': 'u2'}
True

There is also a print format for assignments which uses a notation closer to that in logic textbooks:

>>> print(g3)
g[u1/x][u2/y]

It is also possible to update an assignment using the add method:

>>> dom = set(['u1', 'u2', 'u3', 'u4'])
>>> g4 = Assignment(dom)
>>> g4.add('x', 'u1')
{'x': 'u1'}

With no arguments, purge() is equivalent to clear() on a dictionary:

>>> g4.purge()
>>> g4
{}
Parameters:
  • domain (set) – the domain of discourse
  • assign (list) – a list of (varname, value) associations
add(var, val)[source]

Add a new variable-value pair to the assignment, and update self.variant.

copy()[source]
purge(var=None)[source]

Remove one or all keys (i.e. logic variables) from an assignment, and update self.variant.

Parameters:var – a Variable acting as a key for the assignment.
unicode_repr

Return repr(self).

exception nltk.sem.evaluate.Error[source]

Bases: Exception

class nltk.sem.evaluate.Model(domain, valuation)[source]

Bases: object

A first order model is a domain D of discourse and a valuation V.

A domain D is a set, and a valuation V is a map that associates expressions with values in the model. The domain of V should be a subset of D.

Construct a new Model.

Parameters:
  • domain (set) – A set of entities representing the domain of discourse of the model.
  • valuation (Valuation) – the valuation of the model.
  • prop – If this is set, then we are building a propositional model and don’t require the domain of V to be a subset of D.
evaluate(expr, g, trace=None)[source]

Read input expressions, and provide a handler for satisfy that blocks further propagation of the Undefined error.

Parameters:
  • expr – An Expression of logic.
  • g (Assignment) – an assignment to individual variables.
Return type:bool or ‘Undefined’
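For illustration, here is a hedged sketch of the evaluation workflow; the entities, relations and sentences are invented, and the valuation string uses the format read by Valuation.fromstring (documented below).

from nltk.sem.evaluate import Assignment, Model, Valuation

v = Valuation.fromstring("""
john => b1
mary => g1
dog => {d1}
see => {(b1, g1), (g1, d1)}
""")
m = Model(v.domain, v)
g = Assignment(v.domain)

print(m.evaluate('see(john, mary)', g))                    # True
print(m.evaluate('exists x.(dog(x) & see(mary, x))', g))   # True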

i(parsed, g, trace=False)[source]

An interpretation function.

Assuming that parsed is atomic:

  • if parsed is a non-logical constant, calls the valuation V
  • else if parsed is an individual variable, calls assignment g
  • else returns Undefined.
Parameters:
  • parsed – an Expression of logic.
  • g (Assignment) – an assignment to individual variables.
Returns:

a semantic value

satisfiers(parsed, varex, g, trace=None, nesting=0)[source]

Generate the entities from the model’s domain that satisfy an open formula.

Parameters:
  • parsed – an open formula
  • varex – the relevant free individual variable in parsed
  • g (Assignment) – a variable assignment
Returns:

a set of the entities that satisfy parsed.

satisfy(parsed, g, trace=None)[source]

Recursive interpretation function for a formula of first-order logic.

Raises an Undefined error when parsed is an atomic string but is not a symbol or an individual variable.

Returns:

A truth value or Undefined if parsed is complex; calls the interpretation function i if parsed is atomic.

Parameters:
  • parsed – An expression of logic.
  • g (Assignment) – an assignment to individual variables.
unicode_repr()
exception nltk.sem.evaluate.Undefined[source]

Bases: nltk.sem.evaluate.Error

class nltk.sem.evaluate.Valuation(xs)[source]

Bases: dict

A dictionary which represents a model-theoretic Valuation of non-logical constants. Keys are strings representing the constants to be interpreted, and values correspond to individuals (represented as strings) and n-ary relations (represented as sets of tuples of strings).

An instance of Valuation will raise a KeyError exception (i.e., just behave like a standard dictionary) if indexed with an expression that is not in its list of symbols.

domain

Set-theoretic domain of the value-space of a Valuation.

classmethod fromstring(s)[source]
symbols

The non-logical constants which the Valuation recognizes.

unicode_repr

Return repr(self).

nltk.sem.evaluate.arity(rel)[source]

Check the arity of a relation.

Parameters:rel (set of tuples) – the relation to be checked
Return type:int

nltk.sem.evaluate.demo(num=0, trace=None)[source]

Run the example demos.

  • num = 1: propositional logic demo
  • num = 2: first order model demo (only if trace is set)
  • num = 3: first order sentences demo
  • num = 4: satisfaction of open formulas demo
  • any other value: run all the demos
Parameters:trace – trace = 1, or trace = 2 for more verbose tracing
nltk.sem.evaluate.foldemo(trace=None)[source]

Interpretation of closed expressions in a first-order model.

nltk.sem.evaluate.folmodel(quiet=False, trace=None)[source]

Example of a first-order model.

nltk.sem.evaluate.is_rel(s)[source]

Check whether a set represents a relation (of any arity).

Parameters:s (set) – a set containing tuples of str elements
Return type:bool
nltk.sem.evaluate.propdemo(trace=None)[source]

Example of a propositional model.

nltk.sem.evaluate.read_valuation(s, encoding=None)[source]

Convert a valuation string into a valuation.

Parameters:
  • s (str) – a valuation string
  • encoding (str) – the encoding of the input string, if it is binary
Returns:

a nltk.sem valuation

Return type:

Valuation

nltk.sem.evaluate.satdemo(trace=None)[source]

Satisfiers of an open formula in a first order model.

nltk.sem.evaluate.set2rel(s)[source]

Convert a set containing individuals (strings or numbers) into a set of unary tuples. Any tuples of strings already in the set are passed through unchanged.

For example:
  • set(['a', 'b']) => set([('a',), ('b',)])
  • set([3, 27]) => set([('3',), ('27',)])
Return type:set of tuple of str
nltk.sem.evaluate.trace(f, *args, **kw)[source]

nltk.sem.glue module

class nltk.sem.glue.DrtGlue(semtype_file=None, remove_duplicates=False, depparser=None, verbose=False)[source]

Bases: nltk.sem.glue.Glue

get_glue_dict()[source]
class nltk.sem.glue.DrtGlueDict(filename, encoding=None)[source]

Bases: nltk.sem.glue.GlueDict

get_GlueFormula_factory()[source]
class nltk.sem.glue.DrtGlueFormula(meaning, glue, indices=None)[source]

Bases: nltk.sem.glue.GlueFormula

make_LambdaExpression(variable, term)[source]
make_VariableExpression(name)[source]
class nltk.sem.glue.Glue(semtype_file=None, remove_duplicates=False, depparser=None, verbose=False)[source]

Bases: object

dep_parse(sentence)[source]

Return a dependency graph for the sentence.

Parameters:sentence (list(str)) – the sentence to be parsed
Return type:DependencyGraph
depgraph_to_glue(depgraph)[source]
get_glue_dict()[source]
get_pos_tagger()[source]
get_readings(agenda)[source]
gfl_to_compiled(gfl)[source]
parse_to_compiled(sentence)[source]
parse_to_meaning(sentence)[source]
train_depparser(depgraphs=None)[source]
class nltk.sem.glue.GlueDict(filename, encoding=None)[source]

Bases: dict

add_missing_dependencies(node, depgraph)[source]
find_label_name(name, node, depgraph, unique_index)[source]
get_GlueFormula_factory()[source]
get_glueformulas_from_semtype_entry(lookup, word, node, depgraph, counter)[source]
get_label(node)[source]

Pick an alphabetic character as identifier for an entity in the model.

Parameters:value (int) – where to index into the list of characters
get_meaning_formula(generic, word)[source]
Parameters:
  • generic – A meaning formula string containing the parameter “<word>”
  • word – The actual word to replace “<word>”

get_semtypes(node)[source]

Based on the node, return a list of plausible semtypes in order of plausibility.

initialize_labels(expr, node, depgraph, unique_index)[source]
lookup(node, depgraph, counter)[source]
lookup_unique(rel, node, depgraph)[source]

Lookup ‘key’. There should be exactly one item in the associated relation.

read_file(empty_first=True)[source]
to_glueformula_list(depgraph, node=None, counter=None, verbose=False)[source]
unicode_repr

Return repr(self).

class nltk.sem.glue.GlueFormula(meaning, glue, indices=None)[source]

Bases: object

applyto(arg)[source]

self = (\x.(walk x), (subj -o f))
arg = (john, subj)
returns ((walk john), f)
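A hedged sketch of this application pattern; the meaning and glue terms are passed as strings and parsed by the logic and linear-logic parsers, and the resource names are illustrative.

from nltk.sem.glue import GlueFormula

john = GlueFormula('john', 'subj')
walk = GlueFormula(r'\x.walk(x)', '(subj -o f)')
print(walk.applyto(john).simplify())   # roughly: walk(john) : f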

compile(counter=None)[source]

From Iddo Lev’s PhD Dissertation p108-109

lambda_abstract(other)[source]
make_LambdaExpression(variable, term)[source]
make_VariableExpression(name)[source]
simplify()[source]
unicode_repr()
nltk.sem.glue.demo(show_example=-1)[source]

nltk.sem.hole module

An implementation of the Hole Semantics model, following Blackburn and Bos, Representation and Inference for Natural Language (CSLI, 2005).

The semantic representations are built by the grammar hole.fcfg. This module contains driver code to read in sentences and parse them according to a hole semantics grammar.

After parsing, the semantic representation is in the form of an underspecified representation that is not easy to read. We use a “plugging” algorithm to convert that representation into first-order logic formulas.

class nltk.sem.hole.Constants[source]

Bases: object

ALL = 'ALL'
AND = 'AND'
EXISTS = 'EXISTS'
HOLE = 'HOLE'
IFF = 'IFF'
IMP = 'IMP'
LABEL = 'LABEL'
LEQ = 'LEQ'
MAP = {'ALL': <function Constants.<lambda>>, 'OR': <class 'nltk.sem.logic.OrExpression'>, 'PRED': <class 'nltk.sem.logic.ApplicationExpression'>, 'NOT': <class 'nltk.sem.logic.NegatedExpression'>, 'AND': <class 'nltk.sem.logic.AndExpression'>, 'IFF': <class 'nltk.sem.logic.IffExpression'>, 'EXISTS': <function Constants.<lambda>>, 'IMP': <class 'nltk.sem.logic.ImpExpression'>}
NOT = 'NOT'
OR = 'OR'
PRED = 'PRED'
class nltk.sem.hole.Constraint(lhs, rhs)[source]

Bases: object

This class represents a constraint of the form (L =< N), where L is a label and N is a node (a label or a hole).

unicode_repr()
class nltk.sem.hole.HoleSemantics(usr)[source]

Bases: object

This class holds the broken-down components of a hole semantics, i.e. it extracts the holes, labels, logic formula fragments and constraints out of a big conjunction of the kind produced by the hole semantics grammar. It then provides some operations on the semantics dealing with holes, labels and finding legal ways to plug holes with labels.

formula_tree(plugging)[source]

Return the first-order logic formula tree for this underspecified representation using the plugging given.

is_node(x)[source]

Return true if x is a node (label or hole) in this semantic representation.

pluggings()[source]

Calculate and return all the legal pluggings (mappings of labels to holes) of this semantics given the constraints.

nltk.sem.hole.hole_readings(sentence, grammar_filename=None, verbose=False)[source]

nltk.sem.lfg module

class nltk.sem.lfg.FStructure[source]

Bases: dict

pretty_format(indent=3)[source]
static read_depgraph(depgraph)[source]
safeappend(key, item)[source]

Append ‘item’ to the list at ‘key’. If no list exists for ‘key’, then construct one.

to_depgraph(rel=None)[source]
to_glueformula_list(glue_dict)[source]
unicode_repr()
nltk.sem.lfg.demo_read_depgraph()[source]

nltk.sem.linearlogic module

class nltk.sem.linearlogic.ApplicationExpression(function, argument, argument_indices=None)[source]

Bases: nltk.sem.linearlogic.Expression

simplify(bindings=None)[source]

Since function is an implication, return its consequent. There should be no need to check that the application is valid since the checking is done by the constructor.

Parameters:bindings – BindingDict A dictionary of bindings used to simplify
Returns:Expression
unicode_repr()
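A small sketch of application and simplification in this module; note that Expression here is nltk.sem.linearlogic.Expression (not nltk.sem.logic.Expression), and the atom names are illustrative.

from nltk.sem.linearlogic import Expression

read = Expression.fromstring
impl = read('(g -o f)')    # an implication between two constants
arg = read('g')
print(impl.applyto(arg).simplify())   # the consequent, f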
class nltk.sem.linearlogic.AtomicExpression(name, dependencies=None)[source]

Bases: nltk.sem.linearlogic.Expression

compile_neg(index_counter, glueFormulaFactory)[source]

From Iddo Lev’s PhD Dissertation p108-109

Parameters:
  • index_counter – Counter for unique indices
  • glueFormulaFactory – GlueFormula for creating new glue formulas
Returns:

(Expression,set) for the compiled linear logic and any newly created glue formulas

compile_pos(index_counter, glueFormulaFactory)[source]

From Iddo Lev’s PhD Dissertation p108-109

Parameters:
  • index_counter – Counter for unique indices
  • glueFormulaFactory – GlueFormula for creating new glue formulas
Returns:

(Expression,set) for the compiled linear logic and any newly created glue formulas

initialize_labels(fstruct)[source]
simplify(bindings=None)[source]

If ‘self’ is bound by ‘bindings’, return the atomic to which it is bound. Otherwise, return self.

Parameters:bindings – BindingDict A dictionary of bindings used to simplify
Returns:AtomicExpression
unicode_repr()
class nltk.sem.linearlogic.BindingDict(bindings=None)[source]

Bases: object

unicode_repr()
class nltk.sem.linearlogic.ConstantExpression(name, dependencies=None)[source]

Bases: nltk.sem.linearlogic.AtomicExpression

unify(other, bindings)[source]

If ‘other’ is a constant, then it must be equal to ‘self’. If ‘other’ is a variable, then it must not be bound to anything other than ‘self’.

Parameters:
  • other – Expression
  • bindings – BindingDict A dictionary of all current bindings
Returns:

BindingDict A new combined dictionary of ‘bindings’ and any new binding

Raises:

UnificationException – If ‘self’ and ‘other’ cannot be unified in the context of ‘bindings’

class nltk.sem.linearlogic.Expression[source]

Bases: object

applyto(other, other_indices=None)[source]
classmethod fromstring(s)[source]
unicode_repr()
class nltk.sem.linearlogic.ImpExpression(antecedent, consequent)[source]

Bases: nltk.sem.linearlogic.Expression

compile_neg(index_counter, glueFormulaFactory)[source]

From Iddo Lev’s PhD Dissertation p108-109

Parameters:
  • index_counter – Counter for unique indices
  • glueFormulaFactory – GlueFormula for creating new glue formulas
Returns:

(Expression,list of GlueFormula) for the compiled linear logic and any newly created glue formulas

compile_pos(index_counter, glueFormulaFactory)[source]

From Iddo Lev’s PhD Dissertation p108-109

Parameters:
  • index_counter – Counter for unique indices
  • glueFormulaFactory – GlueFormula for creating new glue formulas
Returns:

(Expression,set) for the compiled linear logic and any newly created glue formulas

initialize_labels(fstruct)[source]
simplify(bindings=None)[source]
unicode_repr()
unify(other, bindings)[source]

Both the antecedent and consequent of ‘self’ and ‘other’ must unify.

Parameters:
  • other – ImpExpression
  • bindings – BindingDict A dictionary of all current bindings
Returns:

BindingDict A new combined dictionary of ‘bindings’ and any new bindings

Raises:

UnificationException – If ‘self’ and ‘other’ cannot be unified in the context of ‘bindings’

exception nltk.sem.linearlogic.LinearLogicApplicationException[source]

Bases: Exception

class nltk.sem.linearlogic.LinearLogicParser[source]

Bases: nltk.sem.logic.LogicParser

A linear logic expression parser.

attempt_ApplicationExpression(expression, context)[source]

Attempt to make an application expression. If the next tokens are an argument in parens, then the argument expression is a function being applied to the arguments. Otherwise, return the argument expression.

get_BooleanExpression_factory(tok)[source]
get_all_symbols()[source]
handle(tok, context)[source]
make_BooleanExpression(factory, first, second)[source]
make_VariableExpression(name)[source]
class nltk.sem.linearlogic.Tokens[source]

Bases: object

CLOSE = ')'
IMP = '-o'
OPEN = '('
PUNCT = ['(', ')']
TOKENS = ['(', ')', '-o']
exception nltk.sem.linearlogic.UnificationException(a, b, bindings)[source]

Bases: Exception

exception nltk.sem.linearlogic.VariableBindingException[source]

Bases: Exception

class nltk.sem.linearlogic.VariableExpression(name, dependencies=None)[source]

Bases: nltk.sem.linearlogic.AtomicExpression

unify(other, bindings)[source]

‘self’ must not be bound to anything other than ‘other’.

Parameters:
  • other – Expression
  • bindings – BindingDict A dictionary of all current bindings
Returns:

BindingDict A new combined dictionary of ‘bindings’ and the new binding

Raises:

UnificationException – If ‘self’ and ‘other’ cannot be unified in the context of ‘bindings’

nltk.sem.linearlogic.demo()[source]

nltk.sem.logic module

A version of first order predicate logic, built on top of the typed lambda calculus.
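A short sketch of reading and manipulating expressions with this module (the syntax follows the Tokens table below; the predicate and constant names are illustrative).

from nltk.sem.logic import Expression

read_expr = Expression.fromstring
e = read_expr(r'\x.(walk(x) & chew_gum(x))(gerald)')
print(e.simplify())                              # beta-reduced: (walk(gerald) & chew_gum(gerald))
print(read_expr('-walk(gerald)').negate())       # negation removed: walk(gerald)
print(read_expr('all x.(man(x) -> mortal(x))').free())   # set(): no free variables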

class nltk.sem.logic.AbstractVariableExpression(variable)[source]

Bases: nltk.sem.logic.Expression

This class represents a variable to be used as a predicate or entity

findtype(variable)[source]

:see Expression.findtype()

predicates()[source]
See:Expression.predicates()
replace(variable, expression, replace_bound=False, alpha_convert=True)[source]
See:Expression.replace()
simplify()[source]
unicode_repr()
class nltk.sem.logic.AllExpression(variable, term)[source]

Bases: nltk.sem.logic.QuantifiedExpression

getQuantifier()[source]
class nltk.sem.logic.AndExpression(first, second)[source]

Bases: nltk.sem.logic.BooleanExpression

This class represents conjunctions

getOp()[source]
class nltk.sem.logic.AnyType[source]

Bases: nltk.sem.logic.BasicType, nltk.sem.logic.ComplexType

first
matches(other)[source]
resolve(other)[source]
second
str()[source]
unicode_repr()
class nltk.sem.logic.ApplicationExpression(function, argument)[source]

Bases: nltk.sem.logic.Expression

This class is used to represent two related types of logical expressions.

The first is a Predicate Expression, such as “P(x,y)”. A predicate expression is comprised of a FunctionVariableExpression or ConstantExpression as the predicate and a list of Expressions as the arguments.

The second is an application of one expression to another, such as “(\x.dog(x))(fido)”.

The reason Predicate Expressions are treated as Application Expressions is that the Variable Expression predicate of the expression may be replaced with another Expression, such as a LambdaExpression, which would mean that the Predicate should be thought of as being applied to the arguments.

The logical expression reader will always curry arguments in an application expression. So, “\x y.see(x,y)(john,mary)” will be represented internally as “((\x y.(see(x))(y))(john))(mary)”. This simplifies the internals since there will always be exactly one argument in an application.

The str() method will usually print the curried forms of application expressions. The one exception is when the application expression is really a predicate expression (i.e., the underlying function is an AbstractVariableExpression). This means that the example from above will be returned as “(\x y.see(x,y)(john))(mary)”.
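A sketch illustrating the currying and uncurrying behaviour described above (the predicate and constants are illustrative).

from nltk.sem.logic import Expression

e = Expression.fromstring(r'\x y.see(x,y)(john,mary)')
s = e.simplify()        # beta-reduces to the predicate expression see(john,mary)
print(s.pred)           # the uncurried base function: see
print(s.args)           # the uncurried argument list (john and mary)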

args

Return uncurried arg-list

constants()[source]
See:Expression.constants()
findtype(variable)[source]

:see Expression.findtype()

is_atom()[source]

Is this expression an atom (as opposed to a lambda expression applied to a term)?

pred

Return uncurried base-function. If this is an atom, then the result will be a variable expression. Otherwise, it will be a lambda expression.

predicates()[source]
See:Expression.predicates()
simplify()[source]
type
uncurry()[source]

Uncurry this application expression

return: A tuple (base-function, arg-list)

unicode_repr()
visit(function, combinator)[source]
See:Expression.visit()
class nltk.sem.logic.BasicType[source]

Bases: nltk.sem.logic.Type

matches(other)[source]
resolve(other)[source]
class nltk.sem.logic.BinaryExpression(first, second)[source]

Bases: nltk.sem.logic.Expression

findtype(variable)[source]

:see Expression.findtype()

type
unicode_repr()
visit(function, combinator)[source]
See:Expression.visit()
class nltk.sem.logic.BooleanExpression(first, second)[source]

Bases: nltk.sem.logic.BinaryExpression

class nltk.sem.logic.ComplexType(first, second)[source]

Bases: nltk.sem.logic.Type

matches(other)[source]
resolve(other)[source]
str()[source]
unicode_repr()
class nltk.sem.logic.ConstantExpression(variable)[source]

Bases: nltk.sem.logic.AbstractVariableExpression

This class represents variables that do not take the form of a single character followed by zero or more digits.

constants()[source]
See:Expression.constants()
free()[source]
See:Expression.free()
type = e
class nltk.sem.logic.EntityType[source]

Bases: nltk.sem.logic.BasicType

str()[source]
unicode_repr()
class nltk.sem.logic.EqualityExpression(first, second)[source]

Bases: nltk.sem.logic.BinaryExpression

This class represents equality expressions like “(x = y)”.

getOp()[source]
class nltk.sem.logic.EventType[source]

Bases: nltk.sem.logic.BasicType

str()[source]
unicode_repr()
class nltk.sem.logic.EventVariableExpression(variable)[source]

Bases: nltk.sem.logic.IndividualVariableExpression

This class represents variables that take the form of a single lowercase ‘e’ character followed by zero or more digits.

type = v
class nltk.sem.logic.ExistsExpression(variable, term)[source]

Bases: nltk.sem.logic.QuantifiedExpression

getQuantifier()[source]
exception nltk.sem.logic.ExpectedMoreTokensException(index, message=None)[source]

Bases: nltk.sem.logic.LogicalExpressionException

class nltk.sem.logic.Expression[source]

Bases: nltk.sem.logic.SubstituteBindingsI

This is the base abstract object for all logical expressions

applyto(other)[source]
constants()[source]

Return a set of individual constants (non-predicates).

Returns:set of Variable objects

equiv(other, prover=None)[source]

Check for logical equivalence. Pass the expression (self <-> other) to the theorem prover. If the prover says it is valid, then the self and other are equal.

Parameters:
  • other – an Expression to check equality against
  • prover – a nltk.inference.api.Prover
findtype(variable)[source]

Find the type of the given variable as it is used in this expression. For example, finding the type of “P” in “P(x) & Q(x,y)” yields “<e,t>”

Parameters:variable – Variable
free()[source]

Return a set of all the free (non-bound) variables. This includes both individual and predicate variables, but not constants.

Returns:set of Variable objects

classmethod fromstring(s, type_check=False, signature=None)[source]
make_VariableExpression(variable)[source]
negate()[source]

If this is a negated expression, remove the negation. Otherwise add a negation.

normalize(newvars=None)[source]

Rename auto-generated unique variables

predicates()[source]

Return a set of predicates (constants, not variables).

Returns:set of Variable objects

replace(variable, expression, replace_bound=False, alpha_convert=True)[source]

Replace every instance of ‘variable’ with ‘expression’.

Parameters:
  • variable – Variable The variable to replace
  • expression – Expression The expression with which to replace it
  • replace_bound – bool Should bound variables be replaced?
  • alpha_convert – bool Alpha convert automatically to avoid name clashes?

simplify()[source]
Returns:beta-converted version of this expression
substitute_bindings(bindings)[source]
typecheck(signature=None)[source]

Infer and check types. Raise exceptions if necessary.

Parameters:signature – dict that maps variable names to types (or string representations of types)
Returns:the signature, plus any additional type mappings
unicode_repr()
variables()[source]

Return a set of all the variables for binding substitution. The variables returned include all free (non-bound) individual variables and any variable starting with ‘?’ or ‘@’.

Returns:set of Variable objects

visit(function, combinator)[source]

Recursively visit subexpressions. Apply ‘function’ to each subexpression and pass the result of each function application to the ‘combinator’ for aggregation:

return combinator(map(function, self.subexpressions))

Bound variables are neither applied upon by the function nor given to the combinator.

Parameters:
  • function – Function<Expression,T> to call on each subexpression
  • combinator – Function<list<T>,R> to combine the results of the function calls
Returns:result of combination R

visit_structured(function, combinator)[source]

Recursively visit subexpressions. Apply ‘function’ to each subexpression and pass the result of each function application to the ‘combinator’ for aggregation. The combinator must have the same signature as the constructor. The function is not applied to bound variables, but they are passed to the combinator.

Parameters:
  • function – Function to call on each subexpression
  • combinator – Function with the same signature as the constructor, to combine the results of the function calls
Returns:result of combination

class nltk.sem.logic.FunctionVariableExpression(variable)[source]

Bases: nltk.sem.logic.AbstractVariableExpression

This class represents variables that take the form of a single uppercase character followed by zero or more digits.

constants()[source]
See:Expression.constants()
free()[source]
See:Expression.free()
type = ?
class nltk.sem.logic.IffExpression(first, second)[source]

Bases: nltk.sem.logic.BooleanExpression

This class represents biconditionals

getOp()[source]
exception nltk.sem.logic.IllegalTypeException(expression, other_type, allowed_type)[source]

Bases: nltk.sem.logic.TypeException

class nltk.sem.logic.ImpExpression(first, second)[source]

Bases: nltk.sem.logic.BooleanExpression

This class represents implications

getOp()[source]
exception nltk.sem.logic.InconsistentTypeHierarchyException(variable, expression=None)[source]

Bases: nltk.sem.logic.TypeException

class nltk.sem.logic.IndividualVariableExpression(variable)[source]

Bases: nltk.sem.logic.AbstractVariableExpression

This class represents variables that take the form of a single lowercase character (other than ‘e’) followed by zero or more digits.

constants()[source]
See:Expression.constants()
free()[source]
See:Expression.free()
type
class nltk.sem.logic.LambdaExpression(variable, term)[source]

Bases: nltk.sem.logic.VariableBinderExpression

type
unicode_repr()
class nltk.sem.logic.LogicParser(type_check=False)[source]

Bases: object

A lambda calculus expression parser.

assertNextToken(expected)[source]
assertToken(tok, expected)[source]
attempt_ApplicationExpression(expression, context)[source]

Attempt to make an application expression. If the next tokens are a list of arguments in parens, then the argument expression is a function being applied to the arguments. Otherwise, return the argument expression.

attempt_BooleanExpression(expression, context)[source]

Attempt to make a boolean expression. If the next token is a boolean operator, then a BooleanExpression will be returned. Otherwise, the parameter will be returned.

attempt_EqualityExpression(expression, context)[source]

Attempt to make an equality expression. If the next token is an equality operator, then an EqualityExpression will be returned. Otherwise, the parameter will be returned.

attempt_adjuncts(expression, context)[source]
get_BooleanExpression_factory(tok)[source]

This method serves as a hook for other logic parsers that have different boolean operators

get_QuantifiedExpression_factory(tok)[source]

This method serves as a hook for other logic parsers that have different quantifiers

get_all_symbols()[source]

This method exists to be overridden

get_next_token_variable(description)[source]
handle(tok, context)[source]

This method is intended to be overridden for logics that use different operators or expressions

handle_lambda(tok, context)[source]
handle_negation(tok, context)[source]
handle_open(tok, context)[source]
handle_quant(tok, context)[source]
handle_variable(tok, context)[source]
has_priority(operation, context)[source]
inRange(location)[source]

Return TRUE if the given location is within the buffer

isvariable(tok)[source]
make_ApplicationExpression(function, argument)[source]
make_BooleanExpression(factory, first, second)[source]
make_EqualityExpression(first, second)[source]

This method serves as a hook for other logic parsers that have different equality expression classes

make_LambdaExpression(variable, term)[source]
make_NegatedExpression(expression)[source]
make_QuanifiedExpression(factory, variable, term)[source]
make_VariableExpression(name)[source]
parse(data, signature=None)[source]

Parse the expression.

Parameters:
  • data – str for the input to be parsed
  • signature – dict<str, str> that maps variable names to type strings
Returns:a parsed Expression

process(data)[source]

Split the data into tokens

process_next_expression(context)[source]

Parse the next complete expression from the stream and return it.

process_quoted_token(data_idx, data)[source]
token(location=None)[source]

Get the next waiting token. If a location is given, then return the token at currentIndex+location without advancing currentIndex; setting it gives lookahead/lookback capability.

type_check = None

A list of tuples of quote characters. The 4-tuple is comprised of the start character, the end character, the escape character, and a boolean indicating whether the quotes should be included in the result. Quotes are used to signify that a token should be treated as atomic, ignoring any special characters within the token. The escape character allows the quote end character to be used within the quote. If True, the boolean indicates that the final token should contain the quote and escape characters. This method exists to be overridden

unicode_repr()
exception nltk.sem.logic.LogicalExpressionException(index, message)[source]

Bases: Exception

class nltk.sem.logic.NegatedExpression(term)[source]

Bases: nltk.sem.logic.Expression

findtype(variable)[source]
negate()[source]
See:Expression.negate()
type
unicode_repr()
visit(function, combinator)[source]
See:Expression.visit()
class nltk.sem.logic.OrExpression(first, second)[source]

Bases: nltk.sem.logic.BooleanExpression

This class represents disjunctions

getOp()[source]
class nltk.sem.logic.QuantifiedExpression(variable, term)[source]

Bases: nltk.sem.logic.VariableBinderExpression

type
unicode_repr()
class nltk.sem.logic.SubstituteBindingsI[source]

Bases: object

An interface for classes that can perform substitutions for variables.

substitute_bindings(bindings)[source]
Returns:The object that is obtained by replacing each variable bound by bindings with its values. Aliases are already resolved. (maybe?)
Return type:(any)
variables()[source]
Returns:A list of all variables in this object.
class nltk.sem.logic.Tokens[source]

Bases: object

ALL = 'all'
ALL_LIST = ['all', 'forall']
AND = '&'
AND_LIST = ['and', '&', '^']
BINOPS = ['and', '&', '^', 'or', '|', 'implies', '->', '=>', 'iff', '<->', '<=>']
CLOSE = ')'
COMMA = ','
DOT = '.'
EQ = '='
EQ_LIST = ['=', '==']
EXISTS = 'exists'
EXISTS_LIST = ['some', 'exists', 'exist']
IFF = '<->'
IFF_LIST = ['iff', '<->', '<=>']
IMP = '->'
IMP_LIST = ['implies', '->', '=>']
LAMBDA = '\\'
LAMBDA_LIST = ['\\']
NEQ = '!='
NEQ_LIST = ['!=']
NOT = '-'
NOT_LIST = ['not', '-', '!']
OPEN = '('
OR = '|'
OR_LIST = ['or', '|']
PUNCT = ['.', '(', ')', ',']
QUANTS = ['some', 'exists', 'exist', 'all', 'forall']
SYMBOLS = ['&', '^', '|', '->', '=>', '<->', '<=>', '=', '==', '!=', '\\', '.', '(', ')', ',', '-', '!']
TOKENS = ['and', '&', '^', 'or', '|', 'implies', '->', '=>', 'iff', '<->', '<=>', '=', '==', '!=', 'some', 'exists', 'exist', 'all', 'forall', '\\', '.', '(', ')', ',', 'not', '-', '!']
class nltk.sem.logic.TruthValueType[source]

Bases: nltk.sem.logic.BasicType

str()[source]
unicode_repr()
class nltk.sem.logic.Type[source]

Bases: object

classmethod fromstring(s)[source]
unicode_repr()
exception nltk.sem.logic.TypeException(msg)[source]

Bases: Exception

exception nltk.sem.logic.TypeResolutionException(expression, other_type)[source]

Bases: nltk.sem.logic.TypeException

exception nltk.sem.logic.UnexpectedTokenException(index, unexpected=None, expected=None, message=None)[source]

Bases: nltk.sem.logic.LogicalExpressionException

class nltk.sem.logic.Variable(name)[source]

Bases: object

substitute_bindings(bindings)[source]
unicode_repr()
class nltk.sem.logic.VariableBinderExpression(variable, term)[source]

Bases: nltk.sem.logic.Expression

This is an abstract class for any Expression that binds a variable in an Expression. This includes LambdaExpressions and QuantifiedExpressions.

alpha_convert(newvar)[source]

Rename all occurrences of the variable introduced by this variable binder in the expression to newvar (a short example follows this class listing).
Parameters:newvar – the new Variable

findtype(variable)[source]

See:Expression.findtype()

free()[source]
See:Expression.free()
replace(variable, expression, replace_bound=False, alpha_convert=True)[source]
See:Expression.replace()
visit(function, combinator)[source]
See:Expression.visit()
visit_structured(function, combinator)[source]
See:Expression.visit_structured()
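
As a small illustration of alpha_convert, here is a minimal doctest-style sketch; the particular formula and the new variable name are assumptions for the example:

>>> from nltk.sem.logic import Expression, Variable
>>> e = Expression.fromstring('exists x.P(x)')
>>> print(e.alpha_convert(Variable('z')))
exists z.P(z)
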
nltk.sem.logic.VariableExpression(variable)[source]

This is a factory method that instantiates and returns a subtype of AbstractVariableExpression appropriate for the given variable.
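
For example, the subtype is chosen by inspecting the variable name (individual, event and function variables follow the conventions of is_indvar, is_eventvar and is_funcvar below); the names used here are illustrative:

>>> from nltk.sem.logic import Variable, VariableExpression
>>> VariableExpression(Variable('x')).__class__.__name__
'IndividualVariableExpression'
>>> VariableExpression(Variable('e1')).__class__.__name__
'EventVariableExpression'
>>> VariableExpression(Variable('F')).__class__.__name__
'FunctionVariableExpression'
>>> VariableExpression(Variable('john')).__class__.__name__
'ConstantExpression'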

nltk.sem.logic.binding_ops()[source]

Binding operators

nltk.sem.logic.boolean_ops()[source]

Boolean operators

nltk.sem.logic.demo()[source]
nltk.sem.logic.demoException(s)[source]
nltk.sem.logic.demo_errors()[source]
nltk.sem.logic.equality_preds()[source]

Equality predicates

nltk.sem.logic.is_eventvar(expr)[source]

An event variable must be a single lowercase ‘e’ character followed by zero or more digits.

Parameters:expr – str
Returns:bool True if expr is of the correct form
nltk.sem.logic.is_funcvar(expr)[source]

A function variable must be a single uppercase character followed by zero or more digits.

Parameters:expr – str
Returns:bool True if expr is of the correct form
nltk.sem.logic.is_indvar(expr)[source]

An individual variable must be a single lowercase character other than ‘e’, followed by zero or more digits.

Parameters:expr – str
Returns:bool True if expr is of the correct form
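
A quick sketch of the three variable-name predicates on some sample strings (the strings are illustrative):

>>> from nltk.sem.logic import is_indvar, is_eventvar, is_funcvar
>>> is_indvar('x01'), is_eventvar('e01'), is_funcvar('F2')
(True, True, True)
>>> is_indvar('e01'), is_indvar('john')
(False, False)
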
nltk.sem.logic.printtype(ex)[source]
nltk.sem.logic.read_logic(s, logic_parser=None, encoding=None)[source]

Convert a file of First Order Formulas into a list of Expression objects.

Parameters:
  • s (str) – the contents of the file
  • logic_parser (LogicParser) – The parser to be used to parse the logical expression
  • encoding (str) – the encoding of the input string, if it is binary
Returns:

a list of parsed formulas.

Return type:

list(Expression)
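
A minimal sketch, assuming the formulas are passed in as a newline-separated string (as they would appear in a file):

>>> from nltk.sem.logic import read_logic
>>> for fol in read_logic('man(socrates)\nall x.(man(x) -> mortal(x))'):
...     print(fol)
man(socrates)
all x.(man(x) -> mortal(x))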

nltk.sem.logic.read_type(type_string)[source]
nltk.sem.logic.skolem_function(univ_scope=None)[source]

Return a skolem function over the variables in univ_scope.

nltk.sem.logic.typecheck(expressions, signature=None)[source]

Ensure correct typing across a collection of Expression objects.
Parameters:
  • expressions – a collection of expressions
  • signature – dict that maps variable names to types (or string representations of types)

nltk.sem.logic.unique_variable(pattern=None, ignore=None)[source]

Return a new, unique variable.

Parameters:
  • pattern – Variable that is being replaced. The new variable must be the same type.
  • ignore – a set of Variable objects that should not be returned from this function.
Return type:

Variable

nltk.sem.relextract module

Code for extracting relational triples from the ieer and conll2002 corpora.

Relations are stored internally as dictionaries (‘reldicts’).

The two serialization outputs are “rtuple” and “clause”.

  • An rtuple is a tuple of the form (subj, filler, obj), where subj and obj are pairs of Named Entity mentions, and filler is the string of words occurring between subj and obj (with no intervening NEs). Strings are printed via repr() to circumvent locale variations in rendering utf-8 encoded strings.
  • A clause is an atom of the form relsym(subjsym, objsym), where the relation, subject and object have been canonicalized to single strings.
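
The sketch below shows the typical pipeline on the ieer corpus (which must first be installed, e.g. via nltk.download('ieer')). The document id, the NE classes and the 'in' pattern are assumptions borrowed from the standard NLTK demo, and the printed rtuples are omitted because they depend on the corpus:

>>> import re
>>> from nltk.corpus import ieer
>>> from nltk.sem import relextract
>>> IN = re.compile(r'.*\bin\b(?!\b.+ing)')          # filler must contain 'in' (but not '-ing' forms)
>>> for doc in ieer.parsed_docs('NYT_19980315'):
...     for rel in relextract.extract_rels('ORG', 'LOC', doc, corpus='ieer', pattern=IN):
...         print(relextract.rtuple(rel))            # e.g. [ORG: ...] '... in ...' [LOC: ...]
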
nltk.sem.relextract.class_abbrev(type)[source]

Abbreviate an NE class name.
Parameters:type – str
Return type:str

nltk.sem.relextract.clause(reldict, relsym)[source]

Print the relation in clausal form.
Parameters:
  • reldict (defaultdict) – a relation dictionary
  • relsym (str) – a label for the relation

nltk.sem.relextract.conllesp()[source]
nltk.sem.relextract.conllned(trace=1)[source]

Find the copula+’van’ relation (‘of’) in the Dutch tagged training corpus from CoNLL 2002.

nltk.sem.relextract.descape_entity(m, defs=<dict of HTML character-entity definitions, mapping entity names such as 'amp', 'lt' and 'eacute' to the corresponding characters>)[source]

Translate one entity to its ISO Latin value. Inspired by example from effbot.org

nltk.sem.relextract.extract_rels(subjclass, objclass, doc, corpus='ace', pattern=None, window=10)[source]

Filter the output of semi_rel2reldict according to specified NE classes and a filler pattern.

The parameters subjclass and objclass can be used to restrict the Named Entities to particular types (any of ‘LOCATION’, ‘ORGANIZATION’, ‘PERSON’, ‘DURATION’, ‘DATE’, ‘CARDINAL’, ‘PERCENT’, ‘MONEY’, ‘MEASURE’).

Parameters:
  • subjclass (str) – the class of the subject Named Entity.
  • objclass (str) – the class of the object Named Entity.
  • doc (ieer document or a list of chunk trees) – input document
  • corpus (str) – name of the corpus to take as input; possible values are ‘ieer’ and ‘conll2002’
  • pattern (SRE_Pattern) – a regular expression for filtering the fillers of retrieved triples.
  • window (int) – filters out fillers which exceed this threshold
Returns:

see semi_rel2reldict

Return type:

list(defaultdict)

nltk.sem.relextract.ieer_headlines()[source]
nltk.sem.relextract.in_demo(trace=0, sql=True)[source]

Select pairs of organizations and locations whose mentions occur with an intervening occurrence of the preposition “in”.

If the sql parameter is set to True, then the entity pairs are loaded into an in-memory database, and subsequently pulled out using an SQL “SELECT” query.

nltk.sem.relextract.list2sym(lst)[source]

Convert a list of strings into a canonical symbol.
Parameters:lst (list) – a list of strings
Returns:a Unicode string without whitespace
Return type:unicode

nltk.sem.relextract.ne_chunked()[source]
nltk.sem.relextract.roles_demo(trace=0)[source]
nltk.sem.relextract.rtuple(reldict, lcon=False, rcon=False)[source]

Pretty print the reldict as an rtuple.
Parameters:reldict (defaultdict) – a relation dictionary

nltk.sem.relextract.semi_rel2reldict(pairs, window=5, trace=False)[source]

Converts the pairs generated by tree2semi_rel into a ‘reldict’: a dictionary which stores information about the subject and object NEs plus the filler between them. Additionally, left and right contexts of length <= window are captured (within a given input sentence).

Parameters:
  • pairs – a list of (list(str), Tree) pairs, as generated by tree2semi_rel
  • window (int) – a threshold for the number of items to include in the left and right context
Returns:

‘relation’ dictionaries whose keys are ‘lcon’, ‘subjclass’, ‘subjtext’, ‘subjsym’, ‘filler’, ‘objclass’, ‘objtext’, ‘objsym’ and ‘rcon’

Return type:

list(defaultdict)

nltk.sem.relextract.tree2semi_rel(tree)[source]

Group a chunk structure into a list of ‘semi-relations’ of the form (list(str), Tree).

In order to facilitate the construction of (Tree, string, Tree) triples, this identifies pairs whose first member is a list (possibly empty) of terminal strings, and whose second member is a Tree of the form (NE_label, terminals).

Parameters:tree – a chunk tree
Returns:a list of pairs (list(str), Tree)
Return type:list of tuple
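
A minimal sketch of feeding a CoNLL-2002 chunk tree through tree2semi_rel and semi_rel2reldict. It assumes the conll2002 corpus is installed; the sentence index is arbitrary, and the result may be empty for sentences with fewer than three Named Entities:

>>> from nltk.corpus import conll2002
>>> from nltk.sem.relextract import tree2semi_rel, semi_rel2reldict
>>> sent = conll2002.chunked_sents('ned.train')[0]
>>> pairs = tree2semi_rel(sent)         # list of (list(str), Tree) pairs, one per NE chunk
>>> reldicts = semi_rel2reldict(pairs)  # reldicts with 'subjtext', 'filler', 'objtext', ... keys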

nltk.sem.skolemize module

nltk.sem.skolemize.skolemize(expression, univ_scope=None, used_variables=None)[source]

Skolemize the expression and convert to conjunctive normal form (CNF)

nltk.sem.skolemize.to_cnf(first, second)[source]

Convert this split disjunction to conjunctive normal form (CNF)
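
A brief sketch of skolemize on an illustrative formula; the exact output is not shown because the fresh Skolem constant's name depends on an internal counter:

>>> from nltk.sem.logic import Expression
>>> from nltk.sem.skolemize import skolemize
>>> e = Expression.fromstring('exists x.(man(x) & all y.(dog(y) -> chases(x,y)))')
>>> cnf = skolemize(e)   # x is replaced by a fresh Skolem constant; quantifiers are dropped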

nltk.sem.util module

Utility functions for batch-processing sentences: parsing and extraction of the semantic representation of the root node of the syntax tree, followed by evaluation of the semantic representation in a first-order model.

nltk.sem.util.demo()[source]
nltk.sem.util.demo_legacy_grammar()[source]

Check that interpret_sents() is compatible with legacy grammars that use a lowercase ‘sem’ feature.

A small test grammar (referred to as ‘test.fcfg’) that uses the lowercase feature is defined inline for this check.

nltk.sem.util.demo_model0()[source]
nltk.sem.util.evaluate_sents(inputs, grammar, model, assignment, trace=0)[source]

Add the truth-in-a-model value to each semantic representation for each syntactic parse of each input sentence.

Parameters:
  • inputs (list(str)) – a list of sentences
  • grammar (nltk.grammar.FeatureGrammar) – FeatureGrammar or name of feature-based grammar
Returns:

a mapping from sentences to lists of triples (parse-tree, semantic-representations, evaluation-in-model)

Return type:

list(list(tuple(nltk.tree.Tree, nltk.sem.logic.ConstantExpression, bool or dict(str): bool)))

nltk.sem.util.interpret_sents(inputs, grammar, semkey='SEM', trace=0)[source]

Add the semantic representation to each syntactic parse tree of each input sentence.

Parameters:
  • inputs (list(str)) – a list of sentences
  • grammar (nltk.grammar.FeatureGrammar) – FeatureGrammar or name of feature-based grammar
Returns:

a mapping from sentences to lists of pairs (parse-tree, semantic-representations)

Return type:

list(list(tuple(nltk.tree.Tree, nltk.sem.logic.ConstantExpression)))
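
A short sketch, assuming the sample grammars shipped with nltk_data are installed; the grammar path and the sentence follow the package demo, and the exact semantic representation depends on that grammar:

>>> from nltk.sem import util
>>> results = util.interpret_sents(['Mary walks'], 'grammars/sample_grammars/sem2.fcfg')
>>> for (synrep, semrep) in results[0]:
...     print(semrep)   # e.g. walk(mary)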

nltk.sem.util.parse_sents(inputs, grammar, trace=0)[source]

Convert input sentences into syntactic trees.

Parameters:
  • inputs (list(str)) – sentences to be parsed
  • grammar (nltk.grammar.FeatureGrammar) – FeatureGrammar or name of feature-based grammar
Return type:

list(nltk.tree.Tree) or dict(list(str)): list(Tree)

Returns:

a mapping from input sentences to a list of Tree objects

nltk.sem.util.read_sents(filename, encoding='utf8')[source]
nltk.sem.util.root_semrep(syntree, semkey='SEM')[source]

Find the semantic representation at the root of a tree.

Parameters:
  • syntree – a parse Tree
  • semkey – the feature label to use for the root semantics in the tree
Returns:

the semantic representation at the root of a Tree

Return type:

sem.Expression

Module contents

NLTK Semantic Interpretation Package

This package contains classes for representing semantic structure in formulas of first-order logic and for evaluating such formulas in set-theoretic models.

>>> from nltk.sem import logic
>>> logic._counter._value = 0

The package has two main components:

  • logic provides support for analyzing expressions of First Order Logic (FOL).
  • evaluate allows users to recursively determine truth in a model for formulas of FOL.

A model consists of a domain of discourse and a valuation function, which assigns values to non-logical constants. We assume that entities in the domain are represented as strings such as 'b1', 'g1', etc. A Valuation is initialized with a list of (symbol, value) pairs, where values are entities, sets of entities or sets of tuples of entities. The domain of discourse can be inferred from the valuation, and a model is then created with the domain and valuation as parameters.

>>> from nltk.sem import Valuation, Model
>>> v = [('adam', 'b1'), ('betty', 'g1'), ('fido', 'd1'),
... ('girl', set(['g1', 'g2'])), ('boy', set(['b1', 'b2'])),
... ('dog', set(['d1'])),
... ('love', set([('b1', 'g1'), ('b2', 'g2'), ('g1', 'b1'), ('g2', 'b1')]))]
>>> val = Valuation(v)
>>> dom = val.domain
>>> m = Model(dom, val)