Module featstruct

Basic data classes for representing feature structures, and for performing basic operations on those feature structures. A feature structure is a mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure. There are two types of feature structure, implemented by two subclasses of FeatStruct: feature dictionaries, implemented by FeatDict, which act like Python dictionaries; and feature lists, implemented by FeatList, which are lists of feature values.

Feature structures are typically used to represent partial information about objects. A feature identifier that is not mapped to a value stands for a feature whose value is unknown (not a feature without a value). Two feature structures that represent (potentially overlapping) information about the same object can be combined by unification. When two inconsistent feature structures are unified, the unification fails and returns None.

Features can be specified using feature paths: tuples of feature identifiers that specify a path through the nested feature structures to a value. Feature structures may contain reentrant feature values. A reentrant feature value is a single feature value that can be accessed via multiple feature paths. Unification preserves the reentrance relations imposed by both of the unified feature structures. In the feature structure resulting from unification, any modifications to a reentrant feature value will be visible using any of its feature paths.
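
As a rough illustration, plain nested dicts can stand in for feature structures. The helper follow_path below is hypothetical and is not part of nltk.featstruct; it only shows how a feature path is followed one identifier at a time, and how a reentrant value is a single object reachable by two paths.

```python
def follow_path(fstruct, path):
    """Return the value reached by following a tuple of feature identifiers."""
    value = fstruct
    for feature in path:
        value = value[feature]
    return value

# One nested structure shared by two features makes the value reentrant.
agreement = {"number": "singular"}
sentence = {"subject": {"agr": agreement}, "verb": {"agr": agreement}}

# Both paths lead to the very same object ...
assert follow_path(sentence, ("subject", "agr")) is follow_path(sentence, ("verb", "agr"))

# ... so a change made through one path is visible through the other.
follow_path(sentence, ("subject", "agr"))["person"] = 3
print(follow_path(sentence, ("verb", "agr", "person")))  # prints 3
```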

Feature structure variables are encoded using the nltk.sem.Variable class. The variables' values are tracked using a bindings dictionary, which maps variables to their values. When two feature structures are unified, a fresh bindings dictionary is created to track their values; and before unification completes, all bound variables are replaced by their values. Thus, the bindings dictionaries are usually strictly internal to the unification process. However, it is possible to track the bindings of variables if you choose to, by supplying your own initial bindings dictionary to the unify() function.

When unbound variables are unified with one another, they become aliased. This is encoded by binding one variable to the other.
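
The aliasing scheme can be pictured with a small stand-in. The Var class and the resolve helper here are hypothetical and are not NLTK's own classes; the sketch assumes only what the text states, namely that an alias is stored as a binding from one variable to another. Looking up a variable then means following the bindings dictionary until a non-variable value or an unbound variable is reached.

```python
class Var:
    """Hypothetical stand-in for a feature structure variable."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

def resolve(value, bindings):
    """Follow bindings until reaching a non-variable or an unbound variable."""
    while isinstance(value, Var) and value in bindings:
        value = bindings[value]
    return value

x, y, z = Var("?x"), Var("?y"), Var("?z")
bindings = {y: x}            # ?y is aliased to ?x; both are still unbound
print(resolve(y, bindings))  # prints ?x

bindings[x] = "singular"     # binding ?x also gives ?y a value
print(resolve(y, bindings))  # prints singular
print(resolve(z, bindings))  # prints ?z (unbound, so it stays a variable)
```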

Lightweight Feature Structures

Many of the functions defined by nltk.featstruct can be applied directly to simple Python dictionaries and lists, rather than to full-fledged FeatDict and FeatList objects. In other words, Python dicts and lists can be used as "light-weight" feature structures.

>>> from nltk.featstruct import unify
>>> unify(dict(x=1, y=dict()), dict(a='a', y=dict(b='b')))
{'y': {'b': 'b'}, 'x': 1, 'a': 'a'}

However, you should keep in mind the following caveats:

In general, if your feature structures will contain any reentrances, or if you plan to use them as dictionary keys, it is strongly recommended that you use full-fledged FeatStruct objects.

Classes
  FeatStruct
A mapping from feature identifiers to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure.
  FeatDict
A feature structure that acts like a Python dictionary.
  FeatList
A list of feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure.
  _UnificationFailure
  _UnificationFailureError
An exception that is used by _destructively_unify to abort unification when a failure is encountered.
  SubstituteBindingsSequence
A mixin class for sequence classes that distributes variables() and substitute_bindings() over the object's elements.
  FeatureValueTuple
A base feature value that is a tuple of other base feature values.
  FeatureValueSet
A base feature value that is a set of other base feature values.
  FeatureValueUnion
A base feature value that represents the union of two or more FeatureValueSets or Variables.
  FeatureValueConcat
A base feature value that represents the concatenation of two or more FeatureValueTuples or Variables.
  Feature
A feature identifier that's specialized to put additional constraints, default values, etc.
  SlashFeature
  RangeFeature
  CustomFeatureValue
An abstract base class for base values that define a custom unification method.
  FeatStructParser
Functions

_check_frozen(method, indent='')
Given a method function, return a new method function that first checks if self._frozen is true; and if so, raises ValueError with an appropriate message.

substitute_bindings(fstruct, bindings, fs_class='default')
Returns: The feature structure that is obtained by replacing each variable bound by bindings with its binding.

_substitute_bindings(fstruct, bindings, fs_class, visited)

retract_bindings(fstruct, bindings, fs_class='default')
Returns: The feature structure that is obtained by replacing each feature structure value that is bound by bindings with the variable that binds it.

_retract_bindings(fstruct, inv_bindings, fs_class, visited)

find_variables(fstruct, fs_class='default')
Returns (set of Variable): The set of variables used by this feature structure.

_variables(fstruct, vars, fs_class, visited)

rename_variables(fstruct, vars=None, used_vars=(), new_vars=None, fs_class='default')
Returns: The feature structure that is obtained by replacing any of this feature structure's variables that are in vars with new variables.

_rename_variables(fstruct, vars, used_vars, new_vars, fs_class, visited)

_rename_variable(var, used_vars)

remove_variables(fstruct, fs_class='default')
Returns (FeatStruct): The feature structure that is obtained by deleting all features whose values are Variables.

_remove_variables(fstruct, fs_class, visited)

unify(fstruct1, fstruct2, bindings=None, trace=False, fail=None, rename_vars=True, fs_class='default')
Unify fstruct1 with fstruct2, and return the resulting feature structure.
 
_destructively_unify(fstruct1, fstruct2, bindings, forward, trace, fail, fs_class, path)
Attempt to unify fstruct1 and fstruct2 by modifying them in-place.

_unify_feature_values(fname, fval1, fval2, bindings, forward, trace, fail, fs_class, fpath)
Attempt to unify fval1 and fval2, and return the resulting unified value.

_apply_forwards_to_bindings(forward, bindings)
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).

_apply_forwards(fstruct, forward, fs_class, visited)
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).

_resolve_aliases(bindings)
Replace any bound aliased vars with their binding; and replace any unbound aliased vars with their representative var.

_trace_unify_start(path, fval1, fval2)

_trace_unify_identity(path, fval1)

_trace_unify_fail(path, result)

_trace_unify_succeed(path, fval1)

_trace_bindings(path, bindings)

_trace_valrepr(val)
 
subsumes(fstruct1, fstruct2)
Returns: True if fstruct1 subsumes fstruct2.

conflicts(fstruct1, fstruct2, trace=0)
Returns (list of tuple): A list of the feature paths of all features which are assigned incompatible values by fstruct1 and fstruct2.

_is_mapping(v)

_is_sequence(v)

_default_fs_class(obj)

_flatten(lst, cls)
Helper function -- return a copy of list, with all elements of type cls spliced in rather than appended in.

    Demo

display_unification(fs1, fs2, indent=' ')

interactivedemo(trace=False)

demo(trace=False)
Just for testing
Variables
  _FROZEN_ERROR = 'Frozen FeatStructs may not be modified.'
  _FROZEN_NOTICE = '\n%sIf self is frozen, raise ValueError.'
  UnificationFailure = nltk.featstruct.UnificationFailure
A unique value used to indicate unification failure.
  SLASH = *slash*
  TYPE = *type*
Function Details

_check_frozen(method, indent='')

Given a method function, return a new method function that first checks whether self._frozen is true and, if so, raises ValueError with an appropriate message. Otherwise, the new function calls the method and returns its result.
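
The wrapping behavior described here can be sketched as an ordinary decorator. This is a minimal illustration and not NLTK's actual code: the names check_frozen, FROZEN_ERROR and TinyStruct are hypothetical, and the indent parameter (which appears to relate only to docstring formatting) is left out.

```python
import functools

FROZEN_ERROR = "Frozen FeatStructs may not be modified."

def check_frozen(method):
    """Wrap method so that it refuses to run on a frozen object."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self._frozen:
            raise ValueError(FROZEN_ERROR)
        return method(self, *args, **kwargs)
    return wrapper

class TinyStruct(dict):
    """Hypothetical dict-based structure with a freeze() switch."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._frozen = False

    def freeze(self):
        self._frozen = True

    @check_frozen
    def __setitem__(self, key, value):
        super().__setitem__(key, value)

fs = TinyStruct(a=1)
fs["b"] = 2          # allowed while unfrozen
fs.freeze()
try:
    fs["c"] = 3      # rejected once frozen
except ValueError as err:
    print(err)       # prints the frozen-error message
```

Only __setitem__ is guarded in this sketch; a fuller version would wrap every mutating method in the same way.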

substitute_bindings(fstruct, bindings, fs_class='default')

Parameters:
  • bindings (dict with Variable keys) - A dictionary mapping from variables to values.
Returns:
The feature structure that is obtained by replacing each variable bound by bindings with its binding. If a variable is aliased to a bound variable, then it will be replaced by that variable's value. If a variable is aliased to an unbound variable, then it will be replaced by that variable.
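
The three cases in this description (bound, aliased to a bound variable, aliased to an unbound variable) can be traced on plain nested dicts. The Var class and substitute function are hypothetical stand-ins for illustration, and the sketch ignores reentrance and cyclic structures.

```python
class Var:
    """Hypothetical stand-in for a feature structure variable."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

def substitute(fstruct, bindings):
    """Return a copy of a nested dict with bound variables replaced."""
    result = {}
    for feature, value in fstruct.items():
        # Follow alias chains until a value or an unbound variable is reached.
        while isinstance(value, Var) and value in bindings:
            value = bindings[value]
        if isinstance(value, dict):
            value = substitute(value, bindings)
        result[feature] = value
    return result

x, y, w, z = Var("?x"), Var("?y"), Var("?w"), Var("?z")
fstruct = {"a": x, "b": {"c": y, "d": w}}
# ?x is bound to 1, ?y is aliased to ?x, and ?w is aliased to the unbound ?z.
bindings = {x: 1, y: x, w: z}

print(substitute(fstruct, bindings))
# prints {'a': 1, 'b': {'c': 1, 'd': ?z}}
```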

retract_bindings(fstruct, bindings, fs_class='default')

Returns:
The feature structure that is obtained by replacing each feature structure value that is bound by bindings with the variable that binds it. A feature structure value must be identical to a bound value (i.e., have equal id) to be replaced.

bindings is modified to point to this new feature structure, rather than the original feature structure. Feature structure values in bindings may be modified if they are contained in fstruct.

find_variables(fstruct, fs_class='default')

Returns: set of Variable
The set of variables used by this feature structure.

rename_variables(fstruct, vars=None, used_vars=(), new_vars=None, fs_class='default')

Parameters:
  • vars (set) - The set of variables that should be renamed. If not specified, find_variables(fstruct) is used; i.e., all variables will be given new names.
  • used_vars (set) - A set of variables whose names should not be used by the new variables.
  • new_vars (dict from Variable to Variable) - A dictionary that is used to hold the mapping from old variables to new variables. For each variable v in this feature structure:
    • If new_vars maps v to v', then v will be replaced by v'.
    • If new_vars does not contain v, but vars does contain v, then a new entry will be added to new_vars, mapping v to the new variable that is used to replace it.

    To consistently rename the variables in a set of feature structures, simply apply rename_variables to each one, using the same dictionary:

    >>> new_vars = {}  # Maps old vars to alpha-renamed vars
    >>> new_fstruct1 = fstruct1.rename_variables(new_vars=new_vars)
    >>> new_fstruct2 = fstruct2.rename_variables(new_vars=new_vars)
    >>> new_fstruct3 = fstruct3.rename_variables(new_vars=new_vars)

    If new_vars is not specified, then an empty dictionary is used.

Returns:
The feature structure that is obtained by replacing any of this feature structure's variables that are in vars with new variables. The names for these new variables will be names that are not used by any variable in vars, or in used_vars, or in this feature structure.
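
The bookkeeping around used_vars and new_vars can be sketched with plain strings in place of Variable objects. The helper names fresh_name and rename_all are hypothetical, and the numbering scheme (strip trailing digits, then count up from 2) is an assumption suggested by the ?x2 name shown in the unify documentation, not a statement of how NLTK picks names.

```python
import re

def fresh_name(name, used_names):
    """Pick a name based on `name` that does not appear in used_names."""
    stem = re.sub(r"\d+$", "", name)   # drop any trailing digits
    n = 2
    while f"{stem}{n}" in used_names:
        n += 1
    return f"{stem}{n}"

def rename_all(names, used_names, new_names=None):
    """Map each name to a fresh one, reusing entries already in new_names."""
    if new_names is None:
        new_names = {}
    # Names that must be avoided: caller-supplied, current, and already issued.
    used = set(used_names) | set(names) | set(new_names.values())
    for name in names:
        if name not in new_names:
            new_names[name] = fresh_name(name, used)
            used.add(new_names[name])
    return new_names

shared = {}
rename_all(["?x", "?y"], used_names={"?x2"}, new_names=shared)
rename_all(["?x", "?z"], used_names=set(), new_names=shared)
print(shared)  # prints {'?x': '?x3', '?y': '?y2', '?z': '?z2'}
```

Because the same dictionary is passed to both calls, ?x receives the same replacement each time, which is the consistency property described above.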

remove_variables(fstruct, fs_class='default')

Returns: FeatStruct
The feature structure that is obtained by deleting all features whose values are Variables.

unify(fstruct1, fstruct2, bindings=None, trace=False, fail=None, rename_vars=True, fs_class='default')

Unify fstruct1 with fstruct2, and return the resulting feature structure. This unified feature structure is the minimal feature structure that:

  • contains all feature value assignments from both fstruct1 and fstruct2.
  • preserves all reentrance properties of fstruct1 and fstruct2.

If no such feature structure exists (because fstruct1 and fstruct2 specify incompatible values for some feature), then unification fails, and unify returns None.
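
For intuition, the definition can be tried out on plain nested dicts. The function toy_unify is a hypothetical, simplified sketch rather than NLTK's algorithm: it ignores variables and reentrance, and it copies its inputs instead of unifying destructively.

```python
import copy

def toy_unify(fs1, fs2):
    """Non-destructive unification of nested dicts; returns None on failure."""
    result = copy.deepcopy(fs1)
    for feature, value2 in fs2.items():
        if feature not in result:
            # Only fs2 specifies this feature, so its value carries over.
            result[feature] = copy.deepcopy(value2)
            continue
        value1 = result[feature]
        if isinstance(value1, dict) and isinstance(value2, dict):
            merged = toy_unify(value1, value2)
            if merged is None:
                return None
            result[feature] = merged
        elif value1 != value2:
            return None   # incompatible values: unification fails
    return result

print(toy_unify({"agr": {"number": "sg"}}, {"agr": {"person": 3}, "cat": "NP"}))
# prints {'agr': {'number': 'sg', 'person': 3}, 'cat': 'NP'}
print(toy_unify({"agr": {"number": "sg"}}, {"agr": {"number": "pl"}}))
# prints None
```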

Parameters:
  • bindings (dict with Variable keys) - A set of variable bindings to be used and updated during unification.

    Bound variables are replaced by their values. Aliased variables are replaced by their representative variable (if unbound) or the value of their representative variable (if bound). I.e., if variable v is in bindings, then v is replaced by bindings[v]. This will be repeated until the variable is replaced by an unbound variable or a non-variable value.

    Unbound variables are bound when they are unified with values; and aliased when they are unified with variables. I.e., if variable v is not in bindings, and is unified with a variable or value x, then bindings[v] is set to x.

    If bindings is unspecified, then all variables are assumed to be unbound. I.e., bindings defaults to an empty dict.

  • trace (bool) - If true, generate trace output.
  • rename_vars (bool) - If true, then rename any variables in fstruct2 that are also used in fstruct1. This prevents aliasing in cases where fstruct1 and fstruct2 use the same variable name. E.g.:
    >>> FeatStruct('[a=?x]').unify(FeatStruct('[b=?x]'))
    [a=?x, b=?x2]

    If you intend for variables in fstruct1 and fstruct2 that share a name to be treated as a single variable, use rename_vars=False.

_destructively_unify(fstruct1, fstruct2, bindings, forward, trace, fail, fs_class, path)

Attempt to unify fstruct1 and fstruct2 by modifying them in-place. If the unification succeeds, then fstruct1 will contain the unified value, the value of fstruct2 is undefined, and forward[id(fstruct2)] is set to fstruct1. If the unification fails, then a _UnificationFailureError is raised, and the values of fstruct1 and fstruct2 are undefined.

Parameters:
  • bindings - A dictionary mapping variables to values.
  • forward - A dictionary mapping feature structure ids to replacement structures. When two feature structures are merged, a mapping from one to the other will be added to the forward dictionary; and changes will be made only to the target of the forward dictionary. _destructively_unify will always 'follow' any links in the forward dictionary for fstruct1 and fstruct2 before actually unifying them.
  • trace - If true, generate trace output.
  • path - The feature path that led us to this unification step. Used for trace output.
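
The role of the forward dictionary can be illustrated in isolation. In this hypothetical sketch (follow_forward and merge_into are not NLTK functions), flat dicts are merged in place without any conflict checking, and each absorbed structure is recorded under its id so that later references to it are redirected to the surviving structure.

```python
def follow_forward(fstruct, forward):
    """Follow forward pointers (keyed by id) to the surviving structure."""
    while id(fstruct) in forward:
        fstruct = forward[id(fstruct)]
    return fstruct

def merge_into(target, source, forward):
    """Destructively merge flat dict source into target and record the link."""
    target = follow_forward(target, forward)
    source = follow_forward(source, forward)
    if target is not source:
        target.update(source)
        forward[id(source)] = target
    return target

a, b, c = {"x": 1}, {"y": 2}, {"z": 3}
forward = {}
merge_into(a, b, forward)   # b now forwards to a
merge_into(b, c, forward)   # following b's link, c is merged into a
print(a)                                # prints {'x': 1, 'y': 2, 'z': 3}
print(follow_forward(c, forward) is a)  # prints True
```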

_unify_feature_values(fname, fval1, fval2, bindings, forward, trace, fail, fs_class, fpath)

Attempt to unify fval1 and fval2, and return the resulting unified value. The method of unification will depend on the types of fval1 and fval2:

  1. If they're both feature structures, then destructively unify them (see _destructively_unify()).
  2. If they're both unbound variables, then alias one variable to the other (by setting bindings[v2]=v1).
  3. If one is an unbound variable, and the other is a value, then bind the unbound variable to the value.
  4. If one is a feature structure, and the other is a base value, then fail.
  5. If they're both base values, then unify them. By default, this will succeed if they are equal, and fail otherwise.
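
The numbered cases above can be followed in a compact sketch. Everything here is a hypothetical stand-in (Var, FAIL and unify_values are not NLTK names): dicts play the part of feature structures, the forward, trace and fail parameters are omitted, and failure is reported through a marker object instead of an exception.

```python
class Var:
    """Hypothetical stand-in for a feature structure variable."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

FAIL = object()  # hypothetical failure marker used only by this sketch

def unify_values(v1, v2, bindings):
    # Replace bound or aliased variables by what they currently stand for.
    while isinstance(v1, Var) and v1 in bindings:
        v1 = bindings[v1]
    while isinstance(v2, Var) and v2 in bindings:
        v2 = bindings[v2]
    if isinstance(v1, dict) and isinstance(v2, dict):      # case 1
        for key, val in v2.items():
            v1[key] = unify_values(v1[key], val, bindings) if key in v1 else val
            if v1[key] is FAIL:
                return FAIL
        return v1
    if isinstance(v1, Var) and isinstance(v2, Var):        # case 2
        if v1 is not v2:
            bindings[v2] = v1                              # alias v2 to v1
        return v1
    if isinstance(v1, Var):                                # case 3
        bindings[v1] = v2
        return v1
    if isinstance(v2, Var):                                # case 3, mirrored
        bindings[v2] = v1
        return v2
    if isinstance(v1, dict) or isinstance(v2, dict):       # case 4
        return FAIL
    return v1 if v1 == v2 else FAIL                        # case 5

x, y = Var("?x"), Var("?y")
bindings = {}
print(unify_values(x, y, bindings), bindings)   # prints ?x {?y: ?x}
print(unify_values(y, "sg", bindings) is x)     # prints True; ?x is now bound to 'sg'
print(unify_values(y, "pl", bindings) is FAIL)  # prints True; 'sg' and 'pl' clash
```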

subsumes(fstruct1, fstruct2)

Returns:
True if fstruct1 subsumes fstruct2. I.e., return true if unifying fstruct1 with fstruct2 would result in a feature structure equal to fstruct2.
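
For nested dicts without variables or reentrance, this condition amounts to checking that every feature value in the first structure is also present in the second. The function toy_subsumes below is a hypothetical sketch under that assumption and is not NLTK's implementation.

```python
def toy_subsumes(general, specific):
    """True if every feature value in `general` also holds in `specific`."""
    for feature, value in general.items():
        if feature not in specific:
            return False
        other = specific[feature]
        if isinstance(value, dict) and isinstance(other, dict):
            if not toy_subsumes(value, other):
                return False
        elif value != other:
            return False
    return True

print(toy_subsumes({"cat": "NP"}, {"cat": "NP", "agr": {"number": "sg"}}))  # prints True
print(toy_subsumes({"cat": "NP", "agr": {"number": "sg"}}, {"cat": "NP"}))  # prints False
print(toy_subsumes({}, {"cat": "NP"}))                                      # prints True
```

The empty structure carries no information, so it subsumes every other structure.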

conflicts(fstruct1, fstruct2, trace=0)

Returns: list of tuple
A list of the feature paths of all features which are assigned incompatible values by fstruct1 and fstruct2.
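
The shape of the result, a list of feature paths given as tuples, can be seen in a small sketch over nested dicts. The function toy_conflicts is hypothetical; it ignores variables and reentrance and simply compares values feature by feature.

```python
def toy_conflicts(fs1, fs2, path=()):
    """Return feature paths (tuples) whose values clash in two nested dicts."""
    found = []
    for feature in fs1:
        if feature not in fs2:
            continue   # a feature missing on one side is unknown, not a clash
        v1, v2 = fs1[feature], fs2[feature]
        here = path + (feature,)
        if isinstance(v1, dict) and isinstance(v2, dict):
            found.extend(toy_conflicts(v1, v2, here))
        elif v1 != v2:
            found.append(here)
    return found

a = {"cat": "NP", "agr": {"number": "sg", "person": 3}}
b = {"cat": "VP", "agr": {"number": "pl", "person": 3}}
print(toy_conflicts(a, b))  # prints [('cat',), ('agr', 'number')]
```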

Variables Details

UnificationFailure

A unique value used to indicate unification failure. It can be returned by Feature.unify_base_values() or by custom fail() functions to indicate that unification should fail.

Value:
nltk.featstruct.UnificationFailure