Package nltk :: Package classify :: Module util

Module util

Utility functions and classes for classifiers.

Classes
    Helper Functions
  CutoffChecker
A helper class that implements cutoff checks based on number of iterations and log likelihood.
Functions
    Helper Functions
 
apply_features(feature_func, toks, labeled=None)
Use the LazyMappedList class to construct a lazy list-like object that is analogous to map(feature_func, toks).
list of (immutable)
attested_labels(tokens)
Returns: A list of all labels that are attested in the given list of tokens.
 
log_likelihood(classifier, gold)
 
accuracy(classifier, gold)
    Demos
 
names_demo_features(name)
 
binary_names_demo_features(name)
 
names_demo(trainer, features=names_demo_features)
 
wsd_demo(trainer, word, features, n=1000)
Variables
    Demos
  _inst_cache = {}
Function Details

apply_features(feature_func, toks, labeled=None)

Use the LazyMappedList class to construct a lazy list-like object that is analogous to map(feature_func, toks). In particular, if labeled=False, then the returned list-like object's values are equal to:

   [feature_func(tok) for tok in toks]

If labeled=True, then the returned list-like object's values are equal to:

   [(feature_func(tok), label) for (tok, label) in toks]

The primary purpose of this function is to avoid the memory overhead involved in storing all the featuresets for every token in a corpus. Instead, these featuresets are constructed lazily, as needed. The reduction in memory overhead can be especially significant when the underlying list of tokens is itself lazy (as is the case with many corpus readers).

Parameters:
  • feature_func - The function that will be applied to each token. It should return a featureset -- i.e., a dict mapping feature names to feature values.
  • toks - The list of tokens to which feature_func should be applied. If labeled=False, then the list elements will be passed directly to feature_func(). If labeled=True, then the list elements should be tuples (tok, label), and tok will be passed to feature_func().
  • labeled - If true, then toks contains labeled tokens -- i.e., tuples of the form (tok, label). (Default: auto-detect based on types.)
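The lazy behaviour described above can be illustrated with a small, self-contained sketch. This is not NLTK's implementation: LazyFeatureList and last_letter are hypothetical names introduced only for illustration, and the auto-detection rule (treat the tokens as labeled when the first element is a tuple or list) is an assumption about what "auto-detect based on types" means.

```python
from collections.abc import Sequence


class LazyFeatureList(Sequence):
    """Hypothetical stand-in for a lazily mapped list (not NLTK's class)."""

    def __init__(self, feature_func, toks, labeled=None):
        if labeled is None:
            # Assumed auto-detection rule: the tokens count as labeled
            # when the first element is a tuple or list.
            labeled = bool(toks) and isinstance(toks[0], (tuple, list))
        self._func = feature_func
        self._toks = toks
        self._labeled = labeled

    def __len__(self):
        return len(self._toks)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        item = self._toks[index]
        if self._labeled:
            # Labeled token: apply feature_func to tok, keep the label.
            tok, label = item
            return (self._func(tok), label)
        # Unlabeled token: pass the element directly to feature_func.
        return self._func(item)


def last_letter(name):
    # Example feature function: returns a featureset (a dict).
    return {"last_letter": name[-1]}


labeled_names = [("Alice", "female"), ("Bob", "male")]
lazy = LazyFeatureList(last_letter, labeled_names)
print(lazy[0])  # ({'last_letter': 'e'}, 'female')
print(list(LazyFeatureList(last_letter, ["Alice", "Bob"])))
```

No featureset is built until an element is requested, so only the underlying token list is held in memory. The trade-off in this sketch is that a featureset is recomputed on every access rather than cached.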

attested_labels(tokens)

Parameters:
  • tokens (list) - The list of classified tokens from which to extract labels. A classified token has the form (token, label).
Returns: list of (immutable)
A list of all labels that are attested in the given list of tokens.
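A minimal sketch of the same idea, assuming only that labels are hashable (which follows from their being immutable). The name attested_labels_sketch is hypothetical, and the first-occurrence ordering is a choice made here; the documentation above does not specify the order of the returned labels.

```python
def attested_labels_sketch(tokens):
    """Return each distinct label found in a list of (token, label) pairs."""
    # dict.fromkeys keeps the first occurrence of each label, in order.
    return list(dict.fromkeys(label for (_tok, label) in tokens))


tokens = [("Alice", "female"), ("Bob", "male"), ("Carol", "female")]
print(attested_labels_sketch(tokens))  # ['female', 'male']
```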