Hidden Markov model class, a generative model for labelling sequence
data. These models define the joint probability of a sequence of symbols
and their labels (state transitions) as the product of the starting state
probability, the probability of each state transition, and the
probability of each observation being generated from each state. This is
described in more detail in the module documentation.

This implementation is based on the HMM description in Chapter 8 of
Huang, Acero and Hon, Spoken Language Processing, and includes an
extension for training shallow HMM parsers or specialized HMMs as in
Molina et al., 2002. A specialized HMM modifies training data by
applying a specialization function to create a new training set that is
more appropriate for sequential tagging with an HMM. A typical use case
is chunking.
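The joint probability these models define can be made concrete with a small
worked example. The weather states, observation symbols, and all probability
values below are invented purely for illustration, not taken from this class:

```python
# Joint probability of a labelled sequence under an HMM:
# P(symbols, states) = P(s_0) * prod P(s_t | s_{t-1}) * prod P(o_t | s_t).
# The toy weather model here is invented for illustration only.

states = ["rainy", "sunny"]
symbols = ["walk", "shop", "clean"]

priors = {"rainy": 0.6, "sunny": 0.4}
transitions = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
outputs = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def joint_probability(sequence):
    """Joint probability of a list of (symbol, state) pairs."""
    syms, labs = zip(*sequence)
    # Starting state probability times the first emission...
    p = priors[labs[0]] * outputs[labs[0]][syms[0]]
    # ...then one transition and one emission per subsequent position.
    for t in range(1, len(sequence)):
        p *= transitions[labs[t - 1]][labs[t]]
        p *= outputs[labs[t]][syms[t]]
    return p

print(joint_probability([("walk", "sunny"), ("clean", "rainy")]))
# 0.4 * 0.6 * 0.4 * 0.5 = 0.048
```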
__init__(self, symbols, states, transitions, outputs, priors, **kwargs)
    Creates a hidden Markov model parameterised by the states,
    transition probabilities, output probabilities and priors.
tag(self, unlabeled_sequence) -> list
    Tags the sequence with the highest probability state sequence.
best_path(self, unlabeled_sequence) -> sequence of any
    Returns the state sequence of the optimal (most probable) path
    through the HMM.
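The most probable path is conventionally found with the Viterbi algorithm.
The sketch below decodes a toy weather model; all names and probability
values are invented for the illustration, not drawn from this class:

```python
# Viterbi decoding sketch: track, for each state, the probability of the
# best path ending there, plus backpointers to recover the path itself.
# The toy model below is invented for illustration only.

states = ["rainy", "sunny"]
priors = {"rainy": 0.6, "sunny": 0.4}
transitions = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
outputs = {
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def best_path(observations):
    """Most probable state sequence for the observations (Viterbi)."""
    # delta[s]: probability of the best path so far ending in state s.
    delta = {s: priors[s] * outputs[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_delta, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: delta[r] * transitions[r][s])
            new_delta[s] = delta[prev] * transitions[prev][s] * outputs[s][obs]
            pointers[s] = prev
        delta = new_delta
        backpointers.append(pointers)
    # Trace the chosen predecessors back from the best final state.
    path = [max(states, key=lambda s: delta[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]

print(best_path(["walk", "shop", "clean"]))  # ['sunny', 'rainy', 'rainy']
```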
_best_path_simple(self, unlabeled_sequence) -> sequence of any
_sample_probdist(self, probdist, p, samples)
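A helper of this shape conventionally implements inverse-CDF sampling: given
a uniform draw p, walk the candidate samples accumulating probability mass
until the interval containing p is found. The dict-based distribution below
is an invented stand-in for a real probability distribution object:

```python
# Inverse-CDF sampling sketch over a discrete distribution.
# The dict maps samples to probabilities and is invented for illustration.

import random

def sample_discrete(dist, p):
    """dist maps samples to probabilities; p is a uniform draw in [0, 1)."""
    cumulative = 0.0
    for sample, prob in dist.items():
        cumulative += prob
        # p falls inside this sample's slice of the unit interval.
        if p < cumulative:
            return sample
    raise ValueError("probabilities sum to less than 1")

dist = {"walk": 0.1, "shop": 0.4, "clean": 0.5}
print(sample_discrete(dist, 0.05))  # 'walk':  0.05 < 0.1
print(sample_discrete(dist, 0.3))   # 'shop':  0.1 <= 0.3 < 0.5
print(sample_discrete(dist, random.random()))
```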
entropy(self, unlabeled_sequence)
    Returns the entropy over labellings of the given sequence.
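For a short sequence this entropy can be computed by brute force: enumerate
every state sequence, form the posterior P(labelling | symbols), and sum
-P log2 P. The sketch below does this on an invented toy model; it is
feasible only for tiny examples, since there are N^T labellings:

```python
# Brute-force entropy over labellings for an invented toy HMM.

import itertools
import math

states = ["rainy", "sunny"]
priors = {"rainy": 0.6, "sunny": 0.4}
transitions = {"rainy": {"rainy": 0.7, "sunny": 0.3},
               "sunny": {"rainy": 0.4, "sunny": 0.6}}
outputs = {"rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
           "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def joint(observations, labelling):
    """P(observations, labelling) under the toy model."""
    p = priors[labelling[0]] * outputs[labelling[0]][observations[0]]
    for t in range(1, len(observations)):
        p *= transitions[labelling[t - 1]][labelling[t]]
        p *= outputs[labelling[t]][observations[t]]
    return p

def entropy(observations):
    """Entropy (in bits) of P(labelling | observations)."""
    joints = [joint(observations, labelling)
              for labelling in itertools.product(states, repeat=len(observations))]
    total = sum(joints)  # P(observations), marginalised over labellings
    return -sum((p / total) * math.log2(p / total) for p in joints if p > 0)

print(entropy(["walk", "clean"]))
```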
point_entropy(self, unlabeled_sequence)
    Returns the pointwise entropy over the possible states at each
    position in the chain, given the observation sequence.
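A brute-force illustration of the same idea on an invented toy model:
marginalise the joint probability over all labellings to get the posterior
state distribution at each position, then take the entropy of each:

```python
# Pointwise entropy sketch: per-position posterior state distributions,
# computed by exhaustive enumeration over an invented toy model.

import itertools
import math

states = ["rainy", "sunny"]
priors = {"rainy": 0.6, "sunny": 0.4}
transitions = {"rainy": {"rainy": 0.7, "sunny": 0.3},
               "sunny": {"rainy": 0.4, "sunny": 0.6}}
outputs = {"rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
           "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def joint(observations, labelling):
    """P(observations, labelling) under the toy model."""
    p = priors[labelling[0]] * outputs[labelling[0]][observations[0]]
    for t in range(1, len(observations)):
        p *= transitions[labelling[t - 1]][labelling[t]]
        p *= outputs[labelling[t]][observations[t]]
    return p

def point_entropy(observations):
    """Entropy (in bits) of P(state_t | observations) for each position t."""
    labellings = list(itertools.product(states, repeat=len(observations)))
    total = sum(joint(observations, l) for l in labellings)
    entropies = []
    for t in range(len(observations)):
        # Posterior over state_t, marginalising the other positions.
        posterior = {s: sum(joint(observations, l)
                            for l in labellings if l[t] == s) / total
                     for s in states}
        entropies.append(-sum(p * math.log2(p)
                              for p in posterior.values() if p > 0))
    return entropies

print(point_entropy(["walk", "clean"]))
```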
_exhaustive_entropy(self, unlabeled_sequence)
_exhaustive_point_entropy(self, unlabeled_sequence)
_forward_probability(self, unlabeled_sequence) -> array
    Return the forward probability matrix, a T by N array of
    log-probabilities, where T is the length of the sequence and N is
    the number of states.
_backward_probability(self, unlabeled_sequence) -> array
    Return the backward probability matrix, a T by N array of
    log-probabilities, where T is the length of the sequence and N is
    the number of states.
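Both matrices are conventionally computed in log space to avoid underflow on
long sequences. The sketch below implements the forward recursion for an
invented toy model; the log-sum over final states gives log P(observations):

```python
# Forward recursion sketch in log space for an invented toy HMM:
# alpha[t][s] = log P(o_0..o_t, state_t = s).

import math

states = ["rainy", "sunny"]
priors = {"rainy": 0.6, "sunny": 0.4}
transitions = {"rainy": {"rainy": 0.7, "sunny": 0.3},
               "sunny": {"rainy": 0.4, "sunny": 0.6}}
outputs = {"rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
           "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_probability(observations):
    """T-by-N table of forward log-probabilities, as dicts keyed by state."""
    T = len(observations)
    alpha = [{} for _ in range(T)]
    # Base case: starting state probability times the first emission.
    for s in states:
        alpha[0][s] = math.log(priors[s] * outputs[s][observations[0]])
    # Recursion: sum over predecessors, then emit the next observation.
    for t in range(1, T):
        for s in states:
            alpha[t][s] = logsumexp(
                [alpha[t - 1][r] + math.log(transitions[r][s]) for r in states]
            ) + math.log(outputs[s][observations[t]])
    return alpha

alpha = forward_log_probability(["walk", "shop", "clean"])
# log P(observations) is the log-sum over final states:
print(logsumexp(list(alpha[-1].values())))
```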
test(self, test_sequence, **kwargs)
    Tests the HiddenMarkovModelTagger instance.
Inherited from api.TaggerI:
    batch_tag

Inherited from object:
    __delattr__, __getattribute__, __hash__, __new__, __reduce__,
    __reduce_ex__, __setattr__, __str__