The Gaussian EM clusterer models the vectors as being produced by a
mixture of k Gaussian sources. The parameters of these sources (prior
probability, mean and covariance matrix) are then found to maximise the
likelihood of the given data. This is done with the expectation
maximisation algorithm. It starts with k arbitrarily chosen means, priors
and covariance matrices. It then calculates the membership probabilities
for each vector in each of the clusters; this is the 'E' step. The
cluster parameters are then updated in the 'M' step using the maximum
likelihood estimate from the cluster membership probabilities. This
process continues until the likelihood of the data does not significantly
increase.
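
The E and M steps can be illustrated with a short numeric sketch. The
following numpy function performs one simplified EM iteration for a
Gaussian mixture; it is an illustrative sketch, not the clusterer's own
implementation, and all names in it are hypothetical.

    import numpy as np

    def em_step(vectors, priors, means, covariances):
        """One EM iteration: returns updated (priors, means, covariances)."""
        n, d = vectors.shape
        k = len(priors)

        # E step: membership probability of each vector in each cluster.
        resp = np.empty((n, k))
        for j in range(k):
            diff = vectors - means[j]
            inv = np.linalg.inv(covariances[j])
            norm = 1.0 / np.sqrt(((2 * np.pi) ** d) * np.linalg.det(covariances[j]))
            dens = norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
            resp[:, j] = priors[j] * dens
        resp /= resp.sum(axis=1, keepdims=True)

        # M step: maximum likelihood re-estimates from the memberships.
        weights = resp.sum(axis=0)                  # effective cluster sizes
        new_priors = weights / n
        new_means = (resp.T @ vectors) / weights[:, None]
        new_covs = []
        for j in range(k):
            diff = vectors - new_means[j]
            new_covs.append((resp[:, j, None] * diff).T @ diff / weights[j])
        return new_priors, new_means, np.array(new_covs)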
__init__(self, initial_means, priors=None, covariance_matrices=None,
         conv_threshold=1e-06, bias=0.1, normalise=False, svd_dimensions=None)
    Creates an EM clusterer with the given starting parameters,
    convergence threshold and vector mangling parameters.
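
A usage sketch, assuming the class is exposed as nltk.cluster.EMClusterer
(the exact import path may differ between NLTK versions); the constructor
arguments follow the signature above, and cluster and classify are the
inherited methods listed further down.

    import numpy
    from nltk.cluster import EMClusterer

    vectors = [numpy.array(f) for f in [[0.5, 0.5], [1.5, 0.5], [1.0, 3.0]]]
    initial_means = [[4.0, 2.0], [4.0, 2.01]]    # k = 2 arbitrary starting means

    clusterer = EMClusterer(initial_means, bias=0.1)
    clusters = clusterer.cluster(vectors, assign_clusters=True)
    print(clusters)                              # cluster index for each vector
    print(clusterer.classify(numpy.array([0.0, 0.0])))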
_loglikelihood(self, vectors, priors, means, covariances)
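
A sketch of the quantity such a helper evaluates: the log of the likelihood
of the data under the current mixture parameters, which the algorithm
monitors to decide convergence. This is illustrative, not the method's
actual code.

    import numpy as np

    def loglikelihood(vectors, priors, means, covariances):
        total = 0.0
        for x in vectors:
            # Density of x under the mixture: sum over clusters of
            # prior * multivariate Gaussian density.
            p = 0.0
            for prior, mean, cov in zip(priors, means, covariances):
                d = len(x)
                diff = x - mean
                norm = 1.0 / np.sqrt(((2 * np.pi) ** d) * np.linalg.det(cov))
                p += prior * norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)
            total += np.log(p)
        return total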
Inherited from util.VectorSpace:
    classify, cluster, likelihood, vector

Inherited from api.ClusterI:
    classification_probdist, cluster_name, cluster_names