The K-means clusterer starts with k arbitrarily chosen means, then
allocates each vector to the cluster with the closest mean. It then
recalculates the mean of each cluster as the centroid of the vectors in
that cluster. This process repeats until the cluster memberships
stabilise. This is a hill-climbing algorithm, so it may converge to a
local optimum rather than the global one. Hence the clustering is often
repeated with random initial means, and the most commonly occurring
output means are chosen.
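
The loop described above can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the names kmeans_sketch, euclidean and centroid and the max_iter safety cap are assumptions made for the example, and the repeated runs with random restarts are left out.

```python
import math
import random


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans_sketch(vectors, k, distance=euclidean, initial_means=None,
                  rng=None, max_iter=100):
    """Hypothetical helper: returns (means, assignment) for one k-means run.

    max_iter is an assumed safety cap; it is not a parameter of the
    documented class.
    """
    rng = rng or random.Random()
    # Start from the given means, or pick k input vectors at random.
    means = [list(m) for m in (initial_means or rng.sample(vectors, k))]
    assignment = None
    for _ in range(max_iter):
        # Allocate each vector to the cluster with the closest mean.
        new_assignment = [
            min(range(k), key=lambda i: distance(v, means[i]))
            for v in vectors
        ]
        # Stop once the cluster memberships no longer change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Recalculate each mean as the centroid of its members.
        for i in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == i]
            if members:  # an empty cluster keeps its previous mean
                means[i] = centroid(members)
    return means, assignment


if __name__ == "__main__":
    data = [[1, 1], [1, 2], [8, 8], [9, 8]]
    means, assignment = kmeans_sketch(
        data, 2, initial_means=[[0, 0], [10, 10]]
    )
    print(means, assignment)
```

Fixing initial_means makes a run reproducible. Without it the starting means are random, which is why repeating the clustering and keeping the most frequent result helps against poor local optima.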
__init__(self, num_means, distance, repeats=1, conv_test=1e-06, initial_means=None, normalise=False, svd_dimensions=None, rng=None)
_cluster_vectorspace(self, vectors, trace=False)
Inherited from util.VectorSpace: classify, cluster, likelihood, likelihood_vectorspace, vector
Inherited from api.ClusterI: classification_probdist, cluster_name, cluster_names