nltk.cluster.EMClusterer.__init__

EMClusterer.__init__(initial_means, priors=None, covariance_matrices=None, conv_threshold=1e-06, bias=0.1, normalise=False, svd_dimensions=None)

Creates an EM clusterer with the given starting parameters, convergence threshold, and vector preprocessing options (normalisation and SVD dimensionality reduction).

Parameters:
  • initial_means ([seq of] numpy array or seq of SparseArray) – the initial means of the Gaussian cluster centers
  • priors (numpy array or seq of float) – the prior probability for each cluster
  • covariance_matrices ([seq of] numpy array) – the covariance matrix for each cluster
  • conv_threshold (int or float) – convergence threshold; fitting is deemed converged once the change in likelihood between iterations falls below this value
  • bias (float) – variance bias used to ensure non-singular covariance matrices
  • normalise (boolean) – whether vectors should be normalised to length 1
  • svd_dimensions (int) – number of dimensions to keep when reducing vector dimensionality with SVD
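
The parameters above can be exercised with a minimal usage sketch. The vectors and initial means are made-up illustrative values, not data from this page. `priors` and `covariance_matrices` are left at their `None` defaults, and the `cluster()` call assumes the `assign_clusters` keyword inherited from NLTK's vector-space clusterer base class.

```python
import numpy
from nltk.cluster import EMClusterer

# Three toy 2-D points (illustrative values only).
vectors = [numpy.array(v) for v in [[0.5, 0.5], [1.5, 0.5], [1.0, 3.0]]]

# One initial mean per cluster; two means gives two Gaussian clusters.
initial_means = [[4.0, 2.0], [4.0, 2.01]]

clusterer = EMClusterer(
    initial_means,
    conv_threshold=1e-6,  # stop once the likelihood change is below this
    bias=0.1,             # keeps covariance matrices non-singular
    normalise=False,      # leave vectors at their original length
)

# Fit the model, then return the cluster index assigned to each input vector.
clusters = clusterer.cluster(vectors, assign_clusters=True)

print("number of clusters:", clusterer.num_clusters())
print("assignments:", clusters)
```

The number of clusters is taken from the length of `initial_means`, so the choice of starting means fixes the model size. Which index each point receives depends on the fit, so no particular assignment is shown here.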