nltk.HiddenMarkovModelTrainer.train_unsupervised

HiddenMarkovModelTrainer.train_unsupervised(unlabeled_sequences, update_outputs=True, **kwargs)

Trains the HMM using the Baum-Welch algorithm to maximise the probability of the data sequence. This is a variant of the EM algorithm, and is unsupervised in that it doesn’t need the state sequences for the symbols. The code is based on ‘A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition’, Lawrence Rabiner, Proceedings of the IEEE, 77(2), 1989.
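For reference, these are the quantities re-estimated on each EM iteration, in the notation of the Rabiner tutorial (a standard sketch of Baum-Welch, not a transcription of NLTK’s implementation):

    \gamma_t(i)  = P(q_t = S_i \mid O, \lambda)
    \xi_t(i,j)   = P(q_t = S_i,\, q_{t+1} = S_j \mid O, \lambda)
    \bar{\pi}_i  = \gamma_1(i)
    \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}
    \bar{b}_j(k) = \frac{\sum_{t=1,\, O_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}

Each iteration is guaranteed not to decrease the likelihood P(O | \lambda), which is why convergence can be tested on the change in log probability.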

Parameters: unlabeled_sequences (list) – the training data, a set of sequences of observations
Returns: the trained model
Return type: HiddenMarkovModelTagger

kwargs may include the following parameters:

Parameters:
  • model – a HiddenMarkovModelTagger instance used to begin the Baum-Welch algorithm; if omitted, training starts from a randomly initialised model
  • max_iterations – the maximum number of EM iterations to perform
  • convergence_logprob – training stops once the change in log probability between iterations falls below this threshold
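
Example – a minimal usage sketch. The state and symbol inventories and the toy sequences below are illustrative assumptions, not part of the NLTK documentation; the (observation, None) pair format follows NLTK’s own unsupervised training demo.

    from nltk.tag.hmm import HiddenMarkovModelTrainer

    # Hypothetical toy inventories: two hidden states, three observable symbols.
    states = ["RAINY", "SUNNY"]
    symbols = ["walk", "shop", "clean"]

    trainer = HiddenMarkovModelTrainer(states, symbols)

    # Unlabelled training data: each sequence is a list of
    # (observation, tag) pairs with the tag left as None.
    sequences = [
        [("walk", None), ("shop", None), ("clean", None)],
        [("clean", None), ("clean", None), ("walk", None)],
    ]

    # With no `model` kwarg, Baum-Welch starts from a randomly
    # initialised HMM, so the resulting tags vary between runs.
    tagger = trainer.train_unsupervised(
        sequences, max_iterations=10, convergence_logprob=1e-5
    )
    print(tagger.tag(["walk", "shop"]))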