gensim.models.HdpModel

class gensim.models.HdpModel(corpus, id2word, max_chunks=None, max_time=None, chunksize=256, kappa=1.0, tau=64.0, K=15, T=150, alpha=1, gamma=1, eta=0.01, scale=1.0, var_converge=0.0001, outputdir=None)[source]

The constructor estimates Hierarchical Dirichlet Process model parameters based on a training corpus:

>>> from gensim.models import HdpModel
>>> hdp = HdpModel(corpus, id2word)
>>> hdp.print_topics(num_topics=20, num_words=10)

Inference on new documents is based on the approximately equivalent LDA topics.
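For example, the topic distribution of an unseen document can be obtained by passing its bag-of-words vector to the trained model. A minimal sketch, assuming id2word is a gensim Dictionary and the new document is tokenized against the same vocabulary:

>>> new_doc_bow = id2word.doc2bow("human computer interaction".split())
>>> doc_topics = hdp[new_doc_bow]  # list of (topic_id, probability) pairs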

Model persistence is achieved through the load/save methods.
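A minimal persistence sketch (the file path is hypothetical):

>>> hdp.save('/tmp/hdp_model')
>>> hdp = HdpModel.load('/tmp/hdp_model')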

Methods

__init__(corpus, id2word[, max_chunks, ...]) Initialize the model and estimate its parameters from corpus; gamma is the first-level concentration parameter.
doc_e_step(doc, ss, Elogsticks_1st, ...) E step for a single document.
evaluate_test_corpus(corpus)
hdp_to_lda() Compute the almost-equivalent LDA representation of the HDP model (see the sketch after this table).
inference(chunk)
load(fname[, mmap]) Load a previously saved object from file (also see save).
optimal_ordering() Reorder the topics in decreasing order of weight.
print_topics([num_topics, num_words]) Alias for show_topics() that prints the num_words most probable words for num_topics topics to the log.
save(fname_or_handle[, separately, ...]) Save the object to file (also see load).
save_options() Legacy method; use self.save() instead.
save_topics([doc_count]) Legacy method; use self.save() instead.
show_topics([num_topics, num_words, log, ...]) Print the num_words most probable words for num_topics topics.
update(corpus)
update_chunk(chunk[, update, opt_o])
update_expectations() Bring lambda up to date; since lambda is updated lazily, its current state may not be accurate at any given moment.
update_finished(start_time, ...)
update_lambda(sstats, word_list, opt_o)
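As noted for hdp_to_lda() above, a trained model can be projected onto an approximately equivalent LDA parameterization. A minimal sketch, assuming hdp was trained as in the constructor example and that hdp_to_lda() returns the LDA-style alpha vector together with the topic-word distributions:

>>> alpha, beta = hdp.hdp_to_lda()  # LDA-style priors and topic-word probabilities
>>> topics = hdp.show_topics(num_topics=10, num_words=10)  # summaries of the top topics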