gensim.models

This package contains algorithms for extracting document representations from their raw bag-of-words counts.
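As a rough illustration of what "raw bag-of-words counts" means here, the sketch below builds sparse count vectors in plain Python. Each document becomes a sorted list of (token id, count) pairs. The helpers `build_vocab` and `doc2bow` are hypothetical stand-ins written for this example, not the library's own code.

```python
from collections import Counter

def build_vocab(docs):
    """Assign an integer id to each distinct token, in order of first appearance."""
    vocab = {}
    for doc in docs:
        for token in doc:
            vocab.setdefault(token, len(vocab))
    return vocab

def doc2bow(doc, vocab):
    """Turn a token list into a sorted list of (token_id, count) pairs.

    Tokens missing from the vocabulary are silently skipped.
    """
    counts = Counter(vocab[token] for token in doc if token in vocab)
    return sorted(counts.items())

# Two tiny, made-up documents (already tokenized).
docs = [
    ["human", "computer", "interface", "computer"],
    ["graph", "trees", "graph"],
]

vocab = build_vocab(docs)
corpus = [doc2bow(doc, vocab) for doc in docs]

for bow in corpus:
    print(bow)
```

A corpus in this shape, an iterable of sparse count vectors, is the kind of input the `corpus` argument of the classes listed below is assumed to take.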

Classes

CoherenceModel([model, topics, texts, ...]) Objects of this class allow for building and maintaining a model for topic coherence.
Doc2Vec([documents, size, alpha, window, ...]) Class for training, using and evaluating neural networks described in http://arxiv.org/pdf/1405.4053v2.pdf
HdpModel(corpus, id2word[, max_chunks, ...]) The constructor estimates Hierarchical Dirichlet Process model parameters based on a training corpus.
LdaModel([corpus, num_topics, id2word, ...]) The constructor estimates Latent Dirichlet Allocation model parameters based on a training corpus.
LdaMulticore([corpus, num_topics, id2word, ...]) The constructor estimates Latent Dirichlet Allocation model parameters based on a training corpus.
LogEntropyModel(corpus[, id2word, normalize]) Objects of this class realize the transformation of a word-document co-occurrence matrix (integers) into a locally/globally weighted matrix (positive floats).
LsiModel([corpus, num_topics, id2word, ...]) Objects of this class allow building and maintaining a model for Latent Semantic Indexing (also known as Latent Semantic Analysis).
NormModel([corpus, norm]) Objects of this class realize the explicit normalization of vectors.
Phrases([sentences, min_count, threshold, ...]) Detect phrases, based on collected collocation counts.
RpModel(corpus[, id2word, num_topics]) Objects of this class allow building and maintaining a model for Random Projections (also known as Random Indexing).
TfidfModel([corpus, id2word, dictionary, ...]) Objects of this class realize the transformation of a word-document co-occurrence matrix (integers) into a locally/globally weighted TF-IDF matrix (positive floats).
VocabTransform(old2new[, id2token]) Remap feature ids to new values.
Word2Vec([sentences, size, alpha, window, ...]) Class for training, using and evaluating neural networks described in https://code.google.com/p/word2vec/
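To make the "integer counts into locally/globally weighted positive floats" description more concrete, here is a minimal pure-Python sketch of a TF-IDF style transformation. It is not the library's implementation. The weighting scheme (raw count times log2(N / document frequency), followed by L2 normalization) is an assumption chosen for illustration, and `fit_idf` and `tfidf_transform` are hypothetical helper names.

```python
import math

def fit_idf(corpus):
    """Compute a global weight per term id: log2(n_docs / document_frequency)."""
    n_docs = len(corpus)
    doc_freq = {}
    for bow in corpus:
        for term_id, _count in bow:
            doc_freq[term_id] = doc_freq.get(term_id, 0) + 1
    return {term_id: math.log2(n_docs / df) for term_id, df in doc_freq.items()}

def tfidf_transform(bow, idf):
    """Weight one bag-of-words vector and scale it to unit L2 length."""
    # Local weight is the raw count; the global weight comes from idf.
    weighted = [(term_id, count * idf.get(term_id, 0.0)) for term_id, count in bow]
    # Drop terms whose weight is zero (for example, terms present in every document).
    weighted = [(term_id, w) for term_id, w in weighted if w > 0.0]
    norm = math.sqrt(sum(w * w for _term_id, w in weighted))
    if norm == 0.0:
        return []
    return [(term_id, w / norm) for term_id, w in weighted]

# A made-up corpus of three documents as (term_id, count) pairs.
corpus = [
    [(0, 1), (1, 2)],
    [(1, 1), (2, 1)],
    [(2, 3)],
]

idf = fit_idf(corpus)
vectors = [tfidf_transform(bow, idf) for bow in corpus]

for vec in vectors:
    print(vec)
```

The two-step shape, fitting global statistics on a training corpus and then applying them to individual vectors, matches the way the constructors above are described as estimating parameters from a corpus. The exact weights a given class produces depend on its own options.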