gensim.models.LogEntropyModel

class gensim.models.LogEntropyModel(corpus, id2word=None, normalize=True)[source]

Objects of this class realize the transformation of a word-document co-occurrence matrix (integers) into a locally/globally weighted matrix (positive floats).

This is done by a log entropy normalization, optionally normalizing the resulting documents to unit length. The following formulas explain how to compute the log entropy weight for term i in document j:

local_weight_{i,j} = log(frequency_{i,j} + 1)

P_{i,j} = frequency_{i,j} / sum_j frequency_{i,j}

                      sum_j P_{i,j} * log(P_{i,j})
global_weight_i = 1 + ----------------------------
                      log(number_of_documents + 1)

final_weight_{i,j} = local_weight_{i,j} * global_weight_i
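The three formulas above can be sketched in plain Python. This is an illustrative reimplementation on a toy dense count matrix, not the model's actual code path (gensim operates on sparse bag-of-words corpora):

```python
import math

# Toy word-document count matrix: rows = terms i, columns = documents j.
counts = [
    [1, 0, 3],   # term 0: concentrated in document 2
    [2, 2, 2],   # term 1: spread evenly across all documents
]
n_docs = len(counts[0])

def log_entropy_weights(counts, n_docs):
    weights = []
    for row in counts:
        total = sum(row)
        # P_{i,j} = frequency_{i,j} / sum_j frequency_{i,j}
        # global_weight_i = 1 + sum_j P_{i,j} * log(P_{i,j}) / log(n_docs + 1)
        entropy = sum((f / total) * math.log(f / total) for f in row if f > 0)
        global_w = 1 + entropy / math.log(n_docs + 1)
        # final_weight_{i,j} = log(frequency_{i,j} + 1) * global_weight_i
        weights.append([math.log(f + 1) * global_w for f in row])
    return weights

w = log_entropy_weights(counts, n_docs)
```

A term spread evenly across all documents (term 1) has high entropy and therefore a small global weight; a term concentrated in few documents (term 0) keeps more of its weight.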

The main methods are:

  1. the constructor, which calculates the global weighting for all terms in a corpus;
  2. the [] method, which transforms a simple count representation into the log-entropy normalized space.

>>> log_ent = LogEntropyModel(corpus)
>>> print(log_ent[some_doc])
>>> log_ent.save('/tmp/foo.log_ent_model')

Model persistence is achieved via its load/save methods.

Methods

__init__(corpus[, id2word, normalize]) normalize dictates whether the resulting vectors will be set to unit length.
initialize(corpus) Initialize internal statistics based on a training corpus.
load(fname[, mmap]) Load a previously saved object from file (also see save).
save(fname_or_handle[, separately, ...]) Save the object to file (also see load).