gensim.models.TfidfModel
class gensim.models.TfidfModel(corpus=None, id2word=None, dictionary=None, wlocal=<function identity>, wglobal=<function df2idf>, normalize=True)
Objects of this class transform a word-document co-occurrence matrix (integer counts) into a locally/globally weighted TF-IDF matrix (positive floats).
The main methods are:
- the constructor, which calculates inverse document frequencies for all terms in the training corpus.
- the [] method (__getitem__), which transforms a simple count representation into the TF-IDF space.
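>>> from gensim.models import TfidfModel
>>> tfidf = TfidfModel(corpus)  # corpus: iterable of bag-of-words documents
>>> print(tfidf[some_doc])  # some_doc: list of (term_id, count) pairs
>>> tfidf.save('/tmp/foo.tfidf_model')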
Model persistence is achieved via the load/save methods.
Methods
__init__([corpus, id2word, dictionary, ...]) | Compute tf-idf by multiplying a local component (term frequency) with a global component (inverse document frequency), and normalizing the resulting documents to unit length.
initialize(corpus) | Compute inverse document weights, which will be used to modify term frequencies for documents.
load(fname[, mmap]) | Load a previously saved object from file (also see save).
save(fname_or_handle[, separately, ...]) | Save the object to file (also see load).
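
The weighting described for __init__ and initialize can be sketched in plain Python. This is a rough illustration, not gensim's implementation: the class name ToyTfidf is hypothetical, and it assumes the local weight is the raw term count, the global weight is log2(total_docs / doc_freq), and zero-weight terms are dropped before normalizing to unit Euclidean length.

```python
import math
from collections import Counter


class ToyTfidf:
    """Hypothetical stand-in for illustration only; not gensim's code."""

    def __init__(self, corpus):
        # corpus: list of documents, each a list of (term_id, count) pairs.
        self.num_docs = 0
        doc_freq = Counter()
        for doc in corpus:
            self.num_docs += 1
            # Count each term at most once per document.
            doc_freq.update({term_id for term_id, _ in doc})
        # Assumed global weight: idf = log2(total_docs / doc_freq).
        self.idfs = {
            term_id: math.log2(self.num_docs / df)
            for term_id, df in doc_freq.items()
        }

    def __getitem__(self, doc):
        # Assumed local weight: identity, i.e. the raw count.
        weighted = [
            (term_id, count * self.idfs.get(term_id, 0.0))
            for term_id, count in doc
        ]
        # Drop zero weights (terms in every document, or unseen terms).
        weighted = [(t, w) for t, w in weighted if w > 0.0]
        # Scale the remaining weights to unit Euclidean length.
        norm = math.sqrt(sum(w * w for _, w in weighted))
        if norm == 0.0:
            return []
        return [(t, w / norm) for t, w in weighted]


# Toy bag-of-words corpus; term 1 occurs in every document.
corpus = [
    [(0, 2), (1, 1)],
    [(1, 1), (2, 1)],
    [(0, 1), (1, 1), (2, 3)],
]
model = ToyTfidf(corpus)
for doc in corpus:
    print(model[doc])
```

Because term 1 appears in every document of this toy corpus, its assumed idf is zero and it drops out of every transformed vector, which is the usual motivation for the global component.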