gensim.similarities.WmdSimilarity
class gensim.similarities.WmdSimilarity(corpus, w2v_model, num_best=None, normalize_w2v_and_replace=True, chunksize=256)
Document similarity (like MatrixSimilarity) that uses the negative of WMD as a similarity measure. See gensim.models.word2vec.wmdistance for more information.
When a num_best value is provided, only the most similar documents are retrieved.
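The idea described above can be illustrated with a minimal pure-Python sketch. This is not gensim's implementation: it assumes tiny hand-made 2-D word vectors, swaps in a relaxed nearest-neighbour bound in place of the exact WMD solver, scores each document by the negative of that distance, and keeps only the num_best highest-scoring documents. The names toy_vectors, relaxed_distance, and most_similar are hypothetical.

```python
import heapq
import math

# Hypothetical 2-D word vectors; a real model would supply learned embeddings.
toy_vectors = {
    "good": (1.0, 0.0),
    "great": (0.9, 0.1),
    "food": (0.0, 1.0),
    "meal": (0.1, 0.9),
    "slow": (-1.0, 0.0),
    "service": (0.0, -1.0),
}


def relaxed_distance(doc_a, doc_b, vectors):
    """Stand-in for WMD (assumption): every word moves to its nearest counterpart.

    Each direction gives a lower bound on the exact transport cost; the larger
    of the two is returned. Documents are lists of tokens with equal weights.
    """
    def one_way(src, dst):
        total = 0.0
        for word in src:
            total += min(math.dist(vectors[word], vectors[other]) for other in dst)
        return total / len(src)

    return max(one_way(doc_a, doc_b), one_way(doc_b, doc_a))


def most_similar(corpus, query, vectors, num_best=None):
    # Similarity is the negative of the distance, so closer documents score higher.
    sims = [-relaxed_distance(doc, query, vectors) for doc in corpus]
    if num_best is None:
        return sims  # one score per document, in corpus order
    # Keep only the num_best (document index, similarity) pairs, best first.
    return heapq.nlargest(num_best, enumerate(sims), key=lambda pair: pair[1])


corpus = [["good", "food"], ["slow", "service"], ["great", "meal"]]
query = ["good", "meal"]
all_sims = most_similar(corpus, query, toy_vectors)
top_two = most_similar(corpus, query, toy_vectors, num_best=2)
```

Without num_best the sketch returns one score per document; with it, the result is a shorter list of (index, score) pairs, which mirrors the distinction the class description draws.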
When using this code, please consider citing the following papers:
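Example:
# See the tutorial notebook for more examples: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/WMD_tutorial.ipynb
>>> from gensim.models import Word2Vec
>>> from gensim.similarities import WmdSimilarity
>>> # Given a document collection "corpus" (a list of lists of tokens), train a word2vec model.
>>> model = Word2Vec(corpus)
>>> instance = WmdSimilarity(corpus, model, num_best=10)
>>> # Make a query; it is a list of tokens, like the corpus documents (simple whitespace split shown here).
>>> query = 'Very good, you should seat outdoor.'.lower().split()
>>> sims = instance[query]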
Methods
__init__(corpus, w2v_model[, num_best, ...]) | corpus: List of lists of strings, as in gensim.models.word2vec.
get_similarities(query) | Do not use this function directly; use the self[query] syntax instead.
load(fname[, mmap]) | Load a previously saved object from file (also see save).
save(fname_or_handle[, separately, ...]) | Save the object to file (also see load).
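The advice to use self[query] rather than get_similarities can be pictured with a small sketch of the calling convention. The class ToySimilarity below is hypothetical and is not gensim code: it assumes the indexing operator delegates to get_similarities and then applies num_best, and it uses a placeholder Jaccard score instead of WMD.

```python
class ToySimilarity:
    """Hypothetical index that mimics the index[query] calling convention."""

    def __init__(self, corpus, num_best=None):
        self.corpus = corpus      # list of lists of tokens
        self.num_best = num_best  # None means "return every score"

    def get_similarities(self, query):
        # Placeholder score (assumption): Jaccard overlap of token sets, not WMD.
        q = set(query)
        return [len(q & set(doc)) / len(q | set(doc)) for doc in self.corpus]

    def __getitem__(self, query):
        # The public entry point: compute raw scores, then post-process them.
        sims = self.get_similarities(query)
        if self.num_best is None:
            return sims
        ranked = sorted(enumerate(sims), key=lambda pair: pair[1], reverse=True)
        return ranked[: self.num_best]


index = ToySimilarity([["good", "food"], ["slow", "service"]], num_best=1)
best = index[["good", "meal"]]  # list of (document index, score) pairs
```

Calling get_similarities directly would skip the num_best step in this sketch, which is one reason an index of this shape asks callers to go through the indexing syntax.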