gensim.similarities.MatrixSimilarity

class gensim.similarities.MatrixSimilarity(corpus, num_best=None, dtype=numpy.float32, num_features=None, chunksize=256, corpus_len=None)

Compute the similarity of a query against a corpus of documents by storing the index matrix in memory. The similarity measure is the cosine of the angle between the query vector and each document vector.
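To make the measure concrete, here is a minimal pure-Python sketch of cosine similarity between one query and every row of a small dense matrix. It only illustrates the formula; it is not gensim's implementation, and the helper name cosine_similarities and the toy vectors are hypothetical.

```python
import math


def cosine_similarities(matrix, query):
    """Return the cosine similarity of `query` to each row of `matrix`.

    Hypothetical helper for illustration only; not part of gensim.
    """
    def norm(vec):
        return math.sqrt(sum(x * x for x in vec))

    query_norm = norm(query)
    sims = []
    for row in matrix:
        denom = norm(row) * query_norm
        dot = sum(a * b for a, b in zip(row, query))
        # Treat the similarity to an all-zero vector as 0.0 to avoid dividing by zero.
        sims.append(dot / denom if denom else 0.0)
    return sims


# Three toy documents as dense rows over three features (made-up values).
docs = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
]
print(cosine_similarities(docs, [1.0, 1.0, 0.0]))
```

The query is identical to the first row, shares one of two features with the second, and shares none with the third, so the scores fall from roughly 1 to 0 in that order.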

Use this if your input corpus contains dense vectors (such as documents in LSI space) and fits into RAM.

The matrix is internally stored as a dense numpy array. Unless the entire matrix fits into main memory, use Similarity instead.

See also Similarity and SparseMatrixSimilarity in this module.

Methods

__init__(corpus[, num_best, dtype, ...])    num_features is the number of features in the corpus (will be determined
get_similarities(query)                     Return similarity of sparse vector query to all documents in the corpus, as a numpy array.
load(fname[, mmap])                         Load a previously saved object from file (also see save).
save(fname_or_handle[, separately, ...])    Save the object to file (also see load).