`gensim.matutils`

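This module contains math helper functions: conversions between dense arrays, scipy sparse matrices, and gensim's sparse document format, plus vector similarity and distance measures.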

Functions

`any2sparse`(vec[, eps])	Convert a numpy/scipy vector into gensim document format (=list of 2-tuples).
`argsort`(x[, topn, reverse])	Return indices of the topn smallest elements in array x, in ascending order.
`blas`(name, ndarray)	Return the BLAS function called name that matches the dtype of ndarray.
`corpus2csc`(corpus[, num_terms, dtype, ...])	Convert a streamed corpus into a sparse matrix, in scipy.sparse.csc_matrix format, with documents as columns.
`corpus2dense`(corpus, num_terms[, num_docs, ...])	Convert corpus into a dense numpy array (documents will be columns).
`cossim`(vec1, vec2)	Return cosine similarity between two sparse vectors.
`dense2vec`(vec[, eps])	Convert a dense numpy array into the sparse document format (sequence of 2-tuples).
`entropy`(pk[, qk, base])	Calculate the entropy of a distribution for given probability values.
`full2sparse`(vec[, eps])	Convert a dense numpy array into the sparse document format (sequence of 2-tuples).
`full2sparse_clipped`(vec, topn[, eps])	Like full2sparse, but only return the topn elements of the greatest magnitude (abs).
`get_lapack_funcs`(names[, arrays, dtype])	Return available LAPACK function objects from names.
`hellinger`(vec1, vec2)	Return the Hellinger distance, a metric that quantifies the similarity between two probability distributions.
`isbow`(vec)	Check whether the given vector is in bag-of-words representation.
`ismatrix`(m)	Return True if m is a 2-D numpy array or a scipy.sparse matrix.
`iteritems`(d, **kw)	Return an iterator over the (key, value) pairs of a dictionary.
`itervalues`(d, **kw)	Return an iterator over the values of a dictionary.
`jaccard`(vec1, vec2)	Return the Jaccard distance between two vectors in bag-of-words representation.
`kullback_leibler`(vec1, vec2[, num_features])	Return the Kullback-Leibler divergence between two probability distributions (asymmetric, so not a metric in the strict sense).
`pad`(mat, padrow, padcol)	Add additional rows/columns to a numpy.matrix mat.
`qr_destroy`(la)	Return QR decomposition of la[0]; the content of la is destroyed in the process.
`ret_normalized_vec`(vec, length)	Return the sparse vector vec with every value divided by length.
`scipy2sparse`(vec[, eps])	Convert a scipy.sparse vector into gensim document format (=list of 2-tuples).
`sparse2full`(doc, length)	Convert a document in sparse document format (=sequence of 2-tuples) into a dense numpy array (of size length).
`triu`(m[, k])	Make a copy of a matrix with elements below the k-th diagonal zeroed.
`triu_indices`(n[, k, m])	Return the indices for the upper-triangle of an (n, m) array.
`unitvec`(vec[, norm])	Scale a vector to unit length.
`veclen`(vec)	Return the Euclidean (L2) length of a sparse vector.
`zeros_aligned`(shape, dtype[, order, align])	Like numpy.zeros(), but the array will be aligned at align byte boundary.
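Several of the functions above operate on the sparse document format, a sequence of (index, value) 2-tuples. The following is a minimal pure-Python sketch of what `full2sparse`, `sparse2full`, `cossim` and `hellinger` compute. It is not gensim's implementation: the `_sketch` names are hypothetical, plain lists stand in for numpy arrays, and the default `eps=1e-9` is an assumption.

```python
import math


def full2sparse_sketch(vec, eps=1e-9):
    """Dense sequence -> list of (index, value) pairs, dropping near-zero entries."""
    return [(i, float(v)) for i, v in enumerate(vec) if abs(v) > eps]


def sparse2full_sketch(doc, length):
    """List of (index, value) pairs -> dense list with `length` entries."""
    dense = [0.0] * length
    for idx, val in doc:
        dense[idx] = float(val)
    return dense


def cossim_sketch(doc1, doc2):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    d1, d2 = dict(doc1), dict(doc2)
    if not d1 or not d2:
        return 0.0
    len1 = math.sqrt(sum(v * v for v in d1.values()))
    len2 = math.sqrt(sum(v * v for v in d2.values()))
    # Only indices present in both vectors contribute to the dot product.
    dot = sum(v * d2.get(i, 0.0) for i, v in d1.items())
    return dot / (len1 * len2)


def hellinger_sketch(doc1, doc2):
    """Hellinger distance between two sparse probability distributions."""
    d1, d2 = dict(doc1), dict(doc2)
    indices = set(d1) | set(d2)
    total = sum(
        (math.sqrt(d1.get(i, 0.0)) - math.sqrt(d2.get(i, 0.0))) ** 2
        for i in indices
    )
    return math.sqrt(0.5 * total)


dense = [0.0, 3.0, 0.0, 4.0]
doc = full2sparse_sketch(dense)          # [(1, 3.0), (3, 4.0)]
restored = sparse2full_sketch(doc, 4)    # [0.0, 3.0, 0.0, 4.0]
```

Vectors with no index in common have cosine similarity 0.0, and two probability distributions with disjoint support are at the maximum Hellinger distance of 1.0.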

Classes

`Dense2Corpus`(dense[, documents_columns])	Treat dense numpy array as a sparse, streamed gensim corpus.
`MmReader`(input[, transposed])	Wrap a term-document matrix on disk (in matrix-market format), and present it as an object which supports iteration over the rows (~documents).
`MmWriter`(fname)	Store a corpus in Matrix Market format.
`Scipy2Corpus`(vecs)	Convert a sequence of dense/sparse vectors into a streamed gensim corpus object.
`Sparse2Corpus`(sparse[, documents_columns])	Convert a matrix in scipy.sparse format into a streaming gensim corpus.
`izip`	izip(iter1 [,iter2 [...]]) -> izip object
`xrange`	xrange(start, stop[, step]) -> xrange object
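The corpus wrappers above present a matrix as a stream of sparse documents. A rough sketch of that idea for the dense case follows, assuming documents are stored as columns by default. `Dense2CorpusSketch` is a hypothetical stand-in written in plain Python, not the actual `Dense2Corpus` class, and its `eps` parameter is an illustrative addition.

```python
class Dense2CorpusSketch:
    """Iterate over a dense matrix (list of rows) as sparse documents."""

    def __init__(self, dense, documents_columns=True, eps=1e-9):
        if documents_columns:
            # Transpose so that each inner list holds one document.
            self.docs = [list(col) for col in zip(*dense)]
        else:
            self.docs = [list(row) for row in dense]
        self.eps = eps

    def __iter__(self):
        # Yield one document at a time as a list of (index, value) pairs.
        for doc in self.docs:
            yield [(i, float(v)) for i, v in enumerate(doc) if abs(v) > self.eps]

    def __len__(self):
        return len(self.docs)


# Three terms (rows) by two documents (columns).
matrix = [
    [1.0, 0.0],
    [0.0, 2.0],
    [3.0, 0.0],
]
corpus = Dense2CorpusSketch(matrix)
docs = list(corpus)  # [[(0, 1.0), (2, 3.0)], [(1, 2.0)]]
```

Because the object only implements `__iter__` and `__len__`, it can be iterated repeatedly without building all sparse documents up front.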