gensim.matutils.corpus2csc()

gensim.matutils.corpus2csc(corpus, num_terms=None, dtype=numpy.float64, num_docs=None, num_nnz=None, printprogress=0)

Convert a streamed corpus into a sparse matrix, in scipy.sparse.csc_matrix format, with documents as columns.
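To make "documents as columns" concrete, here is a minimal pure-Python sketch of how a streamed bag-of-words corpus maps onto the three arrays of the CSC layout. The helper name stream_to_csc_arrays is hypothetical and is not part of gensim; it only illustrates the format and is not the library's implementation.

```python
def stream_to_csc_arrays(corpus):
    """Hypothetical helper (not part of gensim): build CSC arrays in one pass.

    Each document is a list of (term_id, weight) pairs and becomes one column.
    """
    data, indices, indptr = [], [], [0]
    for doc in corpus:
        for term_id, weight in doc:
            indices.append(term_id)    # row index = term id
            data.append(float(weight))  # stored non-zero value
        indptr.append(len(data))        # marks where this column ends
    return data, indices, indptr


docs = [[(0, 1.0), (2, 3.0)], [(1, 2.0)], []]

# iter(docs) stands in for a one-pass stream; the helper never rewinds it.
data, indices, indptr = stream_to_csc_arrays(iter(docs))

# Document j occupies the slice indptr[j]:indptr[j + 1] of data and indices.
second_doc = list(zip(indices[indptr[1]:indptr[2]], data[indptr[1]:indptr[2]]))
```

An empty document contributes no entries, so its column start and end offsets in indptr are equal.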

If the numbers of terms, documents, and non-zero elements are known, pass them as num_terms, num_docs, and num_nnz; a more memory-efficient code path is then taken.

The input corpus may be a non-repeatable stream (generator).

This is the mirror function to gensim.matutils.Sparse2Corpus, which wraps a sparse matrix as a corpus.