gensim.corpora.MmCorpus

class gensim.corpora.MmCorpus(fname)

Corpus in the Matrix Market format.

Methods

__init__(fname)
docbyoffset(offset) Return the document stored at the given file offset (in bytes).
load(fname[, mmap]) Load a previously saved object from file (also see save).
save(*args, **kwargs)
save_corpus(fname, corpus[, id2word, ...]) Save a corpus in the Matrix Market format to disk.
serialize(serializer, fname, corpus[, ...]) Iterate through the document stream corpus, saving the documents to fname and recording the byte offset of each document.
skip_headers(input_file) Skip file headers that appear before the first document.
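
The idea behind serialize and docbyoffset (write documents in Matrix Market coordinate format, remember where each one starts, then seek straight to it) can be sketched with the standard library alone. This is an illustrative sketch, not gensim's implementation: write_mm and doc_by_offset are hypothetical helper names, the corpus is assumed to be an in-memory list of non-empty documents, and each document is assumed to be a list of (term_id, weight) pairs.

```python
import os
import tempfile

# Standard banner line of the Matrix Market coordinate format.
HEADER = b"%%MatrixMarket matrix coordinate real general\n"


def write_mm(fname, corpus):
    """Write `corpus` to `fname` and return the byte offset of each document.

    Hypothetical helper for illustration; assumes `corpus` is a list of
    non-empty documents, each a list of (term_id, weight) pairs.
    """
    num_docs = len(corpus)
    num_terms = 1 + max((t for doc in corpus for t, _ in doc), default=-1)
    num_nnz = sum(len(doc) for doc in corpus)
    offsets = []
    with open(fname, "wb") as fout:
        fout.write(HEADER)
        # Size line: rows (documents), columns (terms), non-zero entries.
        fout.write(f"{num_docs} {num_terms} {num_nnz}\n".encode("ascii"))
        for doc_no, doc in enumerate(corpus):
            offsets.append(fout.tell())  # where this document starts
            for term_id, weight in sorted(doc):
                # Matrix Market indices are 1-based.
                line = f"{doc_no + 1} {term_id + 1} {weight}\n"
                fout.write(line.encode("ascii"))
    return offsets


def doc_by_offset(fname, offset):
    """Return the document whose first entry starts at byte `offset`."""
    doc, first_doc_no = [], None
    with open(fname, "rb") as fin:
        fin.seek(offset)
        for line in fin:
            doc_no, term_id, weight = line.split()
            if first_doc_no is None:
                first_doc_no = doc_no
            elif doc_no != first_doc_no:
                break  # reached the next document
            doc.append((int(term_id) - 1, float(weight)))
    return doc


corpus = [[(0, 1.0), (2, 0.5)], [(1, 2.0)], [(0, 3.0), (1, 1.0), (2, 1.0)]]
fname = os.path.join(tempfile.mkdtemp(), "corpus.mm")
offsets = write_mm(fname, corpus)
second_doc = doc_by_offset(fname, offsets[1])
```

Storing one offset per document is what makes random access cheap: fetching a document costs a single seek plus reading that document's own lines, regardless of how large the file is.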