gensim.interfaces.TransformedCorpus.save_corpus

TransformedCorpus.save_corpus(fname, corpus, id2word=None, metadata=False)

Save an existing corpus to disk.

Some formats also support saving the dictionary (the feature_id->word mapping). For those formats, pass the dictionary through the optional id2word parameter.

>>> from gensim.corpora import MmCorpus
>>> MmCorpus.save_corpus('file.mm', corpus)

Some corpora also support an index of where each document begins, so that documents on disk can be accessed in O(1) time (see the corpora.IndexedCorpus base class). For these corpora, serialize calls save_corpus internally and saves the index at the same time, so store the corpus with:

>>> MmCorpus.serialize('file.mm', corpus) # stores index as well, allowing random access to individual documents

Calling serialize() is preferred to calling save_corpus().
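
To illustrate why a stored index gives O(1) access, here is a minimal standard-library sketch of the idea. It is not gensim's implementation: the one-document-per-line file format and the helper names save_with_offsets and doc_by_offset are assumptions made up for this example. The writer records the byte offset at which each document starts, and the reader seeks straight to that offset instead of scanning the file.

```python
import os
import tempfile


def save_with_offsets(fname, corpus):
    """Write one document per line; return the byte offset where each one starts.

    Hypothetical helper for illustration only, not part of gensim.
    """
    offsets = []
    with open(fname, "wb") as fout:
        for doc in corpus:
            offsets.append(fout.tell())  # position of this document in the file
            line = " ".join(f"{term_id}:{weight}" for term_id, weight in doc)
            fout.write(line.encode("utf-8") + b"\n")
    return offsets


def doc_by_offset(fname, offset):
    """Read a single document by seeking directly to its byte offset."""
    with open(fname, "rb") as fin:
        fin.seek(offset)  # jump to the document without reading earlier ones
        items = fin.readline().decode("utf-8").split()
    return [(int(t), float(w)) for t, w in (item.split(":") for item in items)]


# A tiny bag-of-words corpus: each document is a list of (term_id, weight) pairs.
corpus = [[(0, 1.0), (2, 3.0)], [(1, 2.0)], [(0, 0.5), (1, 1.5), (2, 2.5)]]
fname = os.path.join(tempfile.mkdtemp(), "corpus.txt")

offsets = save_with_offsets(fname, corpus)  # the "index"
third_doc = doc_by_offset(fname, offsets[2])  # random access to document 2
```

Without the offsets, reaching the third document would require reading the first two. This is the difference between storing only the corpus and storing the corpus together with its index.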