gensim.corpora.UciCorpus

class gensim.corpora.UciCorpus(fname, fname_vocab=None)

Corpus in the UCI bag-of-words format.
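
For orientation, the sketch below writes and reads a tiny file in the layout used by the UCI Bag-of-Words datasets: three header lines (number of documents, vocabulary size, number of non-zero entries) followed by one "docID wordID count" triple per line, with both ids starting at 1. It uses only the standard library and is not gensim's implementation; the helper names write_uci, read_uci and demo_roundtrip, and the file name docword.sample.txt, are hypothetical and chosen for illustration.

```python
import os
import tempfile


def write_uci(path, docs, num_terms):
    """Write docs (lists of (word_id, count) pairs, 0-based ids) in UCI layout."""
    num_nnz = sum(len(doc) for doc in docs)
    with open(path, "w", encoding="utf-8") as fout:
        # Header: number of documents, vocabulary size, non-zero entries.
        fout.write(f"{len(docs)}\n{num_terms}\n{num_nnz}\n")
        for doc_no, doc in enumerate(docs):
            for word_id, count in sorted(doc):
                # Ids in the file are 1-based.
                fout.write(f"{doc_no + 1} {word_id + 1} {count}\n")


def read_uci(path):
    """Read a UCI file back into 0-based bag-of-words lists plus header counts."""
    with open(path, encoding="utf-8") as fin:
        num_docs = int(fin.readline())
        num_terms = int(fin.readline())
        num_nnz = int(fin.readline())
        docs = [[] for _ in range(num_docs)]
        for line in fin:
            if not line.strip():
                continue
            doc_id, word_id, count = (int(x) for x in line.split())
            docs[doc_id - 1].append((word_id - 1, count))
    return docs, num_terms, num_nnz


def demo_roundtrip():
    """Round-trip three small documents (the second one is empty)."""
    docs = [[(0, 2), (2, 1)], [], [(1, 3)]]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "docword.sample.txt")  # hypothetical name
        write_uci(path, docs, num_terms=3)
        return read_uci(path)
```

Empty documents contribute no lines to the file, which is why the document count has to come from the header rather than from the triples themselves.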

Methods

__init__(fname[, fname_vocab])
create_dictionary() Utility method to generate a gensim-style Dictionary directly from the corpus and vocabulary data.
docbyoffset(offset) Return the document stored at the given byte offset in the file.
load(fname[, mmap]) Load a previously saved object from file (also see save).
save(*args, **kwargs)
save_corpus(fname, corpus[, id2word, ...]) Save a corpus in the UCI Bag-of-Words format.
serialize(serializer, fname, corpus[, ...]) Iterate through the document stream corpus, saving the documents to fname and recording the byte offset of each document.
skip_headers(input_file)
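
The idea behind serialize and docbyoffset (record where each document starts while writing, then seek straight to it later) can be sketched with the standard library alone. This is only an illustration of the technique under simplifying assumptions: it stores one document per line as "word_id:count" pairs, which is a hypothetical layout and not the UCI format, and the function names serialize_with_offsets, doc_by_offset and demo_offsets are invented here rather than taken from gensim.

```python
import os
import tempfile


def serialize_with_offsets(path, docs):
    """Write one document per line and return the byte offset of each."""
    offsets = []
    with open(path, "wb") as fout:
        for doc in docs:
            offsets.append(fout.tell())  # position where this document starts
            line = " ".join(f"{word_id}:{count}" for word_id, count in doc)
            fout.write((line + "\n").encode("utf-8"))
    return offsets


def doc_by_offset(path, offset):
    """Seek to a recorded byte offset and parse the document stored there."""
    with open(path, "rb") as fin:
        fin.seek(offset)
        line = fin.readline().decode("utf-8")
    return [tuple(int(x) for x in pair.split(":")) for pair in line.split()]


def demo_offsets():
    """Serialize three documents, then fetch each one back by its offset."""
    docs = [[(0, 2), (2, 1)], [], [(1, 3)]]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "corpus.sample.txt")  # hypothetical name
        offsets = serialize_with_offsets(path, docs)
        fetched = [doc_by_offset(path, off) for off in offsets]
    return offsets, fetched
```

Keeping the offsets in a separate index is what allows random access to a single document without scanning the file from the beginning.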