gensim.corpora.LowCorpus.__init__

LowCorpus.__init__(fname, id2word=None, line2words=<function split_on_space>)

Initialize the corpus from a file.

Both id2word and line2words are optional. If provided, id2word is a dictionary that maps word_ids (integers) to words (strings). If it is not provided, the mapping is constructed from the documents in the file.
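To illustrate what "constructed from the documents" can mean, here is a minimal pure-Python sketch. It is not gensim's actual implementation: `build_id2word` is a hypothetical helper, and assigning ids in sorted vocabulary order is an assumption made so the result is deterministic.

```python
def build_id2word(lines, line2words=str.split):
    """Hypothetical helper: collect every distinct token and number it."""
    vocabulary = set()
    for line in lines:
        vocabulary.update(line2words(line))
    # Assumption: ids follow sorted word order, so repeated runs agree.
    return dict(enumerate(sorted(vocabulary)))


# Two toy documents, one per line, with words separated by whitespace.
docs = ["human interface computer", "survey user computer system"]
id2word = build_id2word(docs)
```

The result has the shape described above: integer keys and string values. Passing a ready-made dictionary as id2word skips this vocabulary pass and fixes the ids in advance.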

line2words is a function that converts each line into a list of tokens. It defaults to simple splitting on spaces.
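A custom tokenizer can replace the default when splitting on spaces is too coarse. The sketch below is illustrative only: `lowercase_alnum_tokens` and `load_corpus` are hypothetical names, and the bytes check is an assumption, since whether a line arrives as bytes or text may depend on the gensim version.

```python
import re


def lowercase_alnum_tokens(line):
    """Hypothetical tokenizer: lowercase the line and keep alphanumeric runs."""
    # Assumption: the raw line may be bytes, so decode before tokenizing.
    if isinstance(line, bytes):
        line = line.decode("utf-8")
    return re.findall(r"[a-z0-9]+", line.lower())


def load_corpus(fname):
    """Build a LowCorpus that tokenizes with the function above."""
    # Imported lazily so the tokenizer can be tried without gensim installed.
    from gensim.corpora import LowCorpus
    return LowCorpus(fname, line2words=lowercase_alnum_tokens)


tokens = lowercase_alnum_tokens("Human-Computer  Interface!")
```

Here the tokenizer also strips punctuation and ignores repeated spaces, which plain splitting on spaces would not do.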