gensim.corpora.Dictionary.from_corpus

static Dictionary.from_corpus(corpus, id2word=None)

Create a Dictionary from an existing corpus. This is useful when you only have a term-document bag-of-words (BOW) matrix (represented by corpus) but not the original text corpus.

This scans the term-document count matrix for all word ids that appear in it, then constructs and returns a Dictionary that maps each word_id -> id2word[word_id].
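
The scan described above can be illustrated in plain Python. This is only a sketch of the idea, not gensim's implementation; the function name sketch_id_to_token is hypothetical, and the corpus is assumed to be an iterable of documents, each a list of (word_id, count) pairs.

```python
def sketch_id_to_token(corpus, id2word):
    """Hypothetical helper: collect every word id seen in a BOW corpus
    and look up its token in the supplied id2word mapping."""
    seen_ids = set()
    for document in corpus:
        for word_id, _count in document:
            seen_ids.add(word_id)
    # Map each observed word_id -> id2word[word_id].
    return {word_id: id2word[word_id] for word_id in sorted(seen_ids)}


# Two documents in bag-of-words form: (word_id, count) pairs.
example_corpus = [[(0, 2), (1, 1)], [(1, 3), (2, 1)]]
example_id2word = {0: "apple", 1: "banana", 2: "cherry"}
mapping = sketch_id_to_token(example_corpus, example_id2word)
```

The counts are ignored here because only the set of word ids matters for building the mapping.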

id2word is an optional dictionary that maps each word_id to a token. If id2word is not specified, the mapping id2word[word_id] = str(word_id) is used.