gensim.models.VocabTransform

class gensim.models.VocabTransform(old2new, id2token=None)

Remap feature ids to new values.

Given a mapping from old ids to new ids (old ids may be missing from the mapping), this wraps a corpus so that iterating over vt[corpus], where vt is a VocabTransform instance, returns the same vectors but with the new ids.

Because old features that have no counterpart in the new ids are discarded, this can be used to filter the vocabulary of a corpus “online”:

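>>> from gensim.models import VocabTransform
>>> # new ids are assigned in the order the kept old ids are listed
>>> old2new = {oldid: newid for newid, oldid in enumerate(ids_you_want_to_keep)}
>>> vt = VocabTransform(old2new)
>>> for vec_with_new_ids in vt[corpus_with_old_ids]:
...     pass  # process each remapped vector here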
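
To make the remapping concrete, the following is a minimal plain-Python sketch of the behaviour described above. It is not gensim's implementation: the helper names remap_bow and remap_corpus and the toy data are hypothetical, and sorting each output vector by new id is an assumption made here so the result is deterministic.

```python
def remap_bow(bow, old2new):
    """Translate the ids of one sparse vector; ids missing from old2new are dropped."""
    # Sorting by new id is an assumption of this sketch, not something the text above states.
    return sorted((old2new[old_id], weight) for old_id, weight in bow if old_id in old2new)


def remap_corpus(corpus, old2new):
    """Lazily remap each document, so the corpus is filtered one document at a time."""
    for bow in corpus:
        yield remap_bow(bow, old2new)


# Hypothetical toy data: keep old ids 5 and 2, which become new ids 0 and 1.
ids_you_want_to_keep = [5, 2]
old2new = {old_id: new_id for new_id, old_id in enumerate(ids_you_want_to_keep)}

corpus_with_old_ids = [
    [(2, 1.0), (3, 4.0), (5, 2.0)],  # old id 3 has no new id and is discarded
    [(0, 1.0), (3, 1.0)],            # nothing survives, so the vector becomes empty
]

remapped = list(remap_corpus(corpus_with_old_ids, old2new))
print(remapped)  # [[(0, 2.0), (1, 1.0)], []]
```

Because remap_corpus is a generator, no document is touched until it is iterated over, which mirrors the “online” filtering mentioned above.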

Methods

__init__(old2new[, id2token])
load(fname[, mmap])                         Load a previously saved object from file (also see save).
save(fname_or_handle[, separately, ...])    Save the object to file (also see load).