gensim.models.VocabTransform
class gensim.models.VocabTransform(old2new, id2token=None)
Remap feature ids to new values.
Given a mapping from old ids to new ids, this wraps a corpus so that iterating over VocabTransform[corpus] returns the same vectors, but with the new ids. Old ids that are missing from the mapping have no counterpart among the new ids, so those features are discarded.
This can be used to filter the vocabulary of a corpus “online”, as the corpus is iterated:
>>> from gensim.models import VocabTransform
>>> old2new = dict((oldid, newid) for newid, oldid in enumerate(ids_you_want_to_keep))
>>> vt = VocabTransform(old2new)
>>> for vec_with_new_ids in vt[corpus_with_old_ids]:
...     ...  # process each remapped vector here
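To make the remapping concrete, the following is a minimal pure-Python sketch of what happens to each bag-of-words vector. It does not call gensim and is not the library's implementation: `remap_vector` is a hypothetical helper, the sample ids and weights are made up for illustration, and sorting the output by new id is an assumption of this sketch.

```python
def remap_vector(bow, old2new):
    """Map (old_id, weight) pairs to (new_id, weight) pairs.

    Pairs whose old id is missing from old2new are dropped.
    Sorting by new id is a choice made in this sketch.
    """
    return sorted(
        (old2new[old_id], weight)
        for old_id, weight in bow
        if old_id in old2new
    )


# Hypothetical vocabulary filter: keep only old ids 2, 5 and 7.
ids_you_want_to_keep = [2, 5, 7]
old2new = dict((oldid, newid) for newid, oldid in enumerate(ids_you_want_to_keep))

# Made-up corpus: each document is a list of (old_id, weight) pairs.
corpus_with_old_ids = [
    [(0, 1.0), (2, 3.0), (5, 1.0)],  # id 0 is not kept
    [(7, 2.0), (9, 4.0)],            # id 9 is not kept
]

corpus_with_new_ids = [remap_vector(vec, old2new) for vec in corpus_with_old_ids]
print(corpus_with_new_ids)
```

Here old ids 2, 5 and 7 become new ids 0, 1 and 2, while the entries for old ids 0 and 9 disappear from the output because they have no entry in `old2new`.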