gensim.utils.RepeatCorpus.__init__

RepeatCorpus.__init__(corpus, reps)

Wrap a corpus as another corpus of length reps. The documents of corpus are repeated in order, starting over from the first document each time the corpus is exhausted, until the requested length len(result) == reps is reached. The repetition is done on the fly, and therefore efficiently, via itertools.

>>> from gensim.utils import RepeatCorpus
>>> corpus = [[(1, 0.5)], []] # 2 documents
>>> list(RepeatCorpus(corpus, 5)) # repeat 2.5 times to get 5 documents
[[(1, 0.5)], [], [(1, 0.5)], [], [(1, 0.5)]]
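
The on-the-fly repetition described above can be sketched with plain itertools. This is a minimal illustration and not necessarily how RepeatCorpus is implemented; the helper name repeat_documents is hypothetical and is not part of gensim.

```python
import itertools


def repeat_documents(corpus, reps):
    """Hypothetical helper: lazily yield exactly `reps` documents from `corpus`."""
    # cycle() starts over from the first document whenever the corpus is exhausted;
    # islice() stops the otherwise endless cycle after `reps` documents.
    return itertools.islice(itertools.cycle(corpus), reps)


corpus = [[(1, 0.5)], []]  # 2 documents
result = list(repeat_documents(corpus, 5))  # 2.5 passes over the corpus
print(result)
```

Nothing is produced until the result is iterated. Note that itertools.cycle keeps its own copy of each item it has seen, so a sketch like this holds one pass of the corpus in memory.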