gensim.corpora.ShardedCorpus.resize_shards

ShardedCorpus.resize_shards(shardsize)

Re-process the dataset to use a new shard size. This may take a long time. It also requires free disk space: the method assumes there is enough disk space for double the size of the dataset, and enough memory for the old and new shard sizes combined.

Parameters: shardsize (int) – The new shard size.
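
The idea of re-sharding can be illustrated with a minimal standard-library sketch. This is not gensim's implementation: `reshard` is a hypothetical helper, plain integers stand in for documents, and shards are ordinary lists rather than files on disk. It assumes the memory note above means that one old shard and one partially filled new shard are held at the same time.

```python
from typing import Iterable, Iterator, List


def reshard(shards: Iterable[List[int]], new_shardsize: int) -> Iterator[List[int]]:
    """Re-chunk `shards` into lists of `new_shardsize` items (the last may be shorter).

    Hypothetical helper for illustration only; not part of gensim.
    """
    if new_shardsize <= 0:
        raise ValueError("new_shardsize must be a positive integer")
    buffer: List[int] = []  # the new shard currently being filled
    for old_shard in shards:  # only one old shard is consumed at a time
        for doc in old_shard:
            buffer.append(doc)
            if len(buffer) == new_shardsize:
                yield buffer
                buffer = []
    if buffer:  # trailing, partially filled shard
        yield buffer


# Ten stand-in documents stored with shard size 4, re-processed to shard size 3.
old_shards = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
new_shards = list(reshard(old_shards, 3))
print(new_shards)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

The document order is preserved and only the shard boundaries move. In the real method the new shards are written to disk while the old ones still exist, which is consistent with the requirement for roughly double the dataset size in free disk space.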