nltk.LazyZip
class nltk.LazyZip(*lists)
A lazy sequence whose elements are tuples, each containing the i-th element from each of the argument sequences. The returned list is truncated to the length of the shortest argument sequence. The tuples are constructed lazily: when you read a value from the list, LazyZip calculates that value by forming a tuple from the i-th element of each of the argument sequences.
LazyZip is essentially a lazy version of the Python primitive function zip. In particular, an evaluated LazyZip is equivalent to an evaluated zip:
>>> from nltk.util import LazyZip
>>> sequence1, sequence2 = [1, 2, 3], ['a', 'b', 'c']
>>> list(zip(sequence1, sequence2))
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> list(LazyZip(sequence1, sequence2))
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> sequences = [sequence1, sequence2, [6,7,8,9]]
>>> list(zip(*sequences)) == list(LazyZip(*sequences))
True
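The lazy construction described above can be illustrated with a small pure-Python sketch. The class name TinyLazyZip is hypothetical and this is not NLTK's implementation; it only shows the idea of storing references to the argument sequences and building each tuple when it is read.

```python
class TinyLazyZip:
    """Hypothetical illustration only; not NLTK's LazyZip implementation."""

    def __init__(self, *sequences):
        # Keep references to the argument sequences; no tuples are built yet.
        self._sequences = sequences

    def __len__(self):
        # Truncate to the shortest argument sequence, as zip does.
        return min((len(s) for s in self._sequences), default=0)

    def __getitem__(self, i):
        if not isinstance(i, int):
            raise TypeError("this sketch supports integer indices only")
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        # The tuple is formed here, at read time, and is not stored.
        return tuple(s[i] for s in self._sequences)


numbers = [1, 2, 3, 4]
letters = ["a", "b", "c"]
lazy = TinyLazyZip(numbers, letters)

print(len(lazy))   # length of the shortest argument sequence
print(lazy[1])     # tuple built on demand from the elements at index 1
print(list(lazy) == list(zip(numbers, letters)))
```

Because only the argument sequences are stored, the extra memory used by the sketch does not grow with the number of elements.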
Lazy zips can be useful for conserving memory in cases where the argument sequences are particularly long.
A typical example of a use case for this class is combining long sequences of gold standard and predicted values in a classification or tagging task in order to calculate accuracy. By constructing tuples lazily and avoiding the creation of an additional long sequence, memory usage can be significantly reduced.
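The accuracy use case might look roughly like the following sketch. The function name accuracy and the sample tags are made up for illustration, and the fallback to the builtin zip is an assumption added only so the example still runs where NLTK is not installed.

```python
try:
    from nltk.util import LazyZip as pair_up
except ImportError:
    # Assumption: fall back to the builtin zip so the sketch runs without NLTK.
    pair_up = zip


def accuracy(gold, predicted):
    """Return the fraction of positions where gold and predicted agree."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if len(gold) == 0:
        raise ValueError("cannot compute accuracy of empty sequences")
    # Pairs are formed one at a time while iterating; no list of pairs is built.
    correct = sum(1 for g, p in pair_up(gold, predicted) if g == p)
    return correct / len(gold)


# Hypothetical gold-standard and predicted part-of-speech tags.
gold_tags = ["DT", "NN", "VBZ", "DT", "NN"]
predicted_tags = ["DT", "NN", "VBZ", "DT", "JJ"]

print(accuracy(gold_tags, predicted_tags))  # 4 of 5 tags match
```

With real corpora, gold and predicted would typically be long sequences, which is where avoiding a materialized list of pairs matters.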
Methods
__init__(*lists) |
count(value) | Return the number of times this list contains value.
index(value[, start, stop]) | Return the index of the first occurrence of value in this list that is greater than or equal to start and less than stop.
iterate_from(index) |
unicode_repr() | Return a string representation for this corpus view that is similar to a list’s representation; but if it would be more than 60 characters long, it is truncated.