nltk.AnnotationTask

class nltk.AnnotationTask(data=None, distance=<function binary_distance>)[source]

Represents an annotation task, i.e. a setting in which people assign labels to items.

The notation attempts to match that of Artstein and Poesio (2007).

In general, coders and items can be represented as any hashable object. Integers, for example, work fine, though strings are more readable. Labels must support the distance function applied to them: a string edit distance makes no sense if the labels are integers, whereas interval distance requires numeric values. A notable case is the MASI metric, which requires Python sets.
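As a minimal sketch (with hypothetical coder, item, and label names), a task is built from (coder, item, label) triples, and agreement coefficients are then available as methods:

```python
from nltk.metrics.agreement import AnnotationTask

# Two coders label four items; they disagree only on item "i2".
data = [
    ("c1", "i1", "a"), ("c1", "i2", "a"), ("c1", "i3", "b"), ("c1", "i4", "b"),
    ("c2", "i1", "a"), ("c2", "i2", "b"), ("c2", "i3", "b"), ("c2", "i4", "b"),
]

task = AnnotationTask(data=data)  # default distance is binary_distance

print(task.avg_Ao())  # observed agreement: 3 of 4 items -> 0.75
print(task.kappa())   # Cohen's kappa: (0.75 - 0.5) / (1 - 0.5) -> 0.5
```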

Methods

Ae_kappa(cA, cB) Expected agreement for Cohen's kappa between two coders.
Ao(cA, cB) Observed agreement between two coders on all items.
Do_Kw([max_distance]) Observed disagreement for the weighted kappa coefficient, averaged over all coder pairs.
Do_Kw_pairwise(cA, cB[, max_distance]) The observed disagreement for the weighted kappa coefficient.
Do_alpha() The observed disagreement for the alpha coefficient.
N(*args, **kwargs) Implements the "n-notation" used in Artstein and Poesio (2007).
Nck(c, k) Number of items to which coder c assigned label k.
Nik(i, k) Number of coders who assigned label k to item i.
Nk(k) Total number of assignments of label k.
S() Bennett, Albert and Goldstein 1954.
__init__([data, distance]) Initialize an empty annotation task.
agr(cA, cB, i[, data]) Agreement between two coders on a given item.
alpha() Krippendorff 1980.
avg_Ao() Average observed agreement across all coders and items.
kappa() Cohen 1960; averages naively over kappas for each coder pair.
kappa_pairwise(cA, cB) Cohen's kappa for a single pair of coders.
load_array(array) Load the results of annotation.
multi_kappa() Davies and Fleiss 1982; averages over observed and expected agreements for each coder pair.
pi() Scott 1955; here, multi-pi.
weighted_kappa([max_distance]) Cohen 1968.
weighted_kappa_pairwise(cA, cB[, max_distance]) Cohen 1968; weighted kappa for a single pair of coders.
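To illustrate a non-binary distance, the sketch below (with hypothetical data) passes `masi_distance` as the task's distance function; since MASI compares sets, each label is a frozenset of categories. With identical annotations, Krippendorff's alpha is 1.0:

```python
from nltk.metrics.agreement import AnnotationTask
from nltk.metrics.distance import masi_distance

# Two coders assign set-valued labels to three items, in perfect agreement.
data = [
    ("c1", "i1", frozenset(["stats"])),
    ("c1", "i2", frozenset(["stats", "tech"])),
    ("c1", "i3", frozenset(["tech"])),
    ("c2", "i1", frozenset(["stats"])),
    ("c2", "i2", frozenset(["stats", "tech"])),
    ("c2", "i3", frozenset(["tech"])),
]

task = AnnotationTask(data=data, distance=masi_distance)
print(task.alpha())  # perfect agreement -> 1.0
```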

Attributes

unicode_repr() <==> repr(x)