nltk.tag.BrillTagger.batch_tag_incremental

BrillTagger.batch_tag_incremental(sequences, gold)

Tags the sequences by applying each rule in turn to the entire corpus, rather than applying all rules to one sequence at a time. The point is to collect statistics on the test set for individual rules.

NOTE: This is inefficient: it does not build any index, so it traverses the entire corpus N times for N rules. Usually you do not need statistics for individual rules and should use batch_tag() instead.

Parameters:
	sequences (list of list of strings) – lists of token sequences (sentences, in some applications) to be tagged
	gold (list of list of strings) – the gold standard
Returns:	tuple of (tagged_sequences, ordered list of rule scores (one for each rule))
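
The rule-by-rule pass described above can be illustrated with a minimal, library-independent sketch. It does not use NLTK and is not the library's implementation: the function `tag_incremental`, the three toy rules, and the choice to represent `gold` as lists of tag strings are all assumptions made for illustration. Each rule is scored by how many errors it removes from the whole corpus, so a harmful rule gets a negative score.

```python
def tag_incremental(sequences, gold, initial_tag, rules):
    """Hypothetical sketch: apply each rule to the whole corpus in turn.

    sequences: list of list of token strings
    gold: list of list of gold tag strings (an assumption of this sketch)
    rules: callables that retag a list of (token, tag) pairs in place
    """
    def count_errors(tagged):
        # Count tokens whose current tag differs from the gold tag.
        return sum(
            tag != gold_tag
            for tagged_seq, gold_seq in zip(tagged, gold)
            for (_, tag), gold_tag in zip(tagged_seq, gold_seq)
        )

    # Start from a trivial initial tagging.
    tagged = [[(tok, initial_tag) for tok in seq] for seq in sequences]
    errors = [count_errors(tagged)]

    # One full corpus traversal per rule: N rules cost N passes.
    for rule in rules:
        for seq in tagged:
            rule(seq)
        errors.append(count_errors(tagged))

    # Score of a rule = errors before it ran minus errors after it ran.
    scores = [before - after for before, after in zip(errors, errors[1:])]
    return tagged, scores


def rule_determiners(seq):
    # Toy rule: retag known determiners as DT.
    for i, (tok, _) in enumerate(seq):
        if tok in {"the", "a"}:
            seq[i] = (tok, "DT")


def rule_s_suffix(seq):
    # Toy rule: retag tokens ending in "s" as VBZ.
    for i, (tok, _) in enumerate(seq):
        if tok.endswith("s"):
            seq[i] = (tok, "VBZ")


def rule_after_dt(seq):
    # Toy rule that hurts on this data: retag the token after a DT as JJ.
    for i in range(1, len(seq)):
        if seq[i - 1][1] == "DT":
            seq[i] = (seq[i][0], "JJ")


sequences = [["the", "dog", "barks"], ["a", "cat", "sleeps"]]
gold = [["DT", "NN", "VBZ"], ["DT", "NN", "VBZ"]]
tagged, scores = tag_incremental(
    sequences, gold, "NN", [rule_determiners, rule_s_suffix, rule_after_dt]
)
print(scores)  # [2, 2, -2]
```

Because every rule triggers a full pass over the corpus, the cost grows linearly with the number of rules, which is the inefficiency the note above warns about.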