nltk.chunk.RegexpChunkParser

class nltk.chunk.RegexpChunkParser(rules, chunk_label=u'NP', root_label=u'S', trace=0)

A regular expression based chunk parser. RegexpChunkParser uses a sequence of “rules” to find chunks of a single type within a text. The chunking of the text is encoded using a ChunkString, and each rule acts by modifying the chunking in the ChunkString. The rules are all implemented using regular expression matching and substitution.
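The idea of encoding a chunking as a string and modifying it by regular expression substitution can be sketched with the standard library alone. This is a simplified illustration, not NLTK's implementation: the helper names encode_tags and apply_chunk_rule, the sample sentence, and the tag regex are all assumptions made for the example. Tags are written as <TAG> and chunks are marked with braces.

```python
import re

def encode_tags(tagged_tokens):
    """Encode the tag sequence as one string, e.g. '<DT><NN><VBD>'."""
    return "".join("<{}>".format(tag) for _, tag in tagged_tokens)

def apply_chunk_rule(chunk_string, tag_regex):
    """Wrap each match of tag_regex that is not already chunked in braces."""
    # The lookahead requires that the next brace after the match (if any)
    # is an opening one, so the rule never fires inside an existing chunk.
    pattern = re.compile(r"(?P<chunk>%s)(?=[^}]*(?:\{|$))" % tag_regex)
    return pattern.sub(r"{\g<chunk>}", chunk_string)

# Hypothetical tagged sentence used only for illustration.
tagged = [("the", "DT"), ("little", "JJ"), ("cat", "NN"),
          ("sat", "VBD"), ("on", "IN"), ("the", "DT"), ("mat", "NN")]

encoded = encode_tags(tagged)
# Optional determiner, any number of adjectives, then a noun tag.
chunked = apply_chunk_rule(encoded, r"(?:<DT>)?(?:<JJ>)*<NN[^<>{}]*>")
print(chunked)  # {<DT><JJ><NN>}<VBD><IN>{<DT><NN>}

# A second rule leaves tags that are already inside a chunk untouched.
rechunked = apply_chunk_rule(chunked, r"<NN>")
```

Each rule is one substitution over the encoded string, so a sequence of rules amounts to a sequence of substitutions applied in order.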

The RegexpChunkRule class and its subclasses (ChunkRule, ChinkRule, UnChunkRule, MergeRule, and SplitRule) define the rules that are used by RegexpChunkParser. Each rule defines an apply() method, which modifies the chunking encoded by a given ChunkString.
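A minimal usage sketch, assuming NLTK 3 is installed and that ChunkRule is imported from nltk.chunk.regexp and takes a tag pattern plus a description string. The tag pattern and the sample sentence are illustrative choices, not part of this reference.

```python
from nltk.chunk.regexp import ChunkRule, RegexpChunkParser
from nltk.tree import Tree

# One rule: optional determiner, any adjectives, then any noun tag.
np_rule = ChunkRule(r"<DT>?<JJ>*<NN.*>", "Chunk determiner, adjectives and noun")

parser = RegexpChunkParser([np_rule], chunk_label="NP", root_label="S")

# parse() expects a Tree whose leaves are (word, tag) pairs.
tagged = [("the", "DT"), ("little", "JJ"), ("cat", "NN"),
          ("sat", "VBD"), ("on", "IN"), ("the", "DT"), ("mat", "NN")]
tree = parser.parse(Tree("S", tagged))

# Collect the leaves of every NP chunk the rule produced.
noun_phrases = [st.leaves() for st in tree.subtrees(lambda t: t.label() == "NP")]
print(noun_phrases)
```

Additional rules in the list are applied in order to the same ChunkString, so a later rule sees the chunks created by earlier ones.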

Variables:
  • _rules – The list of rules that should be applied to a text.
  • _trace – The default level of tracing.

Methods

__init__(rules[, chunk_label, root_label, trace]) Construct a new RegexpChunkParser.
evaluate(gold) Score the accuracy of the chunker against the gold standard.
grammar() Return the grammar used by this parser.
parse(chunk_struct[, trace]) Parameter type: chunk_struct is a Tree.
parse_all(sent, *args, **kwargs) Return type: list(Tree).
parse_one(sent, *args, **kwargs) Return type: Tree or None.
parse_sents(sents, *args, **kwargs) Apply self.parse() to each element of sents.
rules() Return the sequence of rules used by this RegexpChunkParser.
unicode_repr() Return a concise string representation of this parser.