nltk.parse.RecursiveDescentParser

class nltk.parse.RecursiveDescentParser(grammar, trace=0)[source]

A simple top-down CFG parser that parses texts by recursively expanding the fringe of a Tree, and matching it against a text.

RecursiveDescentParser uses a list of tree locations called a “frontier” to remember which subtrees have not yet been expanded and which leaves have not yet been matched against the text. Each tree location consists of a list of child indices specifying the path from the root of the tree to a subtree or a leaf; see the reference documentation for Tree for more information about tree locations.

When the parser begins parsing a text, it constructs a tree containing only the start symbol, and a frontier containing the location of the tree’s root node. It then extends the tree to cover the text, using the following recursive procedure:

  • If the frontier is empty, and the text is covered by the tree, then return the tree as a possible parse.
  • If the frontier is empty, and the text is not covered by the tree, then return no parses.
  • If the first element of the frontier is a subtree, then use CFG productions to “expand” it. For each applicable production, add the expanded subtree’s children to the frontier, and recursively find all parses that can be generated by the new tree and frontier.
  • If the first element of the frontier is a token, then “match” it against the next token from the text. Remove the token from the frontier, and recursively find all parses that can be generated by the new tree and frontier.
See:nltk.grammar

Methods

__init__(grammar[, trace]) Create a new RecursiveDescentParser, that uses grammar to parse texts.
grammar()
parse(tokens)
parse_all(sent, *args, **kwargs)
rtype:list(Tree)
parse_one(sent, *args, **kwargs)
rtype:Tree or None
parse_sents(sents, *args, **kwargs) Apply self.parse() to each element of sents.
trace([trace]) Set the level of tracing output that should be generated when parsing a text.