A simple top-down CFG parser that parses texts by recursively
expanding the fringe of a Tree, and matching it against a
text.
RecursiveDescentParser
uses a list of tree locations called a
“frontier” to remember which subtrees have not yet been expanded
and which leaves have not yet been matched against the text. Each
tree location consists of a list of child indices specifying the
path from the root of the tree to a subtree or a leaf; see the
reference documentation for Tree for more information
about tree locations.
When the parser begins parsing a text, it constructs a tree
containing only the start symbol, and a frontier containing the
location of the tree’s root node. It then extends the tree to
cover the text, using the following recursive procedure:
- If the frontier is empty, and the text is covered by the tree,
then return the tree as a possible parse.
- If the frontier is empty, and the text is not covered by the
tree, then return no parses.
- If the first element of the frontier is a subtree, then
use CFG productions to “expand” it. For each applicable
production, add the expanded subtree’s children to the
frontier, and recursively find all parses that can be
generated by the new tree and frontier.
- If the first element of the frontier is a token, then “match”
it against the next token from the text. Remove the token
from the frontier, and recursively find all parses that can be
generated by the new tree and frontier.