nltk.parse

NLTK Parsers

Classes and interfaces for producing tree structures that represent the internal organization of a text. This task is known as “parsing” the text, and the resulting tree structures are called the text’s “parses”. Typically, the text is a single sentence, and the tree structure represents the syntactic structure of the sentence. However, parsers can also be used in other domains. For example, parsers can be used to derive the morphological structure of the morphemes that make up a word, or to derive the discourse structure for a set of utterances.

Sometimes a single piece of text can be represented by more than one tree structure. Such texts are called “ambiguous”. A text can be ambiguous in two ways:

  • The text has multiple correct parses.
  • There is not enough information to decide which of several candidate parses is correct.

However, the parser module does not distinguish these two types of ambiguity.
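As a minimal sketch of what ambiguity looks like in practice, the example below parses a sentence with a classic prepositional-phrase attachment ambiguity. The toy grammar and the sentence are assumptions made up for illustration; they are not part of NLTK. The parser simply returns every tree licensed by the grammar, without saying which kind of ambiguity is involved.

```python
from nltk import CFG
from nltk.parse import ChartParser

# Toy grammar (hypothetical, for illustration only). The PP "with a telescope"
# can attach either to the noun phrase or to the verb phrase.
grammar = CFG.fromstring("""
    S -> NP VP
    NP -> 'I' | Det N | Det N PP
    VP -> V NP | VP PP
    PP -> P NP
    Det -> 'the' | 'a'
    N -> 'man' | 'telescope'
    V -> 'saw'
    P -> 'with'
""")

tokens = "I saw the man with a telescope".split()

# parse() yields one Tree per analysis allowed by the grammar.
parses = list(ChartParser(grammar).parse(tokens))
for tree in parses:
    print(tree)
```

With this grammar the sentence receives two trees, one for each attachment site.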

The parser module defines ParserI, a standard interface for parsing texts, and two simple implementations of that interface, ShiftReduceParser and RecursiveDescentParser. It also contains sub-modules for specialized kinds of parsing, including:

  • nltk.parse.chart defines chart parsing, which uses dynamic programming to efficiently parse texts.
  • nltk.parse.pchart defines probabilistic chart parsing, which associates a probability with each parse.
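Because the simple parsers and the chart parser all implement ParserI, they can be swapped freely behind the same parse() call. The sketch below illustrates this with a small grammar that is an assumption invented for the example; it avoids left recursion, which RecursiveDescentParser cannot handle.

```python
from nltk import CFG
from nltk.parse import ChartParser, RecursiveDescentParser, ShiftReduceParser

# Hypothetical toy grammar, deliberately free of left recursion so that
# the top-down RecursiveDescentParser terminates.
grammar = CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the'
    N -> 'dog' | 'cat'
    V -> 'chased'
""")

tokens = "the dog chased the cat".split()

# Each class is constructed from a grammar and exposes parse(tokens),
# which returns an iterator over Tree objects.
results = {}
for parser_cls in (RecursiveDescentParser, ShiftReduceParser, ChartParser):
    parser = parser_cls(grammar)
    results[parser_cls.__name__] = list(parser.parse(tokens))

for name, trees in results.items():
    print(name, len(trees))
```

Note that ShiftReduceParser finds at most one parse and does not backtrack, so on other grammars it may fail to find a parse that the other two parsers do find.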

Functions

extract_test_sentences(string[, ...]) Parses a string with one test sentence per line.
load_parser(grammar_url[, trace, parser, ...]) Load a grammar from a file, and build a parser based on that grammar.
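The following sketch shows how extract_test_sentences might be used. It assumes the line format described in the function's docstring: an optional boolean (grammatical or not) or integer (expected number of parses), followed by a colon and the sentence, with comment lines skipped. The test suite string itself is made up for illustration.

```python
from nltk.parse import extract_test_sentences

# Hypothetical test suite: one sentence per line, with an optional
# "expected result" prefix before the colon.
suite = """\
# comment lines are skipped
True: the dog barks
False: dog the barks
2: I saw the man with a telescope
the cat sleeps
"""

# Each entry is a (tokens, expected) pair; expected is a bool, an int,
# or None when the line has no prefix.
sentences = extract_test_sentences(suite)
for tokens, expected in sentences:
    print(expected, tokens)
```

A parser returned by load_parser could then be run over each token list and its output compared against the expected value.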

Classes

BllipParser([parser_model, ...]) Interface for parsing with BLLIP Parser.
BottomUpChartParser(grammar, **parser_args) A ChartParser using a bottom-up parsing strategy.
BottomUpLeftCornerChartParser(grammar, ...) A ChartParser using a bottom-up left-corner parsing strategy.
BottomUpProbabilisticChartParser(grammar[, ...]) An abstract bottom-up parser for PCFG grammars that uses a Chart to record partial results.
ChartParser(grammar[, strategy, trace, ...]) A generic chart parser.
DependencyEvaluator(parsed_sents, gold_sents) Class for measuring labelled and unlabelled attachment score for dependency parsing.
DependencyGraph([tree_str, cell_extractor, ...]) A container for the nodes and labelled edges of a dependency structure.
EarleyChartParser(grammar, **parser_args)
FeatureBottomUpChartParser(grammar, ...)
FeatureBottomUpLeftCornerChartParser(...)
FeatureChartParser(grammar[, strategy, ...])
FeatureEarleyChartParser(grammar, **parser_args)
FeatureIncrementalBottomUpChartParser(...)
FeatureIncrementalBottomUpLeftCornerChartParser(...)
FeatureIncrementalChartParser(grammar[, ...])
FeatureIncrementalTopDownChartParser(...)
FeatureTopDownChartParser(grammar, **parser_args)
IncrementalBottomUpChartParser(grammar, ...)
IncrementalBottomUpLeftCornerChartParser(...)
IncrementalChartParser(grammar[, strategy, ...]) An incremental chart parser implementing Jay Earley’s parsing algorithm.
IncrementalLeftCornerChartParser(grammar, ...)
IncrementalTopDownChartParser(grammar, ...)
InsideChartParser(grammar[, beam_size, trace]) A bottom-up parser for PCFG grammars that tries edges in descending order of the inside probabilities of their trees.
LeftCornerChartParser(grammar, **parser_args)
LongestChartParser(grammar[, beam_size, trace]) A bottom-up parser for PCFG grammars that tries longer edges before shorter ones.
MaltParser(parser_dirname[, model_filename, ...]) A class for dependency parsing with MaltParser.
NaiveBayesDependencyScorer() A dependency scorer built around a MaxEnt classifier.
NonprojectiveDependencyParser(dependency_grammar) A non-projective, rule-based, dependency parser.
ParserI A processing class for deriving trees that represent possible structures for a sequence of tokens.
ProbabilisticNonprojectiveParser() A probabilistic non-projective dependency parser.
ProbabilisticProjectiveDependencyParser() A probabilistic, projective dependency parser.
ProjectiveDependencyParser(dependency_grammar) A projective, rule-based, dependency parser.
RandomChartParser(grammar[, beam_size, trace]) A bottom-up parser for PCFG grammars that tries edges in random order.
RecursiveDescentParser(grammar[, trace]) A simple top-down CFG parser that parses texts by recursively expanding the fringe of a Tree, and matching it against a text.
ShiftReduceParser(grammar[, trace]) A simple bottom-up CFG parser that uses two operations, “shift” and “reduce”, to find a single parse for a text.
SteppingChartParser(grammar[, strategy, trace]) A ChartParser that allows you to step through the parsing process, adding a single edge at a time.
SteppingRecursiveDescentParser(grammar[, trace]) A RecursiveDescentParser that allows you to step through the parsing process, performing a single operation at a time.
SteppingShiftReduceParser(grammar[, trace]) A ShiftReduceParser that allows you to step through the parsing process, performing a single operation at a time.
TestGrammar(grammar, suite[, accept, reject]) Unit tests for CFG.
TopDownChartParser(grammar, **parser_args) A ChartParser using a top-down parsing strategy.
TransitionParser(algorithm) Class for transition based parser.
UnsortedChartParser(grammar[, beam_size, trace]) A bottom-up parser for PCFG grammars that tries edges in arbitrary order.
ViterbiParser(grammar[, trace]) A bottom-up PCFG parser that uses dynamic programming to find the single most likely parse for a text.
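
As an illustration of the probabilistic parsers listed above, the sketch below uses ViterbiParser to pick the most likely tree for an ambiguous sentence. The grammar and its rule probabilities are assumptions invented for this example; a real PCFG would normally be estimated from a treebank.

```python
from nltk import PCFG
from nltk.parse import ViterbiParser

# Hypothetical toy PCFG. The probabilities for each left-hand side sum to 1.
grammar = PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> 'I' [0.4] | Det N [0.4] | Det N PP [0.2]
    VP -> V NP [0.7] | VP PP [0.3]
    PP -> P NP [1.0]
    Det -> 'the' [0.5] | 'a' [0.5]
    N -> 'man' [0.5] | 'telescope' [0.5]
    V -> 'saw' [1.0]
    P -> 'with' [1.0]
""")

tokens = "I saw the man with a telescope".split()

# ViterbiParser yields only the single most probable tree.
best = next(ViterbiParser(grammar).parse(tokens))
print(best)
print(best.prob())
```

Under these made-up probabilities the noun-phrase attachment wins, because its derivation has a higher product of rule probabilities than the verb-phrase attachment. Changing the weights on the NP and VP rules can flip the result.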