class nltk.parse.ProbabilisticNonprojectiveParser

A probabilistic non-projective dependency parser.

Non-projective dependencies allow for “crossing branches” in the parse tree, which is necessary for representing particular linguistic phenomena, or even typical parses in some languages. This parser follows the MST parsing algorithm outlined in McDonald (2005), which likens the search for the best non-projective parse to finding the maximum spanning tree in a weighted directed graph.
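
To make the "maximum spanning tree" objective concrete, the following is a minimal brute-force sketch, not the parser's actual algorithm. It assumes the score matrix in the doctest that follows can be read as scores[head][dependent] with node 0 as the root, and flattens it to plain numbers. The names SCORES, reaches_root and best_tree_brute_force are hypothetical and are not part of NLTK.

```python
from itertools import product

NEG_INF = float("-inf")  # marks arcs that do not exist

# Hypothetical flat version of the example's scores: SCORES[head][dep].
# Node 0 is the artificial root; nodes 1-3 stand for 'v1', 'v2', 'v3'.
SCORES = [
    [NEG_INF, 5, 1, 1],
    [NEG_INF, NEG_INF, 11, 4],
    [NEG_INF, 10, NEG_INF, 5],
    [NEG_INF, 8, 8, NEG_INF],
]


def reaches_root(node, heads):
    """Follow head links from node; True if they end at the root (0)."""
    seen = set()
    while node != 0:
        if node in seen:  # walked in a circle, so this is not a tree
            return False
        seen.add(node)
        node = heads[node]
    return True


def best_tree_brute_force(scores):
    """Try every head assignment and keep the highest-scoring tree."""
    n = len(scores)
    best_heads, best_total = None, NEG_INF
    for choice in product(range(n), repeat=n - 1):
        heads = dict(enumerate(choice, start=1))  # dependent -> head
        total = sum(scores[h][d] for d, h in heads.items())
        if total > best_total and all(reaches_root(d, heads) for d in heads):
            best_heads, best_total = heads, total
    return best_heads, best_total


print(best_tree_brute_force(SCORES))  # ({1: 0, 2: 1, 3: 2}, 21)
```

Picking the best incoming arc for each node independently would give v1 the head v2 (score 10) and v2 the head v1 (score 11), which is a cycle rather than a tree. The best valid tree instead attaches v1 to the root, v2 to v1 and v3 to v2, for a total of 21. Enumeration is exponential in sentence length, which is why the MST algorithm handles cycles by collapsing them instead.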

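>>> from nltk.parse.nonprojectivedependencyparser import (
...     DependencyScorerI,
...     ProbabilisticNonprojectiveParser,
... )
>>> class Scorer(DependencyScorerI):
...     def train(self, graphs):
...         pass  # the scores below are hard-coded, so there is nothing to learn
...     def score(self, graph):
...         # scores[i][j] lists the scores of arcs from node i to node j;
...         # node 0 is the root, so nothing points into column 0.
...         return [
...             [[], [5],  [1],  [1]],
...             [[], [],   [11], [4]],
...             [[], [10], [],   [5]],
...             [[], [8],  [8],  []],
...         ]
>>> npp = ProbabilisticNonprojectiveParser()
>>> npp.train([], Scorer())
>>> parses = npp.parse(['v1', 'v2', 'v3'], [None, None, None])
>>> len(list(parses))
1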


__init__() Creates a new non-projective parser.
best_incoming_arc(node_index) Returns the source of the best incoming arc to the node with address node_index.
collapse_nodes(new_node, cycle_path, ...) Takes a list of nodes that have been identified as belonging to a cycle, and collapses them into one larger node.
compute_max_subtract_score(column_index, ...) When updating scores, the score of the highest-weighted incoming arc is subtracted upon collapse.
compute_original_indexes(new_indexes) As nodes are collapsed into others, they are replaced by the new node in the graph, but it’s still necessary to keep track of what these original nodes were.
initialize_edge_scores(graph) Assigns a score to every edge in the DependencyGraph graph.
parse(tokens, tags) Parses a list of tokens in accordance with the MST parsing algorithm for non-projective dependency parses.
train(graphs, dependency_scorer) Trains a DependencyScorerI from a set of DependencyGraph objects, and establishes this as the parser’s scorer.
update_edge_scores(new_node, cycle_path) Updates the edge scores to reflect a collapse operation into new_node.
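
The methods above correspond to the steps of a cycle-collapsing maximum spanning tree search: pick the best incoming arc per node, collapse any cycle into one node, subtract the cycle's own arc score from arcs entering it, and repeat. The following is a minimal, self-contained sketch of that general idea (the Chu-Liu/Edmonds procedure), written independently of NLTK. It is not the library's implementation; find_cycle, max_arborescence and ARCS are hypothetical names, and the sketch assumes integer node ids, no self-loops, and at least one incoming arc for every non-root node.

```python
def find_cycle(heads):
    """Return the nodes of a cycle in a dependent -> head map, or None."""
    for start in heads:
        path, node = [], start
        while node in heads and node not in path:
            path.append(node)
            node = heads[node]
        if node in path:  # came back to a node already on this walk
            return path[path.index(node):]
    return None


def max_arborescence(scores, root=0):
    """scores maps (head, dep) -> weight; returns a dependent -> head map."""
    nodes = {h for h, _ in scores} | {d for _, d in scores}
    # Step 1: greedily take the best incoming arc for every non-root node.
    heads = {}
    for dep in nodes - {root}:
        candidates = [(w, h) for (h, d), w in scores.items() if d == dep]
        heads[dep] = max(candidates)[1]
    cycle = find_cycle(heads)
    if cycle is None:
        return heads
    # Step 2: collapse the cycle into one new node with rescored arcs.
    in_cycle, new = set(cycle), max(nodes) + 1
    cycle_score = {d: scores[(heads[d], d)] for d in cycle}
    contracted, origin = {}, {}
    for (h, d), w in scores.items():
        if h in in_cycle and d in in_cycle:
            continue  # arcs inside the cycle disappear
        if d in in_cycle:  # entering arc: subtract the arc it would replace
            key, adjusted = (h, new), w - cycle_score[d]
        elif h in in_cycle:  # leaving arc: weight unchanged
            key, adjusted = (new, d), w
        else:
            key, adjusted = (h, d), w
        if key not in contracted or adjusted > contracted[key]:
            contracted[key], origin[key] = adjusted, (h, d)
    # Step 3: solve the smaller graph, then map arcs back to original nodes.
    result = {}
    for d, h in max_arborescence(contracted, root).items():
        orig_head, orig_dep = origin[(h, d)]
        result[orig_dep] = orig_head
    # Step 4: keep every cycle arc except the one the entering arc replaced.
    for d in cycle:
        result.setdefault(d, heads[d])
    return result


# The example scores as (head, dependent) -> weight, with node 0 as the root.
ARCS = {(0, 1): 5, (0, 2): 1, (0, 3): 1, (1, 2): 11, (1, 3): 4,
        (2, 1): 10, (2, 3): 5, (3, 1): 8, (3, 2): 8}
print(sorted(max_arborescence(ARCS).items()))  # [(1, 0), (2, 1), (3, 2)]
```

On these scores the greedy step first produces the cycle between nodes 1 and 2; after two collapses the search settles on root to v1, v1 to v2 and v2 to v3. The origin dictionary plays a role comparable to the original-index bookkeeping described for compute_original_indexes, remembering which real arc each collapsed arc stands for.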