nltk.Tree

class nltk.Tree(node, children=None)

A Tree represents a hierarchical grouping of leaves and subtrees. For example, each constituent in a syntax tree is represented by a single Tree.

A tree’s children are encoded as a list of leaves and subtrees, where a leaf is a basic (non-tree) value and a subtree is a nested Tree.

>>> from nltk.tree import Tree
>>> print(Tree(1, [2, Tree(3, [4]), 5]))
(1 2 (3 4) 5)
>>> vp = Tree('VP', [Tree('V', ['saw']),
...                  Tree('NP', ['him'])])
>>> s = Tree('S', [Tree('NP', ['I']), vp])
>>> print(s)
(S (NP I) (VP (V saw) (NP him)))
>>> print(s[1])
(VP (V saw) (NP him))
>>> print(s[1,1])
(NP him)
>>> t = Tree.fromstring("(S (NP I) (VP (V saw) (NP him)))")
>>> s == t
True
>>> t[1][1].set_label('X')
>>> t[1][1].label()
'X'
>>> print(t)
(S (NP I) (VP (V saw) (X him)))
>>> t[0], t[1,1] = t[1,1], t[0]
>>> print(t)
(S (X him) (VP (V saw) (NP I)))

The length of a tree is the number of children it has.

>>> len(t)
2

The set_label() and label() methods allow individual constituents to be labeled. For example, syntax trees use this label to specify phrase tags, such as “NP” and “VP”.

Several Tree methods use “tree positions” to specify children or descendants of a tree. Tree positions are defined as follows:

  • The tree position i specifies a Tree’s ith child.
  • The tree position () specifies the Tree itself.
  • If p is the tree position of descendant d, then p+i specifies the ith child of d.

That is, every tree position is either a single index i, specifying tree[i], or a sequence i1, i2, ..., iN, specifying tree[i1][i2]...[iN].
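
The definition above can be sketched in plain Python. The Node class and the resolve and positions helpers below are hypothetical names introduced only for illustration; they are not part of NLTK and make no claim about how nltk.Tree is implemented. The sketch assumes a position is given as a tuple of child indices.

```python
# Illustrative stand-in only; this is NOT NLTK's implementation.
class Node(list):
    """A labeled list of children, loosely mirroring the shape of a Tree."""

    def __init__(self, label, children=()):
        super().__init__(children)
        self.label = label


def resolve(tree, position):
    """Follow a tree position (a tuple of child indices) down from `tree`."""
    current = tree
    for i in position:
        current = current[i]
    return current


def positions(tree):
    """Yield every tree position in preorder, starting with () for the root."""
    yield ()
    # Leaves are plain values, so only Node instances are descended into.
    if isinstance(tree, Node):
        for i, child in enumerate(tree):
            for p in positions(child):
                yield (i,) + p


s = Node("S", [Node("NP", ["I"]),
               Node("VP", [Node("V", ["saw"]), Node("NP", ["him"])])])

root = resolve(s, ())              # () specifies the tree itself
np_him = resolve(s, (1, 1))        # same node as s[1][1]
leaf = resolve(s, (1, 1) + (0,))   # p + i: child 0 of the node at p
all_positions = list(positions(s))
```

With the real class, the equivalent lookup is the tuple index already shown in the examples above, such as s[1,1].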

Construct a new tree. This constructor can be called in one of two ways:

  • Tree(label, children) constructs a new tree with the specified label and list of children.

  • Tree.fromstring(s) constructs a new tree by parsing the string s.

Methods

__init__(node[, children])
append L.append(object) – append object to end
chomsky_normal_form([factor, horzMarkov, ...]) This method can modify a tree in three ways:
collapse_unary([collapsePOS, collapseRoot, ...]) Collapse subtrees with a single child (ie.
convert(tree) Convert a tree between different subtypes of Tree.
copy([deep])
count(...)
draw() Open a new window containing a graphical diagram of this tree.
extend L.extend(iterable) – extend list by appending elements from the iterable
flatten() Return a flat version of the tree, with all non-root non-terminals removed.
freeze([leaf_freezer])
fromstring(s[, brackets, read_node, ...]) Read a bracketed tree string and return the resulting tree.
height() Return the height of the tree.
index(value[, start, ...]) Raises ValueError if the value is not present.
insert L.insert(index, object) – insert object before index
label() Return the node label of the tree.
leaf_treeposition(index) Return the tree position of the index-th leaf in this tree.
leaves() Return the leaves of the tree.
pformat([margin, indent, nodesep, parens, ...]) Return a pretty-printed string representation of this tree.
pformat_latex_qtree() Returns a representation of the tree compatible with the LaTeX qtree package.
pop(...) Raises IndexError if list is empty or index is out of range.
pos() Return a sequence of pos-tagged words extracted from the tree.
pprint(**kwargs) Print a string representation of this Tree to ‘stream’
pretty_print([sentence, highlight, stream]) Pretty-print this tree as ASCII or Unicode art.
productions() Generate the productions that correspond to the non-terminal nodes of the tree.
remove L.remove(value) – remove first occurrence of value.
reverse L.reverse() – reverse IN PLACE
set_label(label) Set the node label of the tree.
sort L.sort(cmp=None, key=None, reverse=False) – stable sort IN PLACE;
subtrees([filter]) Generate all the subtrees of this tree, optionally restricted to trees matching the filter function.
treeposition_spanning_leaves(start, end) Return the tree position of the lowest descendant of this tree that dominates self.leaves()[start:end].
treepositions([order])
>>> t = Tree.fromstring("(S (NP (D the) (N dog)) (VP (V chased) (NP (D the) (N cat))))")
un_chomsky_normal_form([expandUnary, ...]) This method modifies the tree in three ways:
unicode_repr()
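
As a rough usage sketch, assuming NLTK is installed, a few of the methods listed above can be exercised on the example tree from the treepositions entry. The variable names are illustrative only, and the filter passed to subtrees() is just one possible choice (it selects subtrees of height 2, which in this tree are the part-of-speech nodes).

```python
from nltk.tree import Tree

t = Tree.fromstring(
    "(S (NP (D the) (N dog)) (VP (V chased) (NP (D the) (N cat))))"
)

leaves = t.leaves()    # the words, left to right
tagged = t.pos()       # (word, tag) pairs taken from the lowest labels
height = t.height()    # a leaf counts as height 1, so this tree has height 5

# Labels of all subtrees whose height is 2 (nodes directly above a leaf).
preterminals = [st.label() for st in t.subtrees(lambda st: st.height() == 2)]

# Tree position of the third leaf ('chased'); indexing with it returns the leaf.
chased_pos = t.leaf_treeposition(2)

# Lowest node that dominates leaves[3:5], i.e. 'the cat'.
span_pos = t.treeposition_spanning_leaves(3, 5)

# flatten() keeps the root label and attaches all leaves directly to it.
flat = str(t.flatten())
```

Here t[chased_pos] gives back 'chased' and t[span_pos] is the object NP, which ties these methods back to the tree positions described earlier.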

Attributes

node Outdated method to access the node value; use the label() method instead.