Creating a shallow tree

In the previous recipe, we flattened a deep Tree by only keeping the lowest level subtrees. In this recipe, we'll keep only the highest level subtrees instead.

How to do it...

We'll be using the first parsed sentence from the treebank corpus as our example. Recall from the previous recipe that this sentence is a deep Tree with several levels of nested subtrees.

The shallow_tree() function defined in transforms.py eliminates all the nested subtrees, keeping only the top subtree labels:

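from nltk.tree import Tree

def shallow_tree(tree):
  children = []

  for t in tree:
    if t.height() < 3:
      # A single tagged word: add its (word, tag) pair directly.
      children.extend(t.pos())
    else:
      # A deeper subtree: keep its label, but flatten its contents
      # to the (word, tag) pairs of its leaves.
      children.append(Tree(t.label(), t.pos()))

  return Tree(tree.label(), children)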

Using it on the first parsed sentence in treebank results in a Tree with only two subtrees:

>>> from nltk.corpus import treebank
>>> from transforms import shallow_tree
>>> shallow_tree(treebank.parsed_sents()[0])
Tree('S', [Tree('NP-SBJ', [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'), ('years', 'NNS'), ('old', 'JJ'), (',', ',')]), Tree('VP', [('will', 'MD'), ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'), ('Nov.', 'NNP'), ('29', 'CD')]), ('.', '.')])

We can see the difference programmatically by comparing the heights of the original tree and the shallow tree:

>>> treebank.parsed_sents()[0].height()
7
>>> shallow_tree(treebank.parsed_sents()[0]).height()
3

As in the previous recipe, the height of the new tree is 3, so it can be used for training a chunker.
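To make the height requirement concrete, here is a minimal sketch of how a tree of height 3 maps onto IOB chunk tags. It assumes that chunker training consumes trees through NLTK's tree2conlltags() function; the toy sentence is hand-written for illustration and does not come from the treebank corpus.

```python
from nltk.chunk import tree2conlltags
from nltk.tree import Tree

# A hand-written tree of height 3 (illustrative, not from the corpus):
# the root, one level of chunk subtrees, and (word, tag) leaves.
shallow = Tree('S', [
    Tree('NP', [('the', 'DT'), ('dog', 'NN')]),
    Tree('VP', [('chased', 'VBD'), ('a', 'DT'), ('cat', 'NN')]),
    ('.', '.'),
])

# Each top-level subtree becomes one chunk; leaves that sit directly
# under the root fall outside any chunk and are tagged 'O'.
iob_tags = tree2conlltags(shallow)

for word, tag, chunk in iob_tags:
    print(word, tag, chunk)
```

A tree with chunks nested inside other chunks cannot be expressed this way, since each word gets exactly one chunk tag. tree2conlltags() raises a ValueError for such input, which is why the deep tree has to be flattened or made shallow first.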

How it works...

The shallow_tree() function iterates over each of the top-level subtrees in order to create new children. If the height() of a subtree is less than 3, the subtree is just a single tagged word (such as the final period), so it is replaced by its part-of-speech tagged leaf, which is added directly to the new tree. All other subtrees are replaced by a new Tree that keeps the original label and whose children are the part-of-speech tagged leaves. This eliminates all nested subtrees while retaining the top-level subtrees.
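Both branches can be traced on a small tree without downloading the treebank corpus. The following sketch repeats the function so that it runs on its own; the toy sentence and the names deep and shallow are made up for illustration.

```python
from nltk.tree import Tree

def shallow_tree(tree):
    children = []
    for t in tree:
        if t.height() < 3:
            # Single tagged word, e.g. (. .): keep only its (word, tag) pair.
            children.extend(t.pos())
        else:
            # Deeper subtree: keep the label, flatten everything below it.
            children.append(Tree(t.label(), t.pos()))
    return Tree(tree.label(), children)

# Illustrative toy sentence: the VP contains a nested NP.
deep = Tree.fromstring(
    "(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))) (. .))"
)
shallow = shallow_tree(deep)

print(deep.height())     # 5
print(shallow.height())  # 3
print(shallow)
```

The NP nested inside the VP disappears, and its words become direct children of the VP. The period, whose subtree has a height of only 2, ends up as a plain (word, tag) tuple under the root.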

This function is an alternative to flatten_deeptree() from the previous recipe, for when you want to keep the higher-level tree labels and ignore the lower-level labels.
