Self-balancing binary search tree

A binary search tree that remains balanced to some extent as insertions and deletions are carried out is called a self-balancing binary search tree. To create a balanced version of an unbalanced tree, we use a particular operation called rotation, illustrated in the following figure:

Rotation of a binary search tree

This figure shows the rotation operation on nodes A and B: a left rotation on A produces the image on the right, and a right rotation on B produces the image on the left. To visualize a rotation, first imagine pulling out the subtree D, which sits between the two nodes. Then the nodes are rotated in either the left or the right direction. In the case of a left rotation, the right child becomes the parent and the original parent becomes the left child of that node. Once this is done, the subtree D is reattached as the right child of the original parent. A right rotation is exactly the same, only in the opposite direction.

How does this help balance a tree? Look at the left-hand side of the diagram: the right side looks heavier. However, once you perform a left rotation, the left-hand side will appear heavier. In fact, a left rotation decreases the depth of the right subtree by one and increases that of the left subtree by one. So even if the right-hand side was originally deeper than the left-hand side by two, a left rotation fixes it. The only exception is the subtree D: since the root of D remains at the same level, its maximum depth does not change. A similar argument holds for the right rotation.
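
To make the mechanics concrete, here is a minimal, self-contained sketch of a left rotation on a bare-bones node type. The SimpleNode class and the rotateLeft method are purely illustrative and are not part of the BinaryTree class used in the rest of this chapter:

    // Illustrative only: not part of the book's BinaryTree class.
    class SimpleNode {
        int value;
        SimpleNode left, right;
        SimpleNode(int value) { this.value = value; }
    }

    class SimpleRotations {
        // Left-rotate the subtree rooted at a and return the new subtree root.
        // The right child b becomes the parent, a becomes b's left child, and
        // b's old left subtree (D in the figure) is reattached as a's right child.
        static SimpleNode rotateLeft(SimpleNode a) {
            SimpleNode b = a.right;
            if (b == null) {
                return a;          // nothing to rotate around
            }
            a.right = b.left;      // reattach the middle subtree D
            b.left = a;            // the original parent becomes the left child
            return b;              // b is the new root of this subtree
        }
    }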

Rotation keeps the search-tree property of the tree unchanged. This is very important if we are going to use it to balance search trees. Let's consider the left rotation. From the positions, we can conclude the following inequalities:

  • Each node in C ≤ A
  • A ≤ B
  • A ≤ Each node in D ≤ B
  • B ≤ Each node in E

After we perform the rotation, we can check the same inequalities and we find that they still hold. This shows that rotation keeps the search-tree property unchanged. A very similar argument can be made for the right rotation. The idea behind the rotation algorithm is simple: first take the middle subtree out, do the rotation, and then reattach the middle subtree. The following is the implementation in our BinaryTree class:

    protected void rotate(Node<E> node, boolean left){

First, let's do some parameter value checks:

        if(node == null){
            throw new IllegalArgumentException("Cannot rotate null node");
        }else if(node.containerTree != this){
            throw new IllegalArgumentException(
                "Node does not belong to the current tree");
        }
        Node<E> child = null;
        Node<E> grandchild = null;
        Node<E> parent = node.getParent();
        boolean parentDirection;

The child and grandchild we want to move depend on the direction of the rotation:

        if(left){
            child = node.getRight();
            if(child!=null){
                grandchild = child.getLeft();
            }
        }else{
            child = node.getLeft();
            if(child!=null){
                grandchild = child.getRight();
            }
        }

As usual, the root node needs to be treated differently:

        if(node != getRoot()){
            if(parent.getLeft()==node){
                parentDirection = true;
            }else{
                parentDirection = false;
            }
            if(grandchild!=null)
                deleteNodeWithSubtree(grandchild);
            if(child!=null)
                deleteNodeWithSubtree(child);
            deleteNodeWithSubtree(node);
            if(child!=null) {
                setChild(parent, child, parentDirection);
                setChild(child, node, left);
            }
            if(grandchild!=null)
                setChild(node, grandchild, !left);
        }else{
            if(grandchild!=null)
                deleteNodeWithSubtree(grandchild);
            if(child!=null)
                deleteNodeWithSubtree(child);
            deleteNodeWithSubtree(node);
            if(child!=null) {
                root = child;
                setChild(child, node, left);
            }
            if(grandchild!=null)
                setChild(node, grandchild, !left);
            root.parent = null;
        }
    }

We can now look at our first self-balancing binary search tree, called the AVL tree.

AVL tree

The AVL tree is our first self-balancing binary search tree. The idea is simple: keep every subtree as balanced as possible. An ideal scenario would be for both the left and right subtrees, starting from every node, to have exactly the same height. However, since the number of nodes is not always of the form 2^p - 1, where p is a positive integer, we cannot always achieve this. Instead, we allow a little wiggle room: the difference between the heights of the left and right subtrees of any node must not be greater than one. If, during any insert or delete operation, we happen to break this condition, we apply rotations to fix it. We only have to worry about a height difference of two, because we insert or delete one element at a time, and inserting or deleting a single element cannot change the height of a subtree by more than one. Therefore, the worst case is that there was already a difference of one and the new insertion or deletion created one more, requiring a rotation.
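
Purely as an illustration of this invariant (reusing the illustrative SimpleNode type from the rotation sketch earlier, and shown as a static method that could sit alongside rotateLeft), a recursive check of the height condition might look like the following. It is not part of the AVL tree implementation that follows:

    // Illustrative only: returns the height of the subtree, or -1 if any node
    // violates the AVL condition that the left and right heights differ by
    // more than one. An empty subtree has height zero, as in the code below.
    static int checkAvlHeight(SimpleNode node) {
        if (node == null) {
            return 0;
        }
        int leftHeight = checkAvlHeight(node.left);
        int rightHeight = checkAvlHeight(node.right);
        if (leftHeight < 0 || rightHeight < 0
                || Math.abs(leftHeight - rightHeight) > 1) {
            return -1;             // an imbalance was found somewhere below
        }
        return Math.max(leftHeight, rightHeight) + 1;
    }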

The simplest kind of rotation is shown in the following figure. The triangles represent subtrees of equal heights. Notice that the height of the left subtree is two less than the height of the right subtree:

AVL tree – simple rotation

So we do a left rotation to generate the balanced structure shown in the preceding diagram. You can see that the heights of the subtrees now satisfy our condition. The simple right rotation case is exactly the same, just in the opposite direction. We must do this for all the ancestors of the node that was inserted or deleted, as these are the only nodes whose subtree heights are affected. Since rotations also cause heights to change, we must start from the bottom and work our way up to the root while doing rotations.

There is one more kind of case, which requires a double rotation. Notice that the height of the subtree rooted by the middle grandchild does not change due to a rotation. So, if this subtree is the reason for the imbalance, a simple rotation will not fix it. This is shown in the following figure:

Simple rotation does not fix this kind of imbalance

Here, the subtree that received an insertion is rooted by D, or a node has been deleted from the subtree C. In the case of an insertion, notice that there would be no rotation on B, as the left subtree of B has a height only one more than that of its right subtree. A, however, is unbalanced: the height of the left subtree of A is two less than that of its right subtree. But if we do a rotation on A, as shown in the preceding figure, it does not fix the problem; the right-heavy condition is merely transformed into a left-heavy one. To resolve this, we need a double rotation, as shown in the next figure. First, we rotate the child (B in the figure) in the opposite direction so that the subtree no longer leans the other way; a simple rotation on A after this fixes the imbalance.

AVL tree double rotation
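
Expressed on the illustrative SimpleNode type from the earlier sketch (the real implementation appears in the rebalance method later in this section), the double rotation is just a rotation on the child in the opposite direction followed by a simple rotation. The rotateRight method here is the assumed mirror image of rotateLeft, and both would sit alongside it in the SimpleRotations sketch:

    // Illustrative only: a double rotation on the bare-bones SimpleNode type.
    // Here a plays the role of A in the figure and a.right the role of B.
    static SimpleNode rotateRight(SimpleNode b) {    // mirror image of rotateLeft
        SimpleNode a = b.left;
        if (a == null) {
            return b;
        }
        b.left = a.right;                    // reattach the middle subtree
        a.right = b;
        return a;
    }

    static SimpleNode doubleRotateLeft(SimpleNode a) {
        if (a.right == null) {
            return a;                        // nothing to rotate around
        }
        a.right = rotateRight(a.right);      // remove B's opposite-direction lean first
        return rotateLeft(a);                // a simple left rotation on A now restores balance
    }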

So we create an AVL tree class, and we add an extra field to the Node class to store the height of the subtree rooted by it:

public class AVLTree<E extends Comparable<E>> 
          extends BinarySearchTree<E>{
    public static class Node<E extends Comparable<E>>
              extends BinaryTree.Node<E>{
        protected int height = 0;
        public Node(BinaryTree.Node<E> parent,
                    BinaryTree<E> containerTree, E value) {
            super(parent, containerTree, value);
        }
    }

We must override the newNode method to return our extended node:

    @Override
    protected BinaryTree.Node<E> newNode(
      BinaryTree.Node<E> parent, BinaryTree<E> containerTree, E value) {
        return new Node<>(parent, containerTree, value);
    }

We use a utility method to retrieve the height of a subtree with a null check. The height of a null subtree is zero:

    private int nullSafeHeight(Node<E> node){
        if(node==null){
            return 0;
        }else{
            return node.height;
        }
    }

Next, we include a method to compute and update the height of the subtree rooted by a node. The height is one more than the maximum height of its children:

    private void nullSafeComputeHeight(Node<E> node){
        Node<E> left = (Node<E>) node.getLeft();
        Node<E> right = (Node<E>) node.getRight();
        int leftHeight = left==null? 0 : left.height;
        int rightHeight = right==null? 0 : right.height;
        node.height = Math.max(leftHeight, rightHeight)+1;
    }

We also override the rotate method in BinaryTree to update the height of the subtrees after the rotation:

    @Override
    protected void rotate(BinaryTree.Node<E> node, boolean left) {
        Node<E> n = (Node<E>) node;
        Node<E> child;
        if(left){
            child = (Node<E>) n.getRight();
        }else{
            child = (Node<E>) n.getLeft();
        }
        super.rotate(node, left);
        if(node!=null){
            nullSafeComputeHeight(n);
        }
        if(child!=null){
            nullSafeComputeHeight(child);
        }
    }

With the help of these methods, we implement the rebalancing of a node, walking all the way up to the root as described earlier. The rebalancing is done by checking the difference between the heights of the left and right subtrees. If the difference is 0, 1, or -1, nothing needs to be done and we simply move up the tree recursively. When the height difference is 2 or -2, we need to rebalance:

    protected void rebalance(Node<E> node){
        if(node==null){
            return;
        }
        nullSafeComputeHeight(node);
        int leftHeight = nullSafeHeight((Node<E>) node.getLeft());
        int rightHeight = nullSafeHeight((Node<E>) node.getRight());
        switch (leftHeight-rightHeight){
            case -1:
            case 0:
            case 1:
                rebalance((Node<E>) node.getParent());
                break;
            case 2:
                int childLeftHeight = nullSafeHeight(
                        (Node<E>) node.getLeft().getLeft());
                int childRightHeight = nullSafeHeight(
                        (Node<E>) node.getLeft().getRight());
                if(childRightHeight > childLeftHeight){
                    rotate(node.getLeft(), true);
                }
                Node<E> oldParent = (Node<E>) node.getParent();
                rotate(node, false);
                rebalance(oldParent);
                break;
            case -2:
                childLeftHeight = nullSafeHeight(
                        (Node<E>) node.getRight().getLeft());
                childRightHeight = nullSafeHeight(
                        (Node<E>) node.getRight().getRight());
                if(childLeftHeight > childRightHeight){
                    rotate(node.getRight(), false);
                }
                oldParent = (Node<E>) node.getParent();
                rotate(node, true);
                rebalance(oldParent);
                break;
        }
    }

Once the rotation is implemented, implementing the insert and delete operations is very simple. We first do a regular insertion or deletion, followed by rebalancing. A simple insertion operation is as follows:

    @Override
    public BinaryTree.Node<E> insertValue(E value) {
        Node<E> node = (Node<E>) super.insertValue(value);
        if(node!=null)
            rebalance(node);
        return node;
    }

The delete operation is also very similar. It only requires an additional check confirming that the node is actually found and deleted:

    @Override
    public BinaryTree.Node<E> deleteValue(E value) {
        Node<E> node = (Node<E>) super.deleteValue(value);
        if(node==null){
            return null;
        }
        Node<E> parentNode = (Node<E>) node.getParent();
        rebalance(parentNode);
        return node;
    }
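
As a quick usage sketch, inserting values in sorted order (the worst case for a plain binary search tree) exercises the rebalancing. Only insertValue and deleteValue shown above are used; the no-argument constructor is an assumption, as the constructors of the class are not shown here:

    // A minimal usage sketch of the AVLTree class defined above.
    public class AVLTreeDemo {
        public static void main(String[] args) {
            AVLTree<Integer> tree = new AVLTree<>();
            for (int i = 1; i <= 100; i++) {
                tree.insertValue(i);   // sorted insertions; rebalancing keeps the height O(lg n)
            }
            tree.deleteValue(50);      // deletion rebalances from the parent of the removed node
        }
    }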

Complexity of search, insert, and delete in an AVL tree

The worst case for an AVL tree is when it has maximum imbalance; in other words, when it reaches the maximum height possible for a given number of nodes. To find out how much that is, we ask the question the other way around: given a height h, what is the minimum number of nodes n that can achieve it? Let this minimum number of nodes be f(h). A tree of height h has two subtrees, and without any loss of generality, we can assume that the left subtree is higher than the right subtree. We would like both subtrees to also have the minimum number of nodes. The left subtree has height h-1, so it has f(h-1) nodes. We want the height of the right subtree to be as small as possible, as it does not affect the height of the entire tree. However, in an AVL tree, the heights of the two subtrees of a node can differ by at most one, so the height of the right subtree is h-2 and it has f(h-2) nodes. The entire tree also has a root, hence the total number of nodes is as follows:

f(h) = f(h-1) + f(h-2) + 1

This almost looks like the formula for the Fibonacci sequence, except for the +1 part. Our starting values are 1 and 2 because f(1) = 1 (just the root) and f(2) = 2 (the root and one child). These are greater than or equal to the starting values of the Fibonacci sequence, which are 1 and 1. It is clear, then, that the number of nodes is at least the corresponding Fibonacci number, so the following holds:

f(h) ≥ F(h), where F(h) is the hth Fibonacci number.
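
To make this bound concrete, the recurrence can be evaluated for small heights. The following throwaway snippet (the class name is illustrative) prints f(h) next to the corresponding Fibonacci number, confirming that f(h) stays above F(h):

    // Illustrative only: tabulate f(h) = f(h-1) + f(h-2) + 1 against the
    // Fibonacci numbers, starting from f(1) = 1, f(2) = 2 and F(1) = F(2) = 1.
    public class MinimumAvlNodes {
        public static void main(String[] args) {
            long fPrev = 1, fCurr = 2;       // f(1), f(2)
            long fibPrev = 1, fibCurr = 1;   // F(1), F(2)
            for (int h = 3; h <= 20; h++) {
                long fNext = fCurr + fPrev + 1;
                long fibNext = fibCurr + fibPrev;
                fPrev = fCurr; fCurr = fNext;
                fibPrev = fibCurr; fibCurr = fibNext;
                System.out.println("h=" + h + "  f(h)=" + fCurr + "  F(h)=" + fibCurr);
            }
        }
    }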

We know that for a large enough h, F(h) ≈ φF(h-1) holds true; here φ is the golden ratio, (1 + √5)/2. This means F(h) ≈ Cφ^h, where C is some constant. So, we have the following:

f(h) ≥ Cφ^h
=> n ≥ Cφ^h
=> log_φ n ≥ h + log_φ C
=> h = O(log_φ n) = O(lg n)

This means that even the worst-case height of an AVL tree is logarithmic in the number of nodes, which is what we wanted. Since an insertion processes one node at each level until it reaches the insertion site, the complexity of an insertion is O(lg n); the same holds for the search and delete operations, for the same reason.
