Red-black tree

An AVL tree guarantees logarithmic insertion, deletion, and search. But it makes a lot of rotations. In most applications, insertions are randomly ordered and so are deletions. So, the trees would sort of balance out eventually. However, since the AVL tree is too quick to rotate, it may make very frequent rotations in opposite directions even when it would be unnecessary, had it been waiting for the future values to be inserted. This can be avoided using a different approach: knowing when to rotate a subtree. This approach is called a red-black tree.

In a red-black tree, the nodes have a color, either black or red. The colors can be switched during the operations on the node, but they have to follow these conditions:

  • The root has to be black
  • A red node cannot have a black child
  • The black height of any subtree rooted by any node is equal to the black height of the subtree rooted by the sibling node

Now what is the black height of a subtree? It is the number of black nodes found from the root to the leaf. When we say leaf, we really mean null children, which are considered black and allow a parent to be red without violating rule 2. This is the same no matter which path we take. This is because of the third condition. So the third condition can also be restated as this: the number of black nodes in the path from the root of any subtree to any of its leaves is the same, irrespective of which leave we choose.

For ease of manipulation, the null children of the leaves are also considered sort of half nodes; null children are always considered black and are the only ones really considered as leaves as well. So leaves don't contain any value. But they are different from the conventional leaves in other red-black trees. New nodes can be added to the leaves but not in a red-black tree; this is because the leaves here are null nodes. So we will not draw them out explicitly or put them in the code. They are only helpful to compute and match black heights:

Red-black tree

An example of a red-black tree

In our example of the red-black tree of height 4, the null nodes are black, which are not shown (in print copy, the light-colored or gray nodes are red nodes and dark-colored nodes are black nodes).

Both insertion and deletion are more complicated than the AVL tree, as there are more cases that we need to handle. We will discuss this in the following sections.

Insertion

Insertion is done in the same way we do it with BST. After an insertion is complete, the new node is colored red. This preserves the black height, but it can result in a red node being a child of another red node, which would violate condition 2. So we do some manipulation to fix this. The following two figures show four cases of insertions:

Insertion

Case 1 and 2 of red-black tree insertion

Insertion

Case 3 and 4 of red-black tree insertion

Let's discuss the insertions case by case. Notice that the trees in the diagram look black and unbalanced. But this is only because we have not drawn the entire tree; it's just a part of the tree we are interested in. The important point is that the black height of none of the nodes change because of whatever we do. If the black height must be increased to fit the new node, it must be at the top level; so we simply move it up to the parent. The four cases are as follows:

  1. The parent is black. In this case, nothing needs to be done as it does not violate any of the constraints.
  2. Both parent and uncle are red. In this case, we repaint parent, uncle, and grandparent and the black heights are still unchanged. Notice now that no constraints are violated. If, however, the grandparent is the root, keep it black. This way, the entire tree's black height is increased by 1.
  3. The parent is red and uncle is black. The newly added node is on the same side of the parent as the parent is of the grandparent. In this case, we make a rotation and repaint. We first repaint the parent and grandparent and then rotate the grandparent.
  4. This is the case that is similar to case 3, except the newly added node is on the opposite side of the parent as the parent is of the grandparent. Case 3 cannot be applied here because doing so will change the black height of the newly added node. In this case, we rotate the parent to make it the same as case 3.
  5. Note that all the cases can happen in the opposite direction, that is, in mirror image. We will handle both the cases the same way.

Let's create our RedBlackTree class extending the BinarySearchTree class. We have to again extend the Node class and include a flag to know whether the node is black:

public class RedBlackTree<E extends Comparable<E>> extends BinarySearchTree<E>{
    public static class Node<E> extends BinaryTree.Node<E>{
        protected int blackHeight = 0;
        protected boolean black = false;
        public Node(BinaryTree.Node parent,
                    BinaryTree containerTree, E value) {
            super(parent, containerTree, value);
        }
    }

    @Override
    protected  BinaryTree.Node<E> newNode(
      BinaryTree.Node<E> parent, BinaryTree<E> containerTree, E value) {
        return new Node(parent, containerTree, value);
    }
...
}

We now add a utility method that returns whether a node is black. As explained earlier, a null node is considered black:

    protected boolean nullSafeBlack(Node<E> node){
        if(node == null){
            return true;
        }else{
            return node.black;
        }
    }

Now we're ready to define the method of rebalancing after we do an insertion. This method works as described in the four cases earlier. We maintain a nodeLeftGrandChild flag that stores whether the parent is the left child of the grand parent or its right child. This helps us find the uncle and also rotate in the correct direction:

    protected void rebalanceForInsert(Node<E> node){
        if(node.getParent() == null){
            node.black = true;
        }else{
            Node<E> parent = (Node<E>) node.getParent();
            if(parent.black){
                return;
            }else{
                Node<E> grandParent = (Node<E>) parent.getParent();
                boolean nodeLeftGrandChild = grandParent.getLeft()== parent;

                Node<E> uncle = nodeLeftGrandChild?
                  (Node<E>) grandParent.getRight()
                  : (Node<E>) grandParent.getLeft();
                if(!nullSafeBlack(uncle)){
                    if(grandParent!=root)
                        grandParent.black = false;
                        uncle.black = true;
                        parent.black = true;
                        rebalanceForInsert(grandParent);
                }else{
                    boolean middleChild = nodeLeftGrandChild?
                      parent.getRight() == node:parent.getLeft() == node;
                    if (middleChild){
                        rotate(parent, nodeLeftGrandChild);
                        node = parent;
                        parent = (Node<E>) node.getParent();
                    }
                    parent.black = true;
                    grandParent.black = false;
                    rotate(grandParent, !nodeLeftGrandChild);
                }
            }

        }
    }

The insertion is now done as follows:

    @Override
    public BinaryTree.Node<E> insertValue(E value) {
        Node<E> node = (Node<E>) super.insertValue(value);
        if(node!=null)
            rebalanceForInsert(node);
        return node;
    }

Deletion

Deletion starts with a normal binary search tree deletion. If you remember, this always involves deletion of a node with at most one child. Deletion of an internal node is done by first copying the value of the leftmost node of the right subtree and deleting it. So we will consider only this case:

Deletion

Case 1, 2, and 3 of deletion in a red-black tree

After the deletion is done, the parent of the deleted node either has no child or has one child, which was originally its grandchild. During the insertion, the problem we needed to solve was a red child of a red parent. In a deletion process, this cannot happen. But it can cause the black height to change.

One simple case is that if we delete a red node, the black height does not change anything, so we don't have to do anything. Another simple case is that if the deleted node were black and the child red, we can simply repaint the child black in order to restore the black height.

A black child cannot really happen because that would mean the original tree was black and unbalanced, as the deleted node had a single black child. But since recursion is involved, a black child can actually arise while moving up the path with recursive rebalancing. In the following discussion, we only look at cases where the deleted node was black and the child was also black (or null child, which is considered black). Deletion is done as per the following cases, as shown in the figures Case 1 and 2 and 3 of deletion in a red-black tree and Case 4, 5 and 6 of deletion from a red-black tree:

  1. The first case we have is when the parent, sibling, and both the nephews are black. In this case, we can simply repaint the sibling to red, which will make the parent black and balanced. However, the black height of the whole subtree will reduce by one; hence, we must continue rebalancing from the parent.
  2. This is the case when the parent and sibling are black, but the away nephew is red. In this case, we cannot repaint the sibling as this would cause the red sibling to have a red child, violating constraint 2. So we first repaint the red nephew to black and then rotate to fix the black height of the nephew while fixing the black height of the child.
  3. When the near nephew is red instead of the away nephew, the rotation does not restore the black height of the near nephew that has been repainted. So, we repaint NN but do a double rotation instead.
  4. Now consider what happens when the sibling is red. We first repaint the parent and sibling using opposite colors and rotating P. But this does not change the black height of any node; it reduces the case to case 5 or 6, which we will discuss now. So we simply call the rebalancing code again recursively.
  5. We are now done with all the cases where the parent was black. This is a case where the parent is red. In this case, we consider the near nephew to be black. Simply rotating the parent fixes the black height.
  6. Our final case is when the parent is red and the near nephew is red. In this case, we recolor the parent and do a double rotation. Notice that the top node remains red. This is not a problem because the original top node, which is the parent node, was also red and hence its parent must be black.
    Deletion

    Case 4, 5, and 6 of deletion from a red-black tree

Now we can define the rebalanceForDelete method coding all the preceding cases:

    protected void rebalanceForDelete(Node<E> parent, boolean nodeDirectionLeft){
        if(parent==null){
            return;
        }
        Node<E> node = (Node<E>) (nodeDirectionLeft? parent.getLeft(): parent.getRight());
        if(!nullSafeBlack(node)){
            node.black = true;
            return;
        }

        Node<E> sibling = (Node<E>) (nodeDirectionLeft? parent.getRight(): parent.getLeft());


        Node<E> nearNephew = (Node<E>) (nodeDirectionLeft?sibling.getLeft():sibling.getRight());

        Node<E> awayNephew = (Node<E>) (nodeDirectionLeft?sibling.getRight():sibling.getLeft());

        if(parent.black){
            if(sibling.black){
                if(nullSafeBlack(nearNephew) && nullSafeBlack(awayNephew)){
                    sibling.black = false;
                    if(parent.getParent()!=null){
                        rebalanceForDelete (
                          (Node<E>) parent.getParent(),
                          parent.getParent().getLeft() == parent);
                    }
                }else if(!nullSafeBlack(awayNephew)){
                    awayNephew.black = true;
                    rotate(parent, nodeDirectionLeft);
                }else{
                    nearNephew.black = true;
                    rotate(sibling, !nodeDirectionLeft);
                    rotate(parent, nodeDirectionLeft);
                }

            }else{
                parent.black = false;
                sibling.black = true;
                rotate(parent, nodeDirectionLeft);
                rebalanceForDelete(parent, nodeDirectionLeft);
            }
        }else{

            if(nullSafeBlack(nearNephew)){
                rotate(parent, nodeDirectionLeft);
            }else{
                parent.black = true;
                rotate(sibling, !nodeDirectionLeft);
                rotate(parent, nodeDirectionLeft);
            }
        }

    }

Now we override the deleteValue method to invoke rebalancing after the deletion. We only need to rebalance if the deleted node was black. We first check that. Then, we need to figure out whether the deleted child was a left child of the parent or the right child. After that, we can invoke the rebalanceForDelete method:

    @Override
    public BinaryTree.Node<E> deleteValue(E value) {
        Node<E> node = (Node<E>) super.deleteValue(value);

        if(node !=null && node.black && node.getParent()!=null){
            Node<E> parentsCurrentChild = (Node<E>) (node.getLeft() == null ? node.getRight(): node.getLeft());
            if(parentsCurrentChild!=null){
                boolean isLeftChild = parentsCurrentChild.getParent().getLeft() == parentsCurrentChild;
                rebalanceForDelete(
                        (Node<E>) node.getParent(), isLeftChild);
            }else{
                boolean isLeftChild = node.getParent().getRight()!=null;
                rebalanceForDelete(
                        (Node<E>) node.getParent(), isLeftChild);
            }

        }
        return node;
    }

The worst case of a red-black tree

What is the worst possible red-black tree? We try to find out the same way we did in the case of the AVL tree. This one is a little more complicated, though. To understand the worst tree, we must take into account the black height. To fit the minimum number of nodes n into height h, we need to first choose a black height. Now it is desirable to have as few black nodes as possible so that we don't have to include black nodes for balancing the black height in the siblings of the nodes we are trying to stretch the height with. Since a red node cannot be the parent of another, we must have alternate black nodes. We consider height h and an even number so that the black height is h/2 = l. For simplicity, we don't count the black null nodes for either the height or the black height. The next figure shows some examples of the worst trees:

The worst case of a red-black tree

Worst red-black trees

The general idea is, of course, to have one path with the maximum possible height. This path should be stuffed with the maximum number of red nodes and the other paths filled with the least number of nodes, that is, with only black nodes. The general idea is shown in the next figure.

The number of nodes in a full black tree of height l-1 is of course 2 l-1 – 1. So, if the number of nodes for height h = 2l is f(l), then we have the recursive formula:

f(l) = f(l-1) + 2 ( 2l-1 – 1) + 2
=> f(l) = f(l-1) + 2l

Now, from the preceding figure, we can already see that f(1) = 2, f(2) = 6, and f(3) = 14. It looks like the formula should be f(l) = 2 l-1 -2. We already have the base cases. If we can prove that if the formula is true for l, then it is also true for l+1, we would be able to prove the formula for all l by induction. This is what we will try to do:

The worst case of a red-black tree

General idea of the worst red-black tree

We already have f(l+1) = f(l) + 2l+1 and we also assume f(l) = 2l+1-2. So this is the case: f(l+1) = 2l+1-2 + 2l+1 = 2l+2-2. Hence, if the formula holds true for l, it also holds true for l+1; therefore, it is proved by induction.

So the minimum number of nodes is as follows:

n = f(l) =  2l+2-2. 
=> lg n = lg ( 2l+2-2)   
=> lg n >  lg ( 2l+1)
=> lg n > l+1
=> l + 1< lg n
=> l < lg n
=> l = O (lg n)

Therefore, a red-black tree has a guaranteed logarithmic height; from this, it is not hard to derive that the search, insertion, and deletion operations are all logarithmic.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset