Chapter 5. Useful Examplesof Java Thread Programming

In the previous chapters, we examined some of the tools necessary to support the synchronization of data between threads. With these tools, we now are able to have our own threads interoperate with each other, with the system threads, or with the threads started by the standard Java libraries. This is possible because the tools allow for a thread to examine and modify shared data in a safe manner without race conditions. The ability to handle data safely provides us with the ability to exchange data, which, in turn, allows us to accomplish tasks in separate threads safely, which ultimately allows us to accomplish our goal.

In other words, we can now say that threading itself is just an implementation detail of our program. Ideally, true threading should feel like just another object that does something. And while threading itself is a powerful tool, in the end all you want to accomplish is to play the audio clip or read the data socket.

In this chapter, we examine some of the uses of threads. We will show how threads solve certain problems, and discuss the implementation details of these solutions, the threads themselves, and the mechanisms that are used to control and synchronize the threads. We will examine threads from the perspective of solving problems instead of examining features of the threading system.

Data Structures and Containers

Interestingly, our first set of examples does not require any threads to be created at all. Our first topic is the data types that can be used or passed between threads. When you create a data object, you do not always know how many threads will access that object: while these data objects may be accessed by many threads, they may also only be accessed by a single thread (in which case, synchronization of the object is not necessary). To begin, let’s examine some operating system mechanisms used to pass data between processes.

In the Unix operating system, the first interprocess communications (IPC) techniques provided were message queues, semaphores, and shared memory. While Unix has added many mechanisms, these three are still popular and heavily used in many applications. The IPC mechanisms of Java—the synchronization lock and the wait and notify mechanism—are specifically for synchronization. Unlike the message queue and shared memory, no real data is actually passed between threads: the concern is synchronization, not communication.[6] The theory is that communication is easy if synchronization tools are available. For now, let’s take a look at the message queue and shared memory and see if these communication mechanisms are useful for communicating between threads.

The Message Queue

We’ll start with message queues:

import java.util.*;

public class MsgQueue {
    Vector queue = new Vector();
    public synchronized void send(Object obj) {
        queue.addElement(obj);
    }

    public synchronized Object recv() {
        if (queue.size() == 0) return null;

        Object obj = queue.firstElement();
        queue.removeElementAt(0);
        return obj;
    }
}

The implementation of the message queue is incredibly simple once we have the proper synchronization tools. In the multitasking paradigm, the operating system has to deliver the data that is sent into the queue from one application to another, as well as synchronizing the communication itself. Since threads share the same address space, data passing is accomplished by using a reference to a common data object. Once we are able to synchronize access to this data object, sending and receiving data using this message queue is simple. In our version of the message queue, the queue is implemented using the Vector class that is implemented in the Java system. We simply need to make sure that we can safely add to and remove from this queue; this is accomplished by making sure that all accesses to this queue are synchronized. This implementation is so easy that we do not even need a MsgQueue class. Instead, we could have used the synchronized block mechanism on the Vector object directly, adding and removing the messages directly to and from the Vector. In other words, the message queue IPC is as simple to implement as any container class (that is, a class that holds collections of objects).

Shared Memory

A shared memory implementation may be as follows:

public class ShareMemory extends BusyFlag {
    byte memory[];
    public ShareMemory (int size) {
        memory = new byte[size];
    }

    public synchronized byte[] attach() {
        getBusyFlag();
        return memory;
    }

    public synchronized void detach() {
        freeBusyFlag();
    }
}

Just like the MsgQueue class, the ShareMemory class is also not difficult and may be unnecessary: we could just as easily have synchronized on the byte array object directly and we would not have needed to implement this class. The only advantage is that since we implemented the ShareMemory class as a subclass of the BusyFlag class, we can attach() and detach() from this shared memory at any scope, including a scope that is bigger than a single method.

The real point behind all this is that even though threads are somewhat analogous to processes, we have to learn to think about data differently than we would between two processes. To share data between processes requires specialized functions that are implemented in the operating system. But data in a Java program is always shared between threads. The fact that we don’t have to do anything special to share this data means that we don’t really need IPCs as we know them: instead, we have to constantly think threaded. Every time we develop a class, we should be concerned that it may be used by many threads simultaneously, whether our program actually contains many threads or not.

The Circular Linked List

So what about all the container classes we will develop? The linked list? The B-tree? The graph? Any other data structure we care to invent? When we implement one of these classes, should we implement it so that it can be accessed by multiple threads in parallel? The answer could be personal preference or corporate policy. It is, however, not difficult to make any container class safe for multiple threads. And it is arguably better to make it safe and not worry whether sometime in the future this container might be used across multiple threads. Let’s take a look at a container class most of us have implemented at one time or another, the circularly linked list:

class CircularListNode {
    Object o;
    CircularListNode next;
    CircularListNode prev;
}

Just like any other linked list most of us have written, we will use a simple data structure node to store the object. We actually don’t care what type of object we have in our list, so all we need is a reference to the Object class. This allows us to hold any type of object in our container (primitive types, of course, will need to be wrapped in their corresponding objects). We will also keep a reference to the previous node and to the next node in the circularly linked list. This is an implementation detail; we keep two references merely for efficiency in our search:

public class CircularList {
    private CircularListNode current;

    public synchronized void insert(Object o) {
        CircularListNode tn = new CircularListNode();
        tn.o = o;
        if (current == null) {
            tn.next = tn.prev = tn;
            current = tn;
        } else {                // Add before current node
            tn.next = current;
            tn.prev = current.prev;
            current.prev.next = tn;
            current.prev = tn;
        }
    }

    public synchronized void delete(Object o) {
        CircularListNode p = find(o);
        CircularListNode next = p.next;
        CircularListNode prev = p.prev;
        if (p == p.next) {       // Last object on the list
            current = null;
            return;
        }
        prev.next = next;
        next.prev = prev;
        if (current == p) current = next;
    }

    private CircularListNode find(Object o) {
        CircularListNode p = current;
        if (p == null)
            throw new IllegalArgumentException();
        do {
            if (p.o == o) return p;
            p = p.next;
        } while (p != current);
        throw new IllegalArgumentException();
    }

    public synchronized Object locate(Object o) {
        CircularListNode p = current;
        do {
            if (p.o.equals(o)) return p.o;
            p = p.next;
        } while (p != current);
        throw new IllegalArgumentException();
    }

    public synchronized Object getNext() {
        if (current == null)
            return null;
        current = current.next;
        return current.o;
    }
}

The implementation of our CircularList class is probably no different from any circularly linked list implementation we may have done before. We simply provide methods to insert() and delete() from the circularly linked list; once that list has the objects, we can pass this list to other methods that may process the list. This processing is done by simply cycling through the list with the getNext() method or by searching for a particular object using the locate() method.

How do we make this CircularList class safe for use by multiple threads? It’s as simple as declaring all the methods as synchronized. By adding the synchronized keyword, we can now use this CircularList class safely across different threads simultaneously. In other words, by taking a few minutes to make the class safe, we can use this class as a form of interthread communication. With enough practice, we should use the synchronization tools without much effort.

Note that the find() method is not synchronized: as a private method that is called only by synchronized methods, there is no reason for it to be synchronized, though it wouldn’t have hurt had we done so. Note also the subtle difference between the find() and locate() methods: as an internal method, the find() method returns objects of CircularListNode; the locate() method returns the actual object that was inserted into the list.

Synchronization and Efficiency

It can be argued that synchronizing a class just because it might be used by multiple threads is inefficient: it takes a certain amount of time to acquire the synchronization lock. This is a trade-off a developer must be aware of when designing a large program.

In this book, we’ve taken the view that it is easier to solve performance problems when they occur than to find bugs caused by a lack of data synchronization. For the most part, the Java API takes this view as well: classes such as the Hashtable and Vector class are all correctly synchronized and are safe to use in multithreaded programs.

In Java 2, however, there is a new set of container classes that are implemented with an eye toward efficiency rather than being thread safe. These classes are referred to as collection classes ; they all extend either the java.util.Collection class or the java.util.Map class. For example, there is a HashMap class that provides the same semantics as the Hashtable class, except that the methods of the HashMap are not synchronized. Similarly, there is an ArrayList class that provides the same semantics as the Vector class, and so on.

Using these classes directly in your Java 2-based program may provide a small performance benefit. That’s somewhat debatable, however, since the synchronization code of the reference virtual machine was completely overhauled in 1.1.6, with the result that the penalty for obtaining a synchronization lock was drastically reduced. So you’d need to make very heavy use of the methods in the classes that we’re talking about to see any noticeable benefit for most programs. On the other hand, if you use the HashMap class without realizing that the same HashMap is being accessed by two threads and that a race condition causes an error in your program every 100 days, how much have you benefited from the faster code?

In Chapter 8, we’ll show how we can safely use these and other thread-unsafe classes.



[6] This applies to most threading systems. In Solaris or POSIX threads, the main tools are the mutex lock, reader-writer locks, semaphores, and conditional variables, none of which actually passes any real data.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset