Reading Data Asynchronously

Let’s look at a complete example. One of the primary uses for threads within a Java program is to read data asynchronously. In this section, we’ll develop a class to read a network socket asynchronously.

Why is threading important for I/O? Whether you are reading from or writing to a file or network socket, a common problem exists, namely, that the action of reading or writing depends on other resources. These resources may be other programs; they may be hardware, like the disk or the network; they may be the operating system or browser. These resources may become temporarily unavailable for a variety of reasons: reading from a network socket may involve waiting until the data is available, writing large amounts of data to a file may take a long period of time to complete if the disk is busy with other requests, and so on. Unfortunately, the mechanism to check whether these resources are available does not exist in the Java API. This is particularly a problem for network sockets, where data is likely to take a long time to be transmitted over the network; it is possible for a read from a network socket to wait forever.

The InputStream class does contain the available() method. However, not all input streams support that method, and on a slow network, writing data to a socket is also likely to take a long time. In general, checking for data via the available() method is much less efficient (and much harder to program) than creating a new thread to read the data.
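To make the limitation concrete, here is a minimal sketch. The class names SilentStream and AvailableDemo are hypothetical and exist only for this illustration. SilentStream overrides only read(), so it inherits the base InputStream.available() method, which is documented to always return 0, even though the stream has data ready.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream used only for illustration: it has data to deliver
// but does not override available().
class SilentStream extends InputStream {
    private final byte[] data = {10, 20, 30};
    private int pos = 0;

    @Override
    public int read() {
        // Return the next byte, or -1 at end of stream.
        return pos < data.length ? data[pos++] : -1;
    }
}

class AvailableDemo {
    // Returns what available() reports before any byte has been read.
    static int reportedBeforeRead() {
        try (InputStream in = new SilentStream()) {
            return in.available();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Returns the first byte actually delivered by read().
    static int firstByte() {
        try (InputStream in = new SilentStream()) {
            return in.read();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

A caller that polls available() on such a stream would conclude that nothing can be read, while a plain read() succeeds immediately.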

The solution to this problem is to use another thread. Say that we use this new thread in an applet: since this new thread is independent of the applet thread, it can block without hanging the applet. Of course, this causes a new problem: when this thread finally is able to read the data, this data must be returned to the applet thread. Let’s take a look at a possible implementation of a generic socket reader class that will read the socket from another thread:

import java.io.*;
import java.net.*;

public class AsyncReadSocket extends Thread {
    private Socket s;
    private StringBuffer result;

    public AsyncReadSocket(Socket s) {
        this.s = s;
        result = new StringBuffer();
    }

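    public void run() {
        DataInputStream is;
        try {
            is = new DataInputStream(s.getInputStream());
        } catch (IOException e) {
            return;  // cannot obtain the stream, so there is nothing to read
        }
        while (true) {
            try {
                char c = is.readChar();  // blocks until two bytes arrive
                appendResult(c);
            } catch (IOException e) {
                return;  // end of stream or I/O error: stop reading
            }
        }
    }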

    // Get the string already read from the socket so far.
    // This method is used by the Applet thread to obtain the data
    // in a synchronous manner.
    public synchronized String getResult() {
        String retval = result.toString();
        result = new StringBuffer();
        return retval;
    }

    // Put new data into the buffer to be returned
    // by the getResult method.
    public synchronized void appendResult(char c) {
        result.append(c);
    }
}

Here we have a Thread class, AsyncReadSocket, whose run() method reads characters from a socket. Whenever it gets a character, it appends it to the StringBuffer result. If this thread blocks while reading the socket, it has no effect on any other threads in the program. An applet can call the getResult() method to get any data that has been received by this new thread; if no data is available, the getResult() method returns an empty string. And if the applet thread is off doing some other tasks, this socket thread simply accumulates the characters for the applet thread. In other words, the socket thread stores the data it receives at any time, while the applet thread can call the getResult() method at any time without the worry of blocking or losing data. An actual run of the two threads may look like the diagram in Figure 3-2.

Figure 3-2. Possible time/location graph during a sample execution of the applet
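The polling pattern can be sketched as follows. This is an illustration under stated assumptions, not the class above: AsyncStreamReader and PollingDemo are hypothetical names, and the reader takes any InputStream (here a pipe) in place of a socket so that the sketch runs without a network connection. The reader also stops when the stream ends.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical variant of AsyncReadSocket that reads from any InputStream,
// so the sketch can run without a network connection.
class AsyncStreamReader extends Thread {
    private final DataInputStream in;
    private StringBuffer result = new StringBuffer();

    AsyncStreamReader(InputStream source) {
        in = new DataInputStream(source);
    }

    @Override
    public void run() {
        try {
            while (true) {
                appendResult(in.readChar());   // blocks only this thread
            }
        } catch (IOException e) {
            // End of stream or I/O error: stop reading.
        }
    }

    // Returns everything read so far and resets the buffer.
    public synchronized String getResult() {
        String retval = result.toString();
        result = new StringBuffer();
        return retval;
    }

    private synchronized void appendResult(char c) {
        result.append(c);
    }
}

class PollingDemo {
    // Sends a message through a pipe and collects it by polling getResult().
    static String roundTrip(String message) {
        try {
            PipedOutputStream pipeOut = new PipedOutputStream();
            AsyncStreamReader reader =
                new AsyncStreamReader(new PipedInputStream(pipeOut));
            reader.start();

            DataOutputStream out = new DataOutputStream(pipeOut);
            out.writeChars(message);   // two bytes per char, matching readChar()
            out.close();               // lets the reader see end of stream

            StringBuilder received = new StringBuilder();
            while (reader.isAlive()) {
                received.append(reader.getResult());  // never blocks on I/O
                Thread.sleep(10);                     // stand-in for other work
            }
            received.append(reader.getResult());      // pick up any remainder
            return received.toString();
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The polling thread plays the role of the applet thread: each call to getResult() returns whatever has accumulated, possibly an empty string, and the pieces concatenate to the original message.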

One of the attractions of threaded programming is that it is simple to write many small, independent tasks, and that’s just what we’ve done here. And since these small tasks are contained in one program, communication between the tasks (the threads) is as simple as communication between two methods in a single program. We just need a common reference somewhere that both threads can access. That “somewhere,” in this case, is the result instance variable.

Note that we could not have written this class correctly without using the synchronized keyword to protect the socket thread and the applet thread from accessing the result buffer at the same time. Otherwise, we would have had a race condition. Specifically, if the getResult() and appendResult() methods were not synchronized, we could see this behavior:

  1. The applet thread enters the getResult() method.

  2. The applet thread assigns retval to a new string created from the result StringBuffer.

  3. The socket thread returns from the readChar() method.

  4. The socket thread calls the appendResult() method to append the character to the result StringBuffer.

  5. The applet thread assigns result to a new (empty) StringBuffer.

The data that was appended to the StringBuffer in step 4 is now lost: it wasn’t retrieved by the applet thread at step 2, and the applet thread discards the old StringBuffer in step 5. Note that there is another race condition here: if two separate threads call the getResult() method at the same time, they could both get copies of the same data from the StringBuffer, and that data would be processed twice.
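Because a real race depends on timing, it is hard to reproduce on demand. The sketch below instead replays the five steps in a single thread, in exactly the order listed, to show where the character disappears. The class name LostUpdateReplay and the sample data are hypothetical.

```java
// Single-threaded replay of the interleaving described above.
// The class and method names are hypothetical.
class LostUpdateReplay {
    // Returns {what the applet thread received, what is left in the buffer}.
    static String[] replay() {
        StringBuffer result = new StringBuffer("ab");

        // Step 2: the applet thread copies the buffer contents.
        String retval = result.toString();

        // Steps 3 and 4: the socket thread appends a newly read character.
        result.append('c');

        // Step 5: the applet thread replaces the buffer.
        result = new StringBuffer();

        return new String[] { retval, result.toString() };
    }
}
```

The applet thread ends up with "ab" and the new buffer is empty, so the character 'c' is never delivered to anyone.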

When all actions on the result variable are atomic, our race condition problem is solved. We need only ensure that the result variable is accessed only in methods that are synchronized.
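As a rough check of this claim, the following sketch runs one thread that appends characters while another drains the buffer, then counts what was drained. SharedBuffer and NoLossDemo are hypothetical names; SharedBuffer copies only the two synchronized methods of AsyncReadSocket.

```java
// Hypothetical class holding just the two synchronized methods.
class SharedBuffer {
    private StringBuffer result = new StringBuffer();

    synchronized String getResult() {
        String retval = result.toString();
        result = new StringBuffer();
        return retval;
    }

    synchronized void appendResult(char c) {
        result.append(c);
    }
}

class NoLossDemo {
    // Appends count characters in one thread while the calling thread
    // drains the buffer; returns the total number of characters drained.
    static int drainedCount(int count) {
        SharedBuffer buffer = new SharedBuffer();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                buffer.appendResult('x');
            }
        });
        producer.start();

        int total = 0;
        while (producer.isAlive()) {
            total += buffer.getResult().length();
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        total += buffer.getResult().length();  // appended after the last poll
        return total;
    }
}
```

With both methods synchronized, every appended character is drained exactly once, so the count returned equals the count requested.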

At this point, we may have introduced more questions than answers. So before we continue, let’s try to answer some of these questions.

How does synchronizing two different methods prevent the two threads calling those methods from stepping on each other? As stated earlier, synchronizing a method has the effect of serializing access to the method. This means that it is not possible to execute the same method on the same object in another thread while the method is already running. However, the implementation of this mechanism is done by a lock that is assigned to the object itself. The reason another thread cannot execute the same method at the same time is that the method requires the lock that is already held by the first thread. If two different synchronized methods of the same object are called, they also behave in the same fashion because they both require the lock of the same object, and it is not possible for both methods to grab the lock at the same time. In other words, even if two or more synchronized methods of an object are involved, they will never be run in parallel in separate threads. This is illustrated in Figure 3-3: when thread 1 and thread 2 attempt to acquire the same lock (L1), thread 2 must wait until thread 1 releases the lock before it can continue to execute.

Figure 3-3. Acquiring and releasing a lock
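The situation in the figure can be observed directly. In this sketch, the names Guarded and SameLockDemo are hypothetical, and the latches are used only to keep thread 1 inside its synchronized method long enough for us to inspect thread 2.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical class with two different synchronized methods.
class Guarded {
    // Holds the object lock until the release latch is opened.
    synchronized void holdUntil(CountDownLatch entered, CountDownLatch release) {
        entered.countDown();
        try {
            release.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // A different synchronized method of the same object.
    synchronized void other() { }
}

class SameLockDemo {
    // Returns true if thread 2 was seen waiting for the lock held by thread 1.
    static boolean secondThreadBlocks() {
        Guarded shared = new Guarded();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread t1 = new Thread(() -> shared.holdUntil(entered, release));
        Thread t2 = new Thread(shared::other);
        try {
            t1.start();
            entered.await();            // thread 1 now owns the lock
            t2.start();

            boolean blocked = false;
            long deadline = System.currentTimeMillis() + 5000;
            while (!blocked && System.currentTimeMillis() < deadline) {
                blocked = t2.getState() == Thread.State.BLOCKED;
                Thread.sleep(1);
            }

            release.countDown();        // thread 1 leaves and frees the lock
            t1.join();
            t2.join();                  // thread 2 can now finish
            return blocked;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Although the two threads call different methods, thread 2 reaches the BLOCKED state because both methods need the lock of the same object.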

The point to remember here is that the lock is based on a specific object and not on any particular method. Assume that we have two AsyncReadSocket objects called a and b that have been created in separate threads. One thread executes the a.getResult() method while the other thread executes the b.getResult() method. These two methods can execute in parallel because the call to a.getResult() grabs the lock of the object that a refers to, and the call to b.getResult() grabs the lock of the object that b refers to. Since the two objects are different objects, two different locks are grabbed by the two threads: neither thread has to wait for the other.
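One way to confirm that each object has its own lock is the static Thread.holdsLock() method, which reports whether the current thread owns the lock of a given object. The class names LockProbe and SeparateLocksDemo below are hypothetical.

```java
// Hypothetical class used to inspect which locks a synchronized method holds.
class LockProbe {
    // While this method runs, the calling thread owns this object's lock.
    synchronized boolean[] probe(LockProbe other) {
        return new boolean[] {
            Thread.holdsLock(this),   // lock of the object the method was called on
            Thread.holdsLock(other)   // lock of a different object
        };
    }
}

class SeparateLocksDemo {
    static boolean[] run() {
        LockProbe a = new LockProbe();
        LockProbe b = new LockProbe();
        return a.probe(b);
    }
}
```

Inside a.probe(b), the thread holds the lock of a but not the lock of b, so another thread is free to enter a synchronized method of b.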

Why do we need the appendResult() method? Couldn’t we simply put that code into the run() method and synchronize the run() method? We could do that, but the result would be disastrous. Every lock has an associated scope; that is, the amount of code for which the lock is valid. Synchronizing the run() method creates a scope that is too large and prevents other methods from being run at all.

The scope of the run() method is infinite, since the run() method executes an infinite loop. If both the run() method and getResult() method are synchronized, they cannot run in parallel in separate threads. Since the run() method has the task of opening the network socket and reading all the data from the socket until the connection is closed, it would need the object lock until the connection is closed. This means that while the connection is open, it would not be possible to execute the getResult() method. This is not the desired effect for a class that is supposed to read the data asynchronously.
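The difference in scope can be made visible with Thread.holdsLock(). In this sketch, ScopeProbe and ScopeDemo are hypothetical names, and readOnce() stands in for one pass through the loop of an unsynchronized run() method.

```java
// Hypothetical class that records where the object lock is held.
class ScopeProbe {
    boolean heldInLoopBody;
    boolean heldInAppend;

    // Not synchronized: the lock is not held while "reading".
    void readOnce() {
        heldInLoopBody = Thread.holdsLock(this);
        appendResult('x');
    }

    // Synchronized: the lock is held only for the duration of this call.
    synchronized void appendResult(char c) {
        heldInAppend = Thread.holdsLock(this);
    }
}

class ScopeDemo {
    // Returns {held in loop body, held in appendResult, held afterward}.
    static boolean[] run() {
        ScopeProbe probe = new ScopeProbe();
        probe.readOnce();
        return new boolean[] {
            probe.heldInLoopBody,
            probe.heldInAppend,
            Thread.holdsLock(probe)
        };
    }
}
```

The lock is held only inside appendResult(), which leaves getResult() free to run during the potentially long wait for data. Had readOnce() been synchronized, the first value would be true as well.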

How does a synchronized method behave in conjunction with a nonsynchronized method? Simply put, a synchronized method tries to grab the object lock, and a nonsynchronized method doesn’t. This means it is possible for many nonsynchronized methods to run in parallel with a synchronized method. Only one synchronized method of a given object runs at a time.

Synchronizing a method just means the lock is grabbed when that method executes. It is the developer’s responsibility to ensure that the correct methods are synchronized. Forgetting to synchronize a method can cause a race condition: if we had synchronized only the getResult() method of the AsyncReadSocket class and had forgotten to synchronize the appendResult() method, we would not have solved the race condition, since any thread could call the appendResult() method while the getResult() method was executing.
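The following sketch shows the forgotten keyword in action. HalfGuarded and ForgottenSyncDemo are hypothetical names; observe() plays the role of a synchronized getResult() that is still executing, and the latches only control the timing so that the outcome is repeatable.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical class in which only one of the two methods is synchronized.
class HalfGuarded {
    private final StringBuffer result = new StringBuffer();

    // Synchronized: holds the object lock from entry to return.
    synchronized int[] observe(CountDownLatch entered, CountDownLatch release) {
        int before = result.length();
        entered.countDown();
        try {
            release.await();          // still holding the lock here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { before, result.length() };
    }

    // Deliberately NOT synchronized, to mimic the forgotten keyword.
    void appendResult(char c) {
        result.append(c);
    }
}

class ForgottenSyncDemo {
    // Returns the buffer length seen at the start and at the end of observe().
    static int[] run() {
        HalfGuarded shared = new HalfGuarded();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        int[][] seen = new int[1][];

        Thread holder = new Thread(() -> seen[0] = shared.observe(entered, release));
        try {
            holder.start();
            entered.await();            // holder is inside observe()
            shared.appendResult('x');   // runs at once; no lock is requested
            release.countDown();
            holder.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen[0];
    }
}
```

The buffer length changes from 0 to 1 while the synchronized method is still holding the lock, which is exactly the interference that synchronizing both methods prevents.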
