Chapter 3. Sending and Receiving Data

Typically you use sockets because your program needs to provide information to, or use information provided by, another program. There is no magic: any programs that exchange information must agree on how that information will be encoded (represented as a sequence of bits), as well as which program sends what information when, and how the information received affects the behavior of the program. This agreement regarding the form and meaning of information exchanged over a communication channel is called a protocol; a protocol used in implementing a particular application is an application protocol. In our echo example from the earlier chapters, the application protocol is trivial: neither the client’s nor the server’s behavior is affected by the contents of the messages they exchange. Because in most real applications the behavior of clients and servers depends upon the information they exchange, application protocols are usually somewhat more complicated.

The TCP/IP protocols transport bytes of user data without examining or modifying them. This allows applications great flexibility in how they encode their information for transmission. Most application protocols are defined in terms of discrete messages made up of sequences of fields. Each field contains a specific piece of information encoded as a sequence of bits. The application protocol specifies exactly how these sequences of bits are to be arranged by the sender and interpreted, or parsed, by the receiver so that the latter can extract the meaning of each field. About the only constraint imposed by TCP/IP is that information must be sent and received in chunks whose length in bits is a multiple of eight. So from now on we consider messages to be sequences of bytes. Given this, it may be helpful to think of a transmitted message as a sequence or array of numbers, each between 0 and 255. That corresponds to the range of binary values that can be encoded in 8 bits: 00000000 for zero, 00000001 for one, 00000010 for two, and so on, up to 11111111 for 255.

When you build a program to exchange information via sockets with other programs, typically one of two situations applies: either you are designing/writing the programs on both sides of the socket, in which case you are free to define the application protocol yourself, or you are implementing a protocol that someone else has already specified, perhaps a protocol standard. In either case, the basic principles of encoding and decoding different types of information as bytes “on the wire” are the same. By the way, everything in this chapter also applies if the “wire” is a file that is written by one program and then read by another.

Encoding Information

Let’s first consider the question of how simple values such as ints, longs, chars, and Strings can be sent and received via sockets. We have seen that bytes of information can be transmitted through a socket by writing them to an OutputStream (associated with a Socket) or encapsulating them in a DatagramPacket (which is then sent via a DatagramSocket). However, the only data types to which these operations can be applied are bytes and arrays of bytes. As a strongly typed language, Java requires that other types—int, String, and so on—be explicitly converted to byte arrays. Fortunately, the language has built-in facilities to help with such conversions. We saw one of these in Section 2.2.1: in TCPEchoClient.java, the getBytes() method of String, which converts the characters in a String instance to bytes in a standard way. Before considering the details of that kind of conversion, we first consider the representation of the most basic data types.

Primitive Integers

As we have already seen, TCP and UDP sockets give us the ability to send and receive sequences (arrays) of bytes, i.e., integer values in the range 0–255. Using that ability, we can encode the values of other (larger) primitive integer types. However, the sender and receiver have to agree on several things first. One is the size (in bytes) of each integer to be transmitted. For example, an int value in a Java program is represented as a 32-bit quantity. We can therefore transmit the value of any variable or constant of type int using four bytes. Values of type short, on the other hand, are represented using 16 bits and so only require two bytes to transmit, while longs are 64 bits or eight bytes.

Let’s consider how we would encode a sequence of four integer values: a byte, a short, an int, and a long, in that order, for transmission from sender to receiver. We need a total of 15 bytes: the first contains the value of the byte, the next two contain the value of the short, the next four encode the value of the int, and the last eight bytes contain the long value, as shown below:

byte (1 byte) | short (2 bytes) | int (4 bytes) | long (8 bytes)

Are we ready to go? Not quite. For types that require more than one byte, we have to answer the question of which order to send the bytes in. There are two obvious choices: start at the right end of the integer, with the least significant byte—so-called little-endian order—or at the left end, with the most significant byte—big-endian order. (Note that the ordering of bits within bytes is, fortunately, handled by the implementation in a standard way.) Consider the long value 123456787654321L. Its 64-bit representation (in hexadecimal) is 0x0000704885F926B1. If we transmit the bytes in big-endian order, the sequence of (decimal) byte values will look like this:

0 0 112 72 133 249 38 177

If we transmit them in little-endian order, the sequence will be:

177 38 249 133 72 112 0 0

The main point is that for any multibyte integer quantity, the sender and receiver need to agree on whether big-endian or little-endian order will be used.[1] If the sender were to use little-endian order to send the above integer, and the receiver were expecting big-endian, then instead of the correct value, the receiver would interpret the transmitted eight-byte sequence as the value 12765164544669515776 when read as an unsigned quantity (a signed Java long would hold −5681579529040035840).
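The effect of byte order can be checked with a short sketch. It uses java.nio.ByteBuffer only for brevity; the class name ByteOrderDemo and its helper methods are illustrative assumptions, not part of the book's code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderDemo {
  // Returns the eight bytes of val laid out in the requested byte order.
  static byte[] toBytes(long val, ByteOrder order) {
    return ByteBuffer.allocate(Long.BYTES).order(order).putLong(val).array();
  }

  // Interprets eight bytes as a long, assuming the requested byte order.
  static long fromBytes(byte[] bytes, ByteOrder order) {
    return ByteBuffer.wrap(bytes).order(order).getLong();
  }

  // Renders each byte as an unsigned decimal value (0-255).
  static String unsigned(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(b & 0xFF).append(' ');
    }
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    long val = 123456787654321L;
    byte[] big = toBytes(val, ByteOrder.BIG_ENDIAN);
    byte[] little = toBytes(val, ByteOrder.LITTLE_ENDIAN);
    System.out.println("big-endian:    " + unsigned(big));
    System.out.println("little-endian: " + unsigned(little));

    // Sender used little-endian, receiver assumes big-endian.
    long misread = fromBytes(little, ByteOrder.BIG_ENDIAN);
    System.out.println("misread, signed:   " + misread);
    System.out.println("misread, unsigned: " + Long.toUnsignedString(misread));
  }
}
```

Because a Java long is signed, the misread value prints as a negative number unless it is formatted with Long.toUnsignedString().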

One last detail on which the sender and receiver must agree: whether the numbers transmitted will be signed or unsigned. The four primitive integer types in Java are all signed; values are stored in two’s-complement representation, which is the usual way of representing signed numbers. When dealing with signed k-bit numbers, the two’s-complement representation of the negative integer $-n$, $1 \le n \le 2^{k-1}$, is the binary value of $2^k - n$. The non-negative integer $p$, $0 \le p \le 2^{k-1} - 1$, is encoded simply by the k-bit binary value of $p$. Thus, given k bits, we can represent values in the range $-2^{k-1}$ through $2^{k-1} - 1$ using two’s-complement. Note that the most significant bit (msb) tells whether the value is non-negative (msb = 0) or negative (msb = 1). On the other hand, a k-bit, unsigned integer can encode values in the range 0 through $2^k - 1$ directly. So for example, the 32-bit value 0xFFFFFFFF (the all-ones value) when interpreted as a signed, two’s-complement integer represents −1; when interpreted as an unsigned integer, it represents 4,294,967,295. Because Java does not support unsigned integer types, encoding and decoding unsigned numbers in Java requires a little care. Assume for now that we are dealing with signed integer types.
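The two interpretations of the same bit pattern can be seen in a small sketch. The class UnsignedDemo and its method names are hypothetical; the technique is simply to widen the value to a larger type and mask off the sign extension.

```java
class UnsignedDemo {
  // Widens a byte to an int in the range 0..255 by masking off sign extension.
  static int unsignedByte(byte b) {
    return b & 0xFF;
  }

  // Widens a 32-bit pattern to a long in the range 0..4294967295.
  static long unsignedInt(int i) {
    return i & 0xFFFFFFFFL;
  }

  public static void main(String[] args) {
    int allOnes = 0xFFFFFFFF;
    System.out.println(allOnes);              // signed view: -1
    System.out.println(unsignedInt(allOnes)); // unsigned view: 4294967295

    byte b = (byte) 245;
    System.out.println(b);                    // signed view: -11
    System.out.println(unsignedByte(b));      // unsigned view: 245
  }
}
```

Note that the unsigned result needs a type wider than the original: an int for an unsigned byte, a long for an unsigned int.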

So how do we get the correct values into the byte array of the message? To allow you to see exactly what needs to happen, here’s how to do the encoding explicitly using “bit-diddling” (shifting and masking) operations. The program BruteForceCoding.java features a method encodeIntBigEndian() that can encode any value of one of the primitive types. Its arguments are the byte array into which the value is to be placed, the value to be encoded (represented as a long—which, as the largest type, can hold any of the other types), the offset in the array at which the value should start, and the size in bytes of the value to be written. If we encode at the sender, we must be able to decode at the receiver. BruteForceCoding also provides the decodeIntBigEndian() method for decoding a subset of a byte array into a Java long.

BruteForceCoding.java

 0 public class BruteForceCoding {
 1   private static byte byteVal = 101; // one hundred and one
 2   private static short shortVal = 10001; // ten thousand and one
 3   private static int intVal = 100000001; // one hundred million and one
 4   private static long longVal = 1000000000001L;// one trillion and one
 5
 6   private final static int BSIZE = Byte.SIZE / Byte.SIZE;
 7   private final static int SSIZE = Short.SIZE / Byte.SIZE;
 8   private final static int ISIZE = Integer.SIZE / Byte.SIZE;
 9   private final static int LSIZE = Long.SIZE / Byte.SIZE;
10
11   private final static int BYTEMASK = 0xFF; // 8 bits
12
13   public static String byteArrayToDecimalString(byte[] bArray) {
14     StringBuilder rtn = new StringBuilder();
15     for (byte b : bArray) {
16       rtn.append(b & BYTEMASK).append(" ");
17     }
18     return rtn.toString();
19   }
20
21   // Warning:   Untested preconditions (e.g., 0 <= size <= 8)
22   public static int encodeIntBigEndian(byte[] dst, long val, int offset, int size) {
23     for (int i = 0; i <  size; i++) {
24       dst[offset++] = (byte) (val >> ((size - i - 1) *  Byte.SIZE));
25     }
26     return offset;
27   }
28
29   // Warning:   Untested preconditions (e.g., 0 <= size <= 8)
30   public static long decodeIntBigEndian(byte[] val, int offset, int size) {
31     long rtn = 0;
32     for (int i = 0; i <  size; i++) {
33       rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
34     }
35     return rtn;
36   }
37
38   public static void main(String[] args) {
39     byte[] message = new byte[BSIZE + SSIZE + ISIZE + LSIZE];
40     // Encode the fields in the target byte array
41     int offset = encodeIntBigEndian(message, byteVal, 0, BSIZE);
42     offset = encodeIntBigEndian(message, shortVal, offset, SSIZE);
43     offset = encodeIntBigEndian(message, intVal, offset, ISIZE);
44     encodeIntBigEndian(message, longVal, offset, LSIZE);
45     System.out.println("Encoded message: " + byteArrayToDecimalString(message));
46
47     // Decode several fields
48     long value = decodeIntBigEndian(message, BSIZE, SSIZE);
49     System.out.println("Decoded short = " +  value);
50     value = decodeIntBigEndian(message, BSIZE + SSIZE + ISIZE, LSIZE);
51     System.out.println("Decoded long = " +  value);
52
53     // Demonstrate dangers of conversion
54     offset = 4;
55     value = decodeIntBigEndian(message, offset, BSIZE);
56     System.out.println("Decoded value (offset " + offset + ", size " + BSIZE + ") = "
57         + value);
58     byte bVal = (byte) decodeIntBigEndian(message, offset, BSIZE);
59     System.out.println("Same value as byte = " + bVal);
60   }
61
62 }

  1. Data items to encode: lines 1–4

  2. Numbers of bytes in Java integer primitives: lines 6–9

  3. byteArrayToDecimalString(): lines 13–19

    This method prints each byte from the given array as an unsigned decimal value. BYTEMASK keeps the byte value from being sign-extended when it is converted to an int in the call to append(), thus rendering it as an unsigned integer.

  4. encodeIntBigEndian(): lines 22–27

    The right-hand side of the assignment statement first shifts the value to the right so the byte we are interested in is in the low-order eight bits. The resulting value is then cast to the type byte, which throws away all but the low-order eight bits, and placed in the array at the appropriate location. This is iterated over size bytes of the given value, val. The new offset is returned so we need not keep track of it.

  5. decodeIntBigEndian(): lines 30–36

    Iterate over size bytes of the given array, accumulating the result in a long, which is shifted left at each iteration.

  6. Demonstrate methods: lines 38–60

    • Prepare array to receive series of integers: line 39

    • Encode items: lines 40–44

      The byte, short, int, and long are encoded into the array in the sequence described earlier.

    • Print contents of encoded array: line 45

    • Decode several fields from encoded byte array: lines 47–51

      Output should show the decoded values equal to the original constants.

    • Conversion problems: lines 53–59

      At offset 4, the byte value is 245 (decimal); however, when read as a signed byte value, it should be −11 (recall two’s-complement representation of signed integers). If we place the return value into a long, it simply becomes the last byte of a long, producing a value of 245. Placing the return value into a byte yields a value of −11. Which answer is correct depends on your application. If you expect a signed value from decoding N bytes, you must place the (long) result in a primitive integer type that uses exactly N bytes. If you expect an unsigned value from decoding N bytes, you must place the results in a primitive integer type that uses at least N + 1 bytes.

Note that there are several preconditions we might consider testing at the beginning of encodeIntBigEndian() and decodeIntBigEndian(), such as 0 ≤ size ≤ 8 and dst ≠ null. Can you name any others?

Running the program produces output showing the following (decimal) byte values:

101 39 17 5 245 225 1 0 0 0 232 212 165 16 1

As you can see, the brute-force method requires the programmer to do quite a bit of work: computing and naming the offset and size of each value, and invoking the encoding routine with the appropriate arguments. It would be even worse if the encodeIntBigEndian() method were not factored out as a separate method. It is therefore not the recommended approach, since Java provides some built-in mechanisms that are easier to use. Note that it does have the advantage that, in addition to the standard Java integer sizes, encodeIntBigEndian() works with any size integer from 1 to 8 bytes; for example, you can encode a seven-byte integer if you like.

A relatively easy way to construct the message in this example is to use the DataOutputStream and ByteArrayOutputStream classes. The DataOutputStream allows you to write primitive types like the integers we’ve been discussing to a stream: it provides writeByte(), writeShort(), writeInt(), and writeLong() methods, which take an integer value and write it to the stream in the appropriately sized big-endian two’s-complement representation. The ByteArrayOutputStream class takes the sequence of bytes written to a stream and converts it to a byte array. The code for building our message looks like this:

ByteArrayOutputStream buf = new ByteArrayOutputStream();
DataOutputStream out = new DataOutputStream(buf);
out.writeByte(byteVal);
out.writeShort(shortVal);
out.writeInt(intVal);
out.writeLong(longVal);
out.flush();
byte[] msg = buf.toByteArray();

You may want to run this code to convince yourself that it produces the same output as BruteForceCoding.java.
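For that check, the fragment can be wrapped in a small standalone program. This is a minimal sketch: the class name DataStreamEncoding and the encode() helper are assumptions made for illustration, and the field values are the constants from BruteForceCoding.java.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataStreamEncoding {
  // Encodes a byte, short, int, and long (in that order) in big-endian form.
  static byte[] encode(byte b, short s, int i, long l) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeByte(b);
    out.writeShort(s);
    out.writeInt(i);
    out.writeLong(l);
    out.flush();
    return buf.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] msg = encode((byte) 101, (short) 10001, 100000001, 1000000000001L);
    StringBuilder sb = new StringBuilder();
    for (byte x : msg) {
      sb.append(x & 0xFF).append(' '); // mask to print each byte as unsigned
    }
    System.out.println("Encoded message: " + sb.toString().trim());
  }
}
```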

So much for the sending side. How does the receiver recover the transmitted values? As you might expect, there are input analogues for the output facilities we used, namely DataInputStream and ByteArrayInputStream. We’ll show an example of their use later, when we discuss how to parse incoming messages. Also, in Chapter 5, we’ll see another way of converting primitive types to byte sequences, using the ByteBuffer class.
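As a brief preview, the receiving side might look like the following sketch. The class DataStreamDecoding and its decode() helper are hypothetical names; the byte array is the 15-byte message built from the BruteForceCoding.java constants.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class DataStreamDecoding {
  // Decodes a byte, short, int, and long (in that order, big-endian) from msg.
  static long[] decode(byte[] msg) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(msg));
    long[] fields = new long[4];
    fields[0] = in.readByte();  // 1 byte, sign-extended
    fields[1] = in.readShort(); // 2 bytes
    fields[2] = in.readInt();   // 4 bytes
    fields[3] = in.readLong();  // 8 bytes
    return fields;
  }

  public static void main(String[] args) throws IOException {
    byte[] msg = {101, 39, 17, 5, (byte) 245, (byte) 225, 1,
                  0, 0, 0, (byte) 232, (byte) 212, (byte) 165, 16, 1};
    for (long field : decode(msg)) {
      System.out.println(field);
    }
  }
}
```

The read calls must mirror the order and sizes of the write calls on the sending side; nothing in the byte stream itself records where one field ends and the next begins.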

Finally, essentially everything in this subsection applies also to the BigInteger class, which supports arbitrarily large integers. As with the primitive integer types, sender and receiver have to agree on a specific size (number of bytes) to represent the value. However, this defeats the purpose of using a BigInteger, which can be arbitrarily large. One approach is to use length-based framing, which we’ll see in Section 3.3.
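One way this could look for BigInteger is sketched below, assuming a two-byte length prefix followed by the value's big-endian two's-complement bytes. The class BigIntegerCoding, its method names, and the choice of a two-byte prefix are illustrative assumptions only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.math.BigInteger;

class BigIntegerCoding {
  private static final int MAX_LENGTH = 0xFFFF; // largest length a two-byte prefix can hold

  // Writes a two-byte length followed by the two's-complement bytes of val.
  static byte[] encode(BigInteger val) throws IOException {
    byte[] data = val.toByteArray(); // big-endian, minimal length
    if (data.length > MAX_LENGTH) {
      throw new IOException("value too large for two-byte length prefix");
    }
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeShort(data.length);
    out.write(data);
    out.flush();
    return buf.toByteArray();
  }

  // Reads the length prefix, then exactly that many bytes of value data.
  static BigInteger decode(byte[] msg) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(msg));
    int length = in.readUnsignedShort();
    byte[] data = new byte[length];
    in.readFully(data);
    return new BigInteger(data);
  }

  public static void main(String[] args) throws IOException {
    BigInteger original = BigInteger.ONE.shiftLeft(100).negate();
    BigInteger decoded = decode(encode(original));
    System.out.println(original.equals(decoded));
  }
}
```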

Strings and Text

Old-fashioned text, meaning strings of printable (displayable) characters, is perhaps the most common way to represent information. Text is convenient because humans are accustomed to dealing with all kinds of information represented as strings of characters in books, newspapers, and on computer displays. Thus, once we know how to encode text for transmission, we can send almost any other kind of data: first represent it as text, then encode the text. Obviously we can represent numbers and boolean values as Strings, for example "123478962", "6.02e23", "true", "false". And we’ve already seen that a string can be converted to a byte array by calling the getBytes() method (see TCPEchoClient.java). Alas, there is more to it than that.

To better understand what’s going on, we first need to consider that text is made up of symbols or characters. In fact, every String instance corresponds to a sequence (array) of characters (type char[ ]). A char value in Java is represented internally as an integer. For example, the character "a", that is, the symbol for the letter “a”, corresponds to the integer 97. The character "X" corresponds to 88, and the symbol "!" (exclamation mark) corresponds to 33.

A mapping between a set of symbols and a set of integers is called a coded character set. You may have heard of the coded character set known as ASCII (American Standard Code for Information Interchange). ASCII maps the letters of the English alphabet, digits, punctuation and some other special (non-printable) symbols to integers between 0 and 127. It has been used for data transmission since the 1960s, and is used extensively in application protocols such as HTTP (the protocol used for the World Wide Web), even today. However, because it omits symbols used by many languages other than English, it is less than ideal for developing applications and protocols designed to function in today’s global economy.

Java therefore uses an international standard coded character set called Unicode to represent values of type char and String. Unicode maps symbols from “most of the languages and symbol systems of the world” [19] to integers, and is much better suited for internationalized programs. A Java char holds a 16-bit value between 0 and 65,535, which covers the most commonly used symbols; symbols that Unicode maps to larger integers are represented in a String by a pair of char values. For example, the Japanese Hiragana symbol for the syllable “o” maps to the integer 12,362. Unicode includes ASCII: each symbol defined by ASCII maps to the same integer in Unicode as it does in ASCII. This provides a degree of backward compatibility between ASCII and Unicode.

So sender and receiver have to agree on a mapping from symbols to integers in order to communicate using text messages. Is that all they need to agree on? It depends. For a small set of characters with no integer value larger than 255, nothing more is needed because each character can be encoded as a single byte. For a code that may use larger integer values that require more than a single byte to represent, there is more than one way to encode those values on the wire. Thus, sender and receiver need to agree on how those integers will be represented as byte sequences—that is, an encoding scheme. The combination of a coded character set and a character encoding scheme is called a charset (see RFC 2278). It is possible to define your own charset, but there is hardly ever a reason to do so. A large number of different standardized charsets are in use around the world. Java provides support for the use of arbitrary charsets, and every implementation is required to support at least the following: US-ASCII (another name for ASCII), ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, UTF-16.

When you invoke the getBytes() method of a String instance, it returns a byte array containing the String encoded according to the default charset for the platform. On many platforms the default charset is UTF-8; however, in localities that make frequent use of characters outside the ASCII charset, it may be something different. To ensure that a string is encoded using a particular charset, you simply supply the name of the charset as a (String) argument to the getBytes() method. The resulting byte array contains the representation of the string in the given encoding. (Note that in the example TCP Echo Client/Server from Section 2.2.1 the encoding was irrelevant because the server did not interpret the received data at all.)

For example, if you call "Test!".getBytes() on the platform on which this book was written, you get back the encoding according to UTF-8:

84 101 115 116 33

If you call "Test!".getBytes("UTF-16BE"), on the other hand, you get the following array:

0 84 0 101 0 115 0 116 0 33

In this case each value is encoded as a two-byte sequence, with the high-order byte first. From "Test!".getBytes("IBM037"), the result is:

227 133 162 163 90

The moral of the story is that sender and receiver must agree on the representation for strings of text. The easiest way for them to do that is to simply specify one of the standard charsets.
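The following sketch shows what happens when the two sides disagree. It uses the StandardCharsets constants (available since Java 7) rather than charset names passed as Strings, which is a substitution made here for brevity; the class name CharsetDemo and the sample text are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;

class CharsetDemo {
  // Renders each byte as an unsigned decimal value (0-255).
  static String unsigned(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(b & 0xFF).append(' ');
    }
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    String text = "Test!";
    System.out.println(unsigned(text.getBytes(StandardCharsets.UTF_8)));
    System.out.println(unsigned(text.getBytes(StandardCharsets.UTF_16BE)));

    // Sender encodes with UTF-8; receiver wrongly decodes with ISO-8859-1.
    String original = "caf\u00e9"; // "cafe" with an acute accent on the e
    byte[] wire = original.getBytes(StandardCharsets.UTF_8);
    String misread = new String(wire, StandardCharsets.ISO_8859_1);
    System.out.println(original.length() + " chars sent, "
        + misread.length() + " chars received");

    // Decoding with the agreed charset recovers the original text.
    String correct = new String(wire, StandardCharsets.UTF_8);
    System.out.println(original.equals(correct));
  }
}
```

The accented character occupies two bytes in UTF-8, so a receiver that assumes one byte per character sees two wrong characters in its place.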

As we have seen, it is possible to write Strings to an OutputStream by first converting them individually to bytes and then writing the result to the stream. That method requires that the encoding be specified on every call to getBytes(). Later in the chapter we’ll see a way to simply specify the encoding once when constructing text messages.

Bit-Diddling: Encoding Booleans

Bitmaps are a very compact way to encode boolean information, which is often used in protocols. The idea of a bitmap is that each of the bits of an integer type can encode one boolean value, typically with 0 representing false, and 1 representing true. To be able to manipulate bitmaps, you need to know how to set and clear individual bits using Java’s “bit-diddling” operations. A mask is an integer value that has one or more specific bits set to 1, and all others cleared (i.e., 0). We’ll deal here mainly with int-sized bitmaps and masks (32 bits), but everything we say applies to other integer types as well.

Let’s number the bits of a value of type int from 0 to 31, where bit 0 is the least significant bit. In general, the int value that has a 1 in bit position i, and a zero in all other bit positions, is just $2^i$. So bit 5 is represented by 32, bit 12 by 4096, etc. Here are some example mask declarations:

final int BIT5 = (1<<5);
final int BIT7 = 0x80;
final int BITS2AND3 = 12;   // 8+4
int bitmap = 1234567;

To set a particular bit in an int variable, combine it with the mask for that bit using the bitwise-OR operation (|):

bitmap |= BIT5;
// bit 5 is now one

To clear a particular bit, bitwise-AND it with the bitwise complement of the mask for that bit (which has ones everywhere except the particular bit, which is zero). The bitwise-AND operation in Java is &, while the bitwise-complement operator is ~.

bitmap &= ~BIT7;
// bit 7 is now zero

You can set and clear multiple bits at once by OR-ing together the corresponding masks:

// clear bits 2, 3 and 5
bitmap &= ~(BITS2AND3|BIT5);

To test whether a bit is set, compare the result of the bitwise-AND of the mask and the value with zero:

boolean bit6Set = (bitmap & (1<<6)) != 0;
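The fragments above can be collected into one runnable sketch. The class BitmapDemo and the helper methods set(), clear(), and isSet() are hypothetical names introduced here; the masks and the starting value are the ones declared earlier.

```java
class BitmapDemo {
  static final int BIT5 = (1 << 5);
  static final int BIT7 = 0x80;
  static final int BITS2AND3 = 12; // 8 + 4

  // Sets every bit of bitmap that is 1 in mask.
  static int set(int bitmap, int mask) {
    return bitmap | mask;
  }

  // Clears every bit of bitmap that is 1 in mask.
  static int clear(int bitmap, int mask) {
    return bitmap & ~mask;
  }

  // Tests a single bit position (0 = least significant bit).
  static boolean isSet(int bitmap, int bit) {
    return (bitmap & (1 << bit)) != 0;
  }

  public static void main(String[] args) {
    int bitmap = 1234567;                        // 0x12D687
    bitmap = set(bitmap, BIT5);                  // bit 5 is now one
    bitmap = clear(bitmap, BIT7);                // bit 7 is now zero
    bitmap = clear(bitmap, BITS2AND3 | BIT5);    // bits 2, 3 and 5 are now zero
    System.out.println(Integer.toHexString(bitmap));
    System.out.println("bit 6 set? " + isSet(bitmap, 6));
  }
}
```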

Composing I/O Streams

Java’s stream classes can be composed to provide powerful capabilities. For example, we can wrap the OutputStream of a Socket instance in a BufferedOutputStream instance to improve performance by buffering bytes temporarily and flushing them to the underlying channel all at once. We can then wrap that instance in a DataOutputStream to send primitive data types. We would code this composition as follows:

Socket socket = new Socket(server, port);
DataOutputStream out = new DataOutputStream(
    new BufferedOutputStream(socket.getOutputStream()));

Figure 3.1 demonstrates this composition. Here, we write our primitive data values, one by one, to the DataOutputStream, which writes the binary data to the BufferedOutputStream, which buffers the data from the three writes and then writes once to the socket’s OutputStream, which controls writing to the network. We create a corresponding composition for the InputStream on the other endpoint to efficiently receive primitive data types.

Figure 3.1. Stream composition.
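The composition can be tried without a network by substituting in-memory streams for the socket's streams. In this sketch the class StreamCompositionDemo and the methods send() and receive() are assumptions for illustration; with a real Socket, the sink and source would come from getOutputStream() and getInputStream().

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCompositionDemo {
  // Writes three primitives through DataOutputStream -> BufferedOutputStream -> sink.
  static void send(OutputStream sink) throws IOException {
    DataOutputStream out = new DataOutputStream(new BufferedOutputStream(sink));
    out.writeByte(101);
    out.writeShort(10001);
    out.writeInt(100000001);
    out.flush(); // hands the buffered bytes to the sink
  }

  // Reads the same three primitives back through the mirrored composition.
  static long[] receive(InputStream source) throws IOException {
    DataInputStream in = new DataInputStream(new BufferedInputStream(source));
    long[] values = new long[3];
    values[0] = in.readByte();
    values[1] = in.readShort();
    values[2] = in.readInt();
    return values;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
    send(wire);
    long[] values = receive(new ByteArrayInputStream(wire.toByteArray()));
    System.out.println(values[0] + " " + values[1] + " " + values[2]);
  }
}
```

Forgetting the flush() call is a common mistake with this arrangement: the bytes stay in the buffer and the peer never sees them.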

A complete description of the Java I/O API is beyond the scope of this text; however, Table 3.1 provides a list of some of the relevant Java I/O classes as a starting point for exploiting its capabilities.

Table 3.1. Java I/O Classes

I/O Class                       Function
------------------------------  ----------------------------------------------------
Buffered[Input/Output]Stream    Performs buffering for I/O optimization.
Checked[Input/Output]Stream     Maintains a checksum on data.
Cipher[Input/Output]Stream      Encrypts/decrypts data.
Data[Input/Output]Stream        Handles read/write for primitive data types.
Digest[Input/Output]Stream      Maintains a digest on data.
GZIP[Input/Output]Stream        De/compresses a byte stream in GZIP format.
Object[Input/Output]Stream      Handles read/write of objects and primitive data types.
PushbackInputStream             Allows a byte or bytes to be “unread.”
PrintStream                     Prints string representation of data type.
Zip[Input/Output]Stream         De/compresses a byte stream in ZIP format.

Framing and Parsing

Converting data to wire format is, of course, only half the story; the original information must be recovered at the receiver from the transmitted sequence of bytes. Application protocols typically deal with discrete messages, which are viewed as collections of fields. Framing refers to the problem of enabling the receiver to locate the beginning and end of a message. Whether information is encoded as text, as multibyte binary numbers, or as some combination of the two, the application protocol must specify how the receiver of a message can determine when it has received all of the message.

Of course, if a complete message is sent as the payload of a DatagramPacket, the problem is trivial: the payload of the DatagramPacket has a definite length, and the receiver knows exactly where the message ends. For messages sent over TCP sockets, however, the situation can be more complicated because TCP has no notion of message boundaries. If the fields in a message all have fixed sizes and the message is made up of a fixed number of fields, then the size of the message is known in advance and the receiver can simply read the expected number of bytes into a byte[ ] buffer. This technique was used in TCPEchoClient.java, where we knew the number of bytes to expect from the server. However, when the message can vary in length—for example, if it contains some variable-length arbitrary text strings—we do not know beforehand how many bytes to read.
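Even in the fixed-size case, the receiver needs a loop, because a single read() on a TCP stream may return fewer bytes than were requested. A minimal sketch follows; the class ExactRead and the method readExactly() are hypothetical names, and DataInputStream.readFully() offers the same behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ExactRead {
  // Reads exactly length bytes, or throws EOFException if the stream ends first.
  static byte[] readExactly(InputStream in, int length) throws IOException {
    byte[] buf = new byte[length];
    int total = 0;
    while (total < length) {
      int n = in.read(buf, total, length - total); // may return fewer bytes than asked
      if (n == -1) {
        throw new EOFException("Stream ended after " + total + " of " + length + " bytes");
      }
      total += n;
    }
    return buf;
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
    byte[] message = readExactly(in, 5);
    System.out.println("Read " + message.length + " bytes");
  }
}
```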

If a receiver tries to receive more bytes from the socket than were in the message, one of two things can happen. If no other message is in the channel, the receiver will block and be prevented from processing the message; if the sender is also blocked waiting for a reply, the result will be deadlock. On the other hand, if another message is in the channel, the receiver may read some or all of it as part of the first message, leading to protocol errors. Therefore framing is an important consideration when using TCP sockets.

Note that some of the same considerations apply to finding the boundaries of the individual fields of the message: the receiver needs to know where one ends and another begins. Thus, pretty much everything we say here about framing messages also applies to fields. However, it is simplest, and also leads to the cleanest code, if you deal with these two problems separately: first locate the end of the message, then parse the message as a whole. Here we focus on framing complete messages.

Two general techniques enable a receiver to unambiguously find the end of the message:

  • Delimiter-based: The end of the message is indicated by a unique marker, an explicit byte sequence that the sender transmits immediately following the data. The marker must be known not to occur in the data.

  • Explicit length: The variable-length field or message is preceded by a (fixed-size) length field that tells how many bytes it contains.

A special case of the delimiter-based method can be used for the last message sent on a TCP connection: the sender simply closes the sending side of the connection (using shutdownOutput() or close()) after sending the message. After the receiver reads the last byte of the message, it receives an end-of-stream indication (i.e., read() returns −1), and thus can tell that it has reached the end of the message.

The delimiter-based approach is often used with messages encoded as text: A particular character or sequence of characters is defined to mark the end of the message. The receiver simply scans the input (as characters) looking for the delimiter sequence; it returns the character string preceding the delimiter. The drawback is that the message itself must not contain the delimiter, otherwise the receiver will find the end of the message prematurely. With a delimiter-based framing method, the sender is responsible for ensuring that this precondition is satisfied. Fortunately so-called stuffing techniques allow delimiters that occur naturally in the message to be modified so the receiver will not recognize them as such; as it scans for the delimiter, it also recognizes the modified delimiters and restores them in the output message so it matches the original. The downside of such techniques is that both sender and receiver have to scan the message.
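One possible stuffing scheme is sketched below, assuming a newline delimiter and an escape byte: each delimiter in the data is replaced by the escape byte followed by a code, and the escape byte itself is escaped the same way. The class ByteStuffing, the choice of escape byte, and the codes are all assumptions for illustration and do not come from any particular protocol.

```java
import java.io.ByteArrayOutputStream;

class ByteStuffing {
  static final byte DELIMITER = '\n';
  static final byte ESCAPE = 0x1B;   // hypothetical escape byte
  static final byte CODE_DELIM = 'n'; // ESCAPE + 'n' stands for DELIMITER
  static final byte CODE_ESC = 'e';   // ESCAPE + 'e' stands for ESCAPE

  // Sender side: rewrites msg so that it contains no DELIMITER byte.
  static byte[] stuff(byte[] msg) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte b : msg) {
      if (b == DELIMITER) {
        out.write(ESCAPE);
        out.write(CODE_DELIM);
      } else if (b == ESCAPE) {
        out.write(ESCAPE);
        out.write(CODE_ESC);
      } else {
        out.write(b);
      }
    }
    return out.toByteArray();
  }

  // Receiver side: restores the original bytes from a stuffed message.
  static byte[] unstuff(byte[] stuffed) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < stuffed.length; i++) {
      byte b = stuffed[i];
      if (b != ESCAPE) {
        out.write(b);
        continue;
      }
      if (++i >= stuffed.length) {
        throw new IllegalArgumentException("Escape byte at end of message");
      }
      if (stuffed[i] == CODE_DELIM) {
        out.write(DELIMITER);
      } else if (stuffed[i] == CODE_ESC) {
        out.write(ESCAPE);
      } else {
        throw new IllegalArgumentException("Unknown escape code");
      }
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] original = "line one\nline two".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
    byte[] stuffed = stuff(original);
    System.out.println(original.length + " bytes before, " + stuffed.length + " after stuffing");
    System.out.println(java.util.Arrays.equals(original, unstuff(stuffed)));
  }
}
```

The stuffed message can grow to at most twice its original size, which happens when every byte needs escaping.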

The length-based approach is simpler, but requires a known upper bound on the size of the message. The sender first determines the length of the message, encodes it as an integer, and prefixes the result to the message. The upper bound on the message length determines the number of bytes required to encode the length: one byte if messages always contain fewer than 256 bytes, two bytes if they are always shorter than 65,536 bytes, and so on.

In order to demonstrate these techniques, we introduce the interface Framer, which is defined below. It has two methods: frameMsg() adds framing information and outputs a given message to a given stream, while nextMsg() scans a given stream, extracting the next message.

Framer.java

0 import java.io.IOException;
1 import java.io.OutputStream;
2
3 public interface Framer {
4   void frameMsg(byte[] message, OutputStream out) throws IOException;
5   byte[] nextMsg() throws IOException;
6 }

The class DelimFramer.java implements delimiter-based framing using the “newline” character ("\n", byte value 10). The frameMsg() method does not do stuffing, but simply throws an exception if the byte sequence to be framed contains the delimiter. (Extending the method to do stuffing is left as an exercise.) The nextMsg() method scans the stream until it reads the delimiter, then returns everything up to the delimiter; null is returned if the stream is empty. If some bytes of a message are accumulated and the stream ends without finding a delimiter, an exception is thrown to indicate a framing error.

DelimFramer.java

 0 import java.io.ByteArrayOutputStream;
 1 import java.io.EOFException;
 2 import java.io.IOException;
 3 import java.io.InputStream;
 4 import java.io.OutputStream;
 5
 6 public class DelimFramer implements Framer {
 7
 8   private InputStream in;   // data source
 9   private static final byte DELIMITER = '\n'; // message delimiter
10
11   public DelimFramer(InputStream in) {
12     this.in = in;
13   }
14
15   public void frameMsg(byte[] message, OutputStream out) throws IOException {
16     // ensure that the message does not contain the delimiter
17     for (byte b : message) {
18       if (b == DELIMITER) {
19         throw new IOException("Message contains delimiter");
20       }
21     }
22     out.write(message);
23     out.write(DELIMITER);
24     out.flush();
25   }
26
27   public byte[] nextMsg() throws IOException {
28     ByteArrayOutputStream messageBuffer = new ByteArrayOutputStream();
29     int nextByte;
30
31     // fetch bytes until find delimiter
32     while ((nextByte = in.read()) != DELIMITER) {
33       if (nextByte == -1) { // end of stream?
34         if (messageBuffer.size() == 0) { // if no byte read
35           return null;
36         } else { // if bytes followed by end of stream: framing error
37           throw new EOFException("Non-empty message without delimiter");
38         }
39       }
40       messageBuffer.write(nextByte); // write byte to buffer
41     }
42
43     return messageBuffer.toByteArray();
44   }
45 }

DelimFramer.java

  1. Constructor: lines 11–13

    The input stream from which messages are to be extracted is given as an argument.

  2. frameMsg() adds framing information: lines 15–25

    • Verify well-formedness: lines 17–21

      Check that the given message does not contain the delimiter; if so, throw an exception.

    • Write message: line 22

      Output the message bytes to the stream.

    • Write delimiter: line 23

  3. nextMsg() extracts messages from input: lines 27–44

    • Read each byte in the stream until the delimiter is found: line 32

    • Handle end of stream: lines 33–39

      If the end of stream occurs before finding the delimiter, throw an exception if any bytes have been read since construction of the framer or the last delimiter; otherwise return null to indicate that all messages have been received.

    • Write non-delimiter byte to message buffer: line 40

    • Return contents of message buffer as byte array: line 43

There’s a limitation to our delimiting framer: it does not support multibyte delimiters. We leave fixing this as an exercise for the reader.
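To see the framing behavior in isolation, the following minimal sketch frames several messages into an in-memory byte array and reads them back. The class DelimRoundTrip and its static methods are hypothetical stand-ins that mirror the logic of DelimFramer; they are not part of the chapter's code, and in-memory streams stand in for a socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; mirrors DelimFramer but uses static methods.
class DelimRoundTrip {
  static final byte DELIMITER = '\n';

  // Append one message plus the delimiter; reject messages that contain it.
  static void frame(byte[] message, OutputStream out) throws IOException {
    for (byte b : message) {
      if (b == DELIMITER) {
        throw new IOException("Message contains delimiter");
      }
    }
    out.write(message);
    out.write(DELIMITER);
  }

  // Return the next message, or null if the stream ended cleanly.
  static byte[] next(InputStream in) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    int b;
    while ((b = in.read()) != DELIMITER) {
      if (b == -1) {
        if (buf.size() == 0) {
          return null;
        }
        throw new EOFException("Non-empty message without delimiter");
      }
      buf.write(b);
    }
    return buf.toByteArray();
  }

  // Frame each string, then extract every message from the resulting bytes.
  static String roundTrip(String... messages) {
    try {
      ByteArrayOutputStream wire = new ByteArrayOutputStream();
      for (String m : messages) {
        frame(m.getBytes(StandardCharsets.US_ASCII), wire);
      }
      InputStream in = new ByteArrayInputStream(wire.toByteArray());
      StringBuilder sb = new StringBuilder();
      byte[] msg;
      while ((msg = next(in)) != null) {
        sb.append('[').append(new String(msg, StandardCharsets.US_ASCII)).append(']');
      }
      return sb.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("hello", "", "world")); // [hello][][world]
  }
}
```

Note that an empty message is legal: it is framed as a lone delimiter and comes back as a zero-length array, which is distinct from the null that signals end of stream.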

The class LengthFramer.java implements length-based framing for messages up to 65,535 (2^16 − 1) bytes in length. The sender determines the length of the given message and writes it to the output stream as a two-byte, big-endian integer, followed by the complete message. On the receiving side, we use a DataInputStream to be able to read the length as an integer; the readFully() method blocks until the given array is completely full, which is exactly what we need here. Note that, with this framing method, the sender does not have to inspect the content of the message being framed; it needs only to check that the message does not exceed the length limit.

LengthFramer.java

 0 import java.io.DataInputStream;
 1 import java.io.EOFException;
 2 import java.io.IOException;
 3 import java.io.InputStream;
 4 import java.io.OutputStream;
 5
 6 public class LengthFramer implements Framer {
 7   public static final int MAXMESSAGELENGTH = 65535;
 8   public static final int BYTEMASK = 0xff;
 9   public static final int SHORTMASK = 0xffff;
10   public static final int BYTESHIFT = 8;
11
12   private DataInputStream in; // wrapper for data I/O
13
14   public LengthFramer(InputStream in) throws IOException {
15     this.in = new DataInputStream(in);
16   }
17
18   public void frameMsg(byte[] message, OutputStream out) throws IOException {
19     if (message.length > MAXMESSAGELENGTH) {
20       throw new IOException("message too long");
21     }
22     // write length prefix
23     out.write((message.length >> BYTESHIFT) & BYTEMASK);
24     out.write(message.length & BYTEMASK);
25     // write message
26     out.write(message);
27     out.flush();
28   }
29
30   public byte[] nextMsg() throws IOException {
31     int length;
32     try {
33       length = in.readUnsignedShort(); // read 2 bytes
34     } catch (EOFException e) { // no (or 1 byte) message
35       return null;
36     }
37     // 0 <= length <= 65535
38     byte[] msg = new byte[length];
39     in.readFully(msg); // if exception, it's a framing error.
40     return msg;
41   }
42 }

LengthFramer.java

  1. Constructor: lines 14–16

    Take the input stream source for framed messages and wrap it in a DataInputStream.

  2. frameMsg() adds framing information: lines 18–28

    • Verify length: lines 19–21

      Because we use a two-byte length field, the length cannot exceed 65,535. (Note that this value is too big to store in a short, so we write it a byte at a time.)

    • Output length field: lines 23–24

      Output the message bytes prefixed by the length (unsigned short).

    • Output message: line 26

  3. nextMsg() extracts next frame from input: lines 30–41

    • Read the prefix length: lines 32–36

      The readUnsignedShort() method reads two bytes, interprets them as a big-endian integer, and returns their value as an int.

    • Read the specified number of bytes: lines 38–39

      The readFully() method blocks until enough bytes to fill the given array have been returned.

    • Return bytes as message: line 40
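A small worked example may make the prefix arithmetic concrete. The sketch below frames a 300-byte message and reads it back; LengthPrefixSketch is a hypothetical class written for illustration (it works on byte arrays instead of socket streams), but the shift-and-mask steps are the same as in frameMsg() above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; operates on arrays rather than socket streams.
class LengthPrefixSketch {
  // Prefix the message with its length as a two-byte, big-endian integer.
  static byte[] frame(byte[] message) {
    if (message.length > 65535) {
      throw new IllegalArgumentException("message too long");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write((message.length >> 8) & 0xff); // high-order byte first
    out.write(message.length & 0xff);        // then low-order byte
    out.write(message, 0, message.length);
    return out.toByteArray();
  }

  // Read one length-prefixed message back out of a framed byte array.
  static byte[] unframe(byte[] framed) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
      byte[] msg = new byte[in.readUnsignedShort()]; // 0 <= length <= 65535
      in.readFully(msg); // throws EOFException if the frame is short
      return msg;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] framed = frame(new byte[300]);
    // 300 = 0x012C, so the prefix bytes are 0x01 and 0x2C
    System.out.printf("prefix: %02x %02x, total %d bytes%n",
        framed[0], framed[1], framed.length);
    System.out.println("recovered " + unframe(framed).length + " bytes");
  }
}
```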

Java-Specific Encodings

When you use sockets, generally either you are building the programs on both ends of the communication channel—in which case you also have complete control over the protocol—or you are communicating using a given protocol, which you have to implement. When you know that (i) both ends of the communication will be implemented in Java, and (ii) you have complete control over the protocol, you can make use of Java’s built-in facilities like the Serializable interface or the Remote Method Invocation (RMI) facility. RMI lets you invoke methods on different Java virtual machines, hiding all the messy details of argument encoding and decoding. Serialization handles conversion of actual Java objects to byte sequences for you, so you can transfer actual instances of Java objects between virtual machines.

These capabilities might seem like communication Nirvana, but in reality they are not always the best solution, for several reasons. First, because they are very general facilities, they are not the most efficient in terms of communication overhead. For example, the serialized form of an object generally includes information that is meaningless outside the context of the Java Virtual Machine (JVM). Second, Serializable and Externalizable cannot be used when a different wire format has already been specified—for example, by a standardized protocol. And finally, custom-designed classes have to provide their own implementations of the serialization interfaces, and this is not easy to get right. Again, there are certainly situations where these built-in facilities are useful; but sometimes it is simpler, easier, or more efficient to “roll your own.”
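The overhead claim is easy to check for yourself. The following sketch compares the serialized size of a tiny object with the size of the same two fields written by hand through a DataOutputStream. The Tally class is a hypothetical example invented here; the exact serialized size depends on the class name and JVM serialization format, so the code prints it without assuming a particular number.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationOverhead {
  // Hypothetical class holding 12 bytes of actual data (int + long).
  static class Tally implements Serializable {
    private static final long serialVersionUID = 1L;
    int candidateID = 888;
    long voteCount = 5;
  }

  // Size of the object as written by Java's built-in serialization.
  static int serializedSize() {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      ObjectOutputStream out = new ObjectOutputStream(bytes);
      out.writeObject(new Tally());
      out.flush();
      return bytes.size();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Size of the same two fields encoded by hand: 4 + 8 bytes.
  static int handEncodedSize() {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      Tally t = new Tally();
      out.writeInt(t.candidateID);
      out.writeLong(t.voteCount);
      out.flush();
      return bytes.size();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("serialized:   " + serializedSize() + " bytes");
    System.out.println("hand-encoded: " + handEncodedSize() + " bytes");
  }
}
```

The serialized form carries a stream header and a description of the class in addition to the field values, which is where the extra bytes come from.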

Constructing and Parsing Protocol Messages

We close this chapter with a simple example to illustrate some techniques you might use to implement a protocol specified by someone else. The example is a simple “voting” protocol as shown in Figure 3.2. Here a client sends a request message to a server; the message contains a candidate ID, which is an integer between 0 and 1000. Two types of requests are supported. An inquiry asks the server how many votes have been cast for the given candidate. The server sends back a response message containing the original candidate ID and the vote total (as of the time the request was received) for that candidate. A voting request actually casts a vote for the indicated candidate. The server again responds with a message containing the candidate ID and the vote total (which now includes the vote just cast).

Figure 3.2. Voting protocol.

In implementing a protocol, it is helpful to define a class to contain the information carried in a message. The class provides methods for manipulating the fields of the message—while maintaining the invariants that are supposed to hold among those fields. For our simple example, the messages sent by client and server are very similar. The only difference is that the messages sent by the server contain the vote count and a flag indicating that they are responses (not requests). In this case, we can get away with a single class for both kinds of messages. The VoteMsg.java class shows the basic information in each message:

  • a boolean isInquiry, which is true if the requested transaction is an inquiry (and false if it is an actual vote);

  • a boolean isResponse indicating whether the message is a response (sent by the server) or request;

  • an integer candidateID that identifies the candidate;

  • a long voteCount indicating the vote total for the requested candidate.

    The class maintains the following invariants among the fields:

  • candidateID is in the range 0–1000.

  • voteCount is only nonzero in response messages (isResponse is true).

  • voteCount is non-negative.

VoteMsg.java

 0 public class VoteMsg {
 1   private boolean isInquiry; // true if inquiry; false if vote
 2   private boolean isResponse;// true if response from server
 3   private int candidateID;   // in [0,1000]
 4   private long voteCount;    // nonzero only in response
 5
 6   public static final int MAX_CANDIDATE_ID = 1000;
 7
 8   public VoteMsg(boolean isResponse, boolean isInquiry, int candidateID, long voteCount)
 9       throws IllegalArgumentException {
10     // check invariants
11     if (voteCount != 0 && !isResponse) {
12       throw new IllegalArgumentException("Request vote count must be zero");
13     }
14     if (candidateID < 0 || candidateID > MAX_CANDIDATE_ID) {
15       throw new IllegalArgumentException("Bad Candidate ID: " + candidateID);
16     }
17     if (voteCount < 0) {
18       throw new IllegalArgumentException("Total must be >= zero");
19     }
20     this.candidateID = candidateID;
21     this.isResponse = isResponse;
22     this.isInquiry = isInquiry;
23     this.voteCount = voteCount;
24   }
25
26   public void setInquiry(boolean isInquiry) {
27     this.isInquiry = isInquiry;
28   }
29
30   public void setResponse(boolean isResponse) {
31     this.isResponse = isResponse;
32   }
33
34   public boolean isInquiry() {
35     return isInquiry;
36   }
37
38   public boolean isResponse() {
39     return isResponse;
40   }
41
42   public void setCandidateID(int candidateID) throws IllegalArgumentException {
43     if (candidateID < 0 ||  candidateID > MAX_CANDIDATE_ID) {
44       throw new IllegalArgumentException("Bad Candidate ID: " + candidateID);
45     }
46     this.candidateID = candidateID;
47   }
48
49   public int getCandidateID() {
50     return candidateID;
51   }
52
53   public void setVoteCount(long count) {
54     if ((count != 0 && !isResponse) || count < 0) {
55       throw new IllegalArgumentException("Bad vote count");
56     }
57     voteCount = count;
58   }
59
60   public long getVoteCount() {
61     return voteCount;
62   }
63
64   public String toString() {
65     String res = (isInquiry ? "inquiry" : "vote") + " for candidate " + candidateID;
66     if (isResponse) {
67       res = "response to " + res + " who now has " + voteCount + " vote(s)";
68     }
69     return res;
70   }
71 }

VoteMsg.java

Now that we have a Java representation of a vote message, we need some way to encode and decode according to some protocol. A VoteMsgCoder provides the methods for vote message serialization and deserialization.

VoteMsgCoder.java

0 import java.io.IOException;
1
2 public interface VoteMsgCoder {
3   byte[] toWire(VoteMsg msg) throws IOException;
4   VoteMsg fromWire(byte[] input) throws IOException;
5 }

VoteMsgCoder.java

The toWire() method converts the vote message to a sequence of bytes according to a particular protocol, and the fromWire() method parses a given sequence of bytes according to the same protocol and constructs an instance of the message class.

To illustrate the different methods of encoding information, we present two implementations of VoteMsgCoder, one using a text-based encoding and one using a binary encoding. If you were guaranteed a single encoding that would never change, the toWire() and fromWire() methods could be specified as part of VoteMsg. Our purpose here is to emphasize that the abstract representation is independent of the details of the encoding.

Text-Based Representation

We first present a version in which messages are encoded as text. The protocol specifies that the text be encoded using the US-ASCII charset. The message begins with a so-called “magic string”—a sequence of characters that allows a recipient to quickly recognize the message as a Voting protocol message, as opposed to random garbage that happened to arrive over the network. The Vote/Inquiry boolean is encoded with the character ‘v’ for a vote or ‘i’ for an inquiry. The message’s status as a response is indicated by the presence of the character ‘R’. Then comes the candidate ID, followed by the vote count, both encoded as decimal strings. The VoteMsgTextCoder provides a text-based encoding of VoteMsg.

VoteMsgTextCoder.java

 0 import java.io.ByteArrayInputStream;
 1 import java.io.IOException;
 2 import java.io.InputStreamReader;
 3 import java.util.Scanner;
 4
 5 public class VoteMsgTextCoder implements VoteMsgCoder {
 6   /*
 7    * Wire Format "Voting" <"v"|"i"> [<RESPFLAG>] <CANDIDATE> [<VOTECNT>]
 8    * Charset is fixed by the wire format.
 9    */
10
11    // Manifest constants for encoding
12    public static final String MAGIC = "Voting";
13    public static final String VOTESTR = "v";
14    public static final String INQSTR = "i";
15    public static final String RESPONSESTR = "R";
16
17    public static final String CHARSETNAME = "US-ASCII";
18    public static final String DELIMSTR = " ";
19    public static final int MAX_WIRE_LENGTH = 2000;
20
21    public byte[] toWire(VoteMsg msg) throws IOException {
22      String msgString = MAGIC + DELIMSTR + (msg.isInquiry() ? INQSTR : VOTESTR)
23          + DELIMSTR + (msg.isResponse() ? RESPONSESTR + DELIMSTR : "")
24          + Integer.toString(msg.getCandidateID()) + DELIMSTR
25          + Long.toString(msg.getVoteCount());
26      byte data[] = msgString.getBytes(CHARSETNAME);
27      return data;
28    }
29
30    public VoteMsg fromWire(byte[] message) throws IOException {
31      ByteArrayInputStream msgStream = new ByteArrayInputStream(message);
32      Scanner s = new Scanner(new InputStreamReader(msgStream, CHARSETNAME));
33      boolean isInquiry;
34      boolean isResponse;
35      int candidateID;
36      long voteCount;
37      String token;
38
39      try {
40        token = s.next();
41        if (!token.equals(MAGIC)) {
42          throw new IOException("Bad magic string: " + token);
43        }
44        token = s.next();
45        if (token.equals(VOTESTR)) {
46          isInquiry = false;
47        } else if (!token.equals(INQSTR)) {
48          throw new IOException("Bad vote/inq indicator: " + token);
49        } else {
50          isInquiry = true;
51        }
52
53        token = s.next();
54        if (token.equals(RESPONSESTR)) {
55          isResponse = true;
56          token = s.next();
57        } else {
58          isResponse = false;
59        }
60        // Current token is candidateID
61        // Note: isResponse now valid
62        candidateID = Integer.parseInt(token);
63        if (isResponse) {
64          token = s.next();
65          voteCount = Long.parseLong(token);
66        } else {
67          voteCount = 0;
68        }
69      } catch (RuntimeException e) { // truncated input or malformed number
70        throw new IOException("Parse error: " + e);
71      }
72      return new VoteMsg(isResponse, isInquiry, candidateID, voteCount);
73    }
74 }

VoteMsgTextCoder.java

The toWire() method simply constructs a string containing all the fields of the message, separated by white space. The fromWire() method first looks for the “Magic” string; if it is not the first thing in the message, it throws an exception. This illustrates a very important point about implementing protocols: never assume anything about any input from the network. Your program must always be prepared for any possible inputs, and handle them gracefully. In this case, the fromWire() method throws an exception if the expected string is not present. Otherwise, it gets the fields token by token, using the Scanner instance. Note that the number of fields in the message depends on whether it is a request (sent by the client) or response (sent by the server). fromWire() throws an exception if the input ends prematurely or is otherwise malformed.
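To make the wire format concrete, here is a minimal, self-contained sketch that builds the text form of a message and parses it back. TextWireSketch and its two methods are hypothetical names used only for illustration; the sketch follows the same field order as VoteMsgTextCoder but reports malformed input with an IllegalArgumentException, which is a choice made here and not the listing's behavior.

```java
import java.nio.charset.StandardCharsets;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Hypothetical demo class; same field order as VoteMsgTextCoder.
class TextWireSketch {
  // Build the text form: "Voting" <v|i> [R] <candidate> <count>
  static String encode(boolean isResponse, boolean isInquiry,
                       int candidateID, long voteCount) {
    return "Voting " + (isInquiry ? "i" : "v") + " "
        + (isResponse ? "R " : "") + candidateID + " " + voteCount;
  }

  // Parse wire bytes into a short description; reject malformed input.
  static String describe(byte[] wire) {
    Scanner s = new Scanner(new String(wire, StandardCharsets.US_ASCII));
    try {
      if (!s.next().equals("Voting")) {
        throw new IllegalArgumentException("Bad magic string");
      }
      String kind = s.next();
      if (!kind.equals("v") && !kind.equals("i")) {
        throw new IllegalArgumentException("Bad vote/inq indicator: " + kind);
      }
      String token = s.next();
      boolean isResponse = token.equals("R");
      if (isResponse) {
        token = s.next(); // flag was present, so the ID is the next token
      }
      int candidateID = Integer.parseInt(token);
      long voteCount = isResponse ? Long.parseLong(s.next()) : 0;
      return (kind.equals("i") ? "inquiry" : "vote")
          + (isResponse ? " response" : " request")
          + " candidate=" + candidateID + " count=" + voteCount;
    } catch (NoSuchElementException | NumberFormatException e) {
      // input ended early, or a numeric field was not a number
      throw new IllegalArgumentException("Malformed message", e);
    }
  }

  public static void main(String[] args) {
    String wire = encode(true, true, 888, 5);
    System.out.println(wire); // Voting i R 888 5
    System.out.println(describe(wire.getBytes(StandardCharsets.US_ASCII)));
  }
}
```

Because every field is a whitespace-separated token, the message can be read and debugged with nothing more than a terminal, at the cost of a variable-length encoding.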

Binary Representation

Next we present a different way to encode the Voting protocol message. In contrast with the text-based format, the binary format uses fixed-size messages. Each message begins with a one-byte field that contains the “magic” value 010101 in its high-order six bits. This little bit of redundancy provides the receiver with a small degree of assurance that it is receiving a proper voting message. The two low-order bits of the first byte encode the two booleans. The second byte of the message always contains zeros, and the third and fourth bytes contain the candidateID. The final eight bytes of a response message (only) contain the vote count.

VoteMsgBinCoder.java

 0 import java.io.ByteArrayInputStream;
 1 import java.io.ByteArrayOutputStream;
 2 import java.io.DataInputStream;
 3 import java.io.DataOutputStream;
 4 import java.io.IOException;
 5
 6 /* Wire Format
 7  *                                1  1  1  1  1  1
 8  *  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
 9  * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
10  * |     Magic       |Flags|        ZERO           |
11  * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
12  * |                  Candidate ID                 |
13  * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
14  * |                                               |
15  * |         Vote Count (only in response)         |
16  * |                                               |
17  * |                                               |
18  * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
19  */
20 public class VoteMsgBinCoder implements VoteMsgCoder {
21
22   // manifest constants for encoding
23   public static final int MIN_WIRE_LENGTH = 4;
24   public static final int MAX_WIRE_LENGTH = 16;
25   public static final int MAGIC = 0x5400;
26   public static final int MAGIC_MASK = 0xfc00;
27   public static final int MAGIC_SHIFT = 8;
28   public static final int RESPONSE_FLAG = 0x0200;
29   public static final int INQUIRE_FLAG =  0x0100;
30
31   public byte[] toWire(VoteMsg msg) throws IOException {
32     ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
33     DataOutputStream out = new DataOutputStream(byteStream); // converts ints
34
35     short magicAndFlags = MAGIC;
36     if (msg.isInquiry()) {
37       magicAndFlags |= INQUIRE_FLAG;
38     }
39     if (msg.isResponse()) {
40       magicAndFlags |= RESPONSE_FLAG;
41     }
42     out.writeShort(magicAndFlags);
43     // We know the candidate ID will fit in a short: it's >= 0 && <= 1000
44     out.writeShort((short) msg.getCandidateID());
45     if (msg.isResponse()) {
46       out.writeLong(msg.getVoteCount());
47     }
48     out.flush();
49     byte[] data = byteStream.toByteArray();
50     return data;
51   }
52
53   public VoteMsg fromWire(byte[] input) throws IOException {
54     // sanity checks
55     if (input.length < MIN_WIRE_LENGTH) {
56       throw new IOException("Runt message");
57     }
58     ByteArrayInputStream bs = new ByteArrayInputStream(input);
59     DataInputStream in = new DataInputStream(bs);
60     int magic = in.readShort();
61     if ((magic & MAGIC_MASK) != MAGIC) {
62       throw new IOException("Bad Magic #: " +
63                             ((magic & MAGIC_MASK) >> MAGIC_SHIFT));
64     }
65     boolean resp = ((magic & RESPONSE_FLAG) != 0);
66     boolean inq = ((magic & INQUIRE_FLAG) != 0);
67     int candidateID = in.readShort();
68     if (candidateID < 0 || candidateID > 1000) {
69       throw new IOException("Bad candidate ID: " + candidateID);
70     }
71     long count = 0;
72     if (resp) {
73       count = in.readLong();
74       if (count < 0) {
75         throw new IOException("Bad vote count: " + count);
76       }
77     }
78     // Ignore any extra bytes
79     return new VoteMsg(resp, inq, candidateID, count);
80   }
81 }

VoteMsgBinCoder.java

As in Section 3.1.1, we create a ByteArrayOutputStream and wrap it in a DataOutputStream to receive the result. The encoding method takes advantage of the fact that the high-order two bytes of a valid candidateID are always zero. Note also the use of bitwise-or operations to encode the booleans using a single bit each.
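It can help to look at the actual bytes this format produces. The sketch below encodes two messages and prints them in hexadecimal. BinWireSketch is a hypothetical class for illustration; its constants are copied from VoteMsgBinCoder, and the hex() helper is an addition made here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; constants copied from VoteMsgBinCoder.
class BinWireSketch {
  static final int MAGIC = 0x5400;         // 010101 in the high-order six bits
  static final int RESPONSE_FLAG = 0x0200;
  static final int INQUIRE_FLAG = 0x0100;

  static byte[] encode(boolean isResponse, boolean isInquiry,
                       int candidateID, long voteCount) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      int magicAndFlags = MAGIC;
      if (isInquiry) {
        magicAndFlags |= INQUIRE_FLAG;
      }
      if (isResponse) {
        magicAndFlags |= RESPONSE_FLAG;
      }
      out.writeShort(magicAndFlags); // low-order 16 bits, big-endian
      out.writeShort(candidateID);   // fits: 0 <= candidateID <= 1000
      if (isResponse) {
        out.writeLong(voteCount);    // eight bytes, responses only
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Render bytes as lowercase hexadecimal, two digits per byte.
  static String hex(byte[] data) {
    StringBuilder sb = new StringBuilder();
    for (byte b : data) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Vote request for candidate 888 (0x0378): 4 bytes
    System.out.println(hex(encode(false, false, 888, 0))); // 54000378
    // Response to an inquiry, 5 votes: 12 bytes
    System.out.println(hex(encode(true, true, 888, 5)));   // 570003780000000000000005
  }
}
```

A request occupies 4 bytes and a response 12, compared with 14 or more characters for the text form of the same request.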

Sending and Receiving

Sending a message over a stream is as simple as creating it, calling toWire(), adding appropriate framing information, and writing it. Receiving, of course, does things in the opposite order. This approach applies to TCP; in UDP explicit framing is not necessary, because message boundaries are preserved. To demonstrate this, consider a vote server that 1) maintains a mapping of candidate IDs to number of votes, 2) counts submitted votes, and 3) responds to inquiries and votes with the current count for the specified candidate. We begin by implementing a service for use by vote servers. When a vote server receives a vote message, it handles the request by calling the handleRequest() method of VoteService.

VoteService.java

0 import java.util.HashMap;
1 import java.util.Map;
2
3 public class VoteService {
4
5   // Map of candidates to number of votes
6   private Map<Integer, Long> results = new HashMap<Integer, Long>();
7
8   public VoteMsg handleRequest(VoteMsg msg) {
9     if (msg.isResponse()) { // If response, just send it back
10       return msg;
11     }
12     msg.setResponse(true); // Make message a response
13     // Get candidate ID and vote count
14     int candidate = msg.getCandidateID();
15     Long count = results.get(candidate);
16     if (count == null) {
17       count = 0L; // Candidate does not exist
18     }
19     if (!msg.isInquiry()) {
20       results.put(candidate, ++count); // If vote, increment count
21     }
22     msg.setVoteCount(count);
23     return msg;
24   }
25 }

VoteService.java

  1. Create map of candidate ID to vote count: line 6

    For inquiries, the given candidate ID is used to look up the candidate’s vote count in the map. For votes, the incremented vote count is stored back in the map.

  2. handleRequest(): lines 8–24

    • Return a response: lines 9–12

      If the vote message is already a response, we send it back without processing or modification. Otherwise we set the response flag.

    • Find current vote count: lines 13–18

      Find the candidate by ID in the map and fetch the vote count. If the candidate ID does not already exist in the map, set the count to 0.

    • Update count, if vote: lines 19–21

      If the candidate did not previously exist, this creates a new mapping; otherwise, it simply modifies an existing mapping.

    • Set vote count and return message: lines 22–23
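The counting logic can be tried out on its own, without any message classes. The sketch below is a hypothetical, stripped-down tally (TallySketch is not part of the chapter's code) that takes a candidate ID and an inquiry flag directly. It uses Map.getOrDefault() and Map.merge(), available since Java 8, in place of the explicit null check in the listing; the observable behavior is intended to be the same.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; same counting rules as VoteService.
class TallySketch {
  // Map of candidate IDs to number of votes
  private final Map<Integer, Long> results = new HashMap<>();

  // Return the candidate's total; a vote is counted before the total is read.
  long handle(int candidateID, boolean isInquiry) {
    if (isInquiry) {
      // Unknown candidates report zero and do not create a mapping
      return results.getOrDefault(candidateID, 0L);
    }
    // Insert 1 for a new candidate, otherwise add 1 to the stored count
    return results.merge(candidateID, 1L, Long::sum);
  }

  public static void main(String[] args) {
    TallySketch tally = new TallySketch();
    System.out.println(tally.handle(888, true));  // 0
    System.out.println(tally.handle(888, false)); // 1
    System.out.println(tally.handle(888, false)); // 2
    System.out.println(tally.handle(888, true));  // 2
  }
}
```

Like VoteService, this sketch is not synchronized, so it is only safe when a single thread handles all requests.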

Next we show how to implement a TCP voting client that connects over a TCP socket to the voting server, sends an inquiry followed by a vote, and then receives the inquiry and vote responses.

VoteClientTCP.java

 0 import java.io.OutputStream;
 1 import java.net.Socket;
 2
 3 public class VoteClientTCP {
 4
 5   public static final int CANDIDATEID = 888;
 6
 7   public static void main(String args[]) throws Exception {
 8
 9     if (args.length != 2) { // Test for correct # of args
10       throw new IllegalArgumentException("Parameter(s): <Server> <Port>");
11     }
12
13     String destAddr = args[0]; // Destination address
14     int destPort = Integer.parseInt(args[1]); // Destination port
15
16     Socket sock = new Socket(destAddr, destPort);
17     OutputStream out = sock.getOutputStream();
18
19     // Change Bin to Text for a different encoding strategy
20     VoteMsgCoder coder = new VoteMsgBinCoder();
21     // Change Length to Delim for a different framing strategy
22     Framer framer = new LengthFramer(sock.getInputStream());
23
24     // Create an inquiry request (2nd arg = true)
25     VoteMsg msg = new VoteMsg(false, true, CANDIDATEID, 0);
26     byte[] encodedMsg = coder.toWire(msg);
27
28     // Send request
29     System.out.println("Sending Inquiry (" + encodedMsg.length + " bytes): ");
30     System.out.println(msg);
31     framer.frameMsg(encodedMsg, out);
32
33     // Now send a vote
34     msg.setInquiry(false);
35     encodedMsg = coder.toWire(msg);
36     System.out.println("Sending Vote (" + encodedMsg.length + " bytes): ");
37     framer.frameMsg(encodedMsg, out);
38
39     // Receive inquiry response
40     encodedMsg = framer.nextMsg();
41     msg = coder.fromWire(encodedMsg);
42     System.out.println("Received Response (" + encodedMsg.length
43                + " bytes): ");
44     System.out.println(msg);
45
46     // Receive vote response
47     encodedMsg = framer.nextMsg();
48     msg = coder.fromWire(encodedMsg);
49     System.out.println("Received Response (" + encodedMsg.length + " bytes): ");
50     System.out.println(msg);
51
52     sock.close();
53   }
54 }

VoteClientTCP.java

  1. Process arguments: lines 9–14

  2. Create socket, get output stream: lines 16–17

  3. Create binary coder and length-based framer: lines 20–22

    We will encode/decode our vote messages using a coder. We elect to use a binary encoder for our protocol. Next, since TCP is a stream-based service, we need to provide our own framing. Here we use the LengthFramer, which prefixes each message with a length. Note that we could easily switch to using delimiter-based framing and/or text encoding simply by changing the concrete types of our VoteMsgCoder and Framer to VoteMsgTextCoder and DelimFramer, respectively.

  4. Create and send messages: lines 24–37

    Create, encode, frame and send an inquiry, followed by a vote message for the same candidate.

  5. Get and parse responses: lines 39–50

    The nextMsg() method returns the contents of the next encoded message, which we parse/decode via fromWire().

  6. Close socket: line 52

Next we demonstrate the TCP version of the vote server. Here the server repeatedly accepts a new client connection and uses the VoteService to generate responses to the client vote messages.

VoteServerTCP.java

 0 import java.io.IOException;
 1 import java.net.ServerSocket;
 2 import java.net.Socket;
 3
 4 public class VoteServerTCP {
 5
 6   public static void main(String args[]) throws Exception {
 7
 8     if (args.length != 1) { // Test for correct # of args
 9       throw new IllegalArgumentException("Parameter(s): <Port>");
10     }
11
12     int port = Integer.parseInt(args[0]); // Receiving Port
13
14     ServerSocket servSock = new ServerSocket(port);
15     // Change Bin to Text on both client and server for different encoding
16     VoteMsgCoder coder = new VoteMsgBinCoder();
17     VoteService service = new VoteService();
18
19     while (true) {
20       Socket clntSock = servSock.accept();
21       System.out.println("Handling client at " + clntSock.getRemoteSocketAddress());
22       // Change Length to Delim for a different framing strategy
23       Framer framer = new LengthFramer(clntSock.getInputStream());
24       try {
25         byte[] req;
26         while ((req = framer.nextMsg()) != null) {
27           System.out.println("Received message (" + req.length + " bytes)");
28           VoteMsg responseMsg = service.handleRequest(coder.fromWire(req));
29           framer.frameMsg(coder.toWire(responseMsg), clntSock.getOutputStream());
30         }
31       } catch (IOException ioe) {
32         System.err.println("Error handling client: " + ioe.getMessage());
33       } finally {
34         System.out.println("Closing connection");
35         clntSock.close();
36       }
37     }
38   }
39 }

VoteServerTCP.java

  1. Establish coder and vote service for server: lines 15–17

  2. Repeatedly accept and handle client connections: lines 19–37

    • Accept new client, print address: lines 20–21

    • Create framer for client: line 23

    • Fetch and decode messages from client: lines 26–28

      Repeatedly request next message from framer until it returns null, indicating an end of messages.

    • Process message, send response: lines 28–29

      Pass the decoded message to the voting service for handling. Encode, frame, and send the returned response message.

The UDP voting client works very similarly to the TCP version. Note that for UDP we don’t need to use a framer because UDP maintains message boundaries for us. For UDP, we use the text encoding for our messages; however, this can be easily changed, as long as client and server agree.

VoteClientUDP.java

 0 import java.io.IOException;
 1 import java.net.DatagramPacket;
 2 import java.net.DatagramSocket;
 3 import java.net.InetAddress;
 4 import java.util.Arrays;
 5
 6 public class VoteClientUDP {
 7
 8   public static void main(String args[]) throws IOException {
 9
10     if (args.length != 3) { // Test for correct # of args
11       throw new IllegalArgumentException("Parameter(s): <Destination>" +
12                                                 " <Port> <Candidate#>");
13     }
14
15     InetAddress destAddr = InetAddress.getByName(args[0]); // Destination addr
16     int destPort = Integer.parseInt(args[1]); // Destination port
17     int candidate = Integer.parseInt(args[2]); // 0 <= candidate <= 1000 req'd
18
19     DatagramSocket sock = new DatagramSocket(); // UDP socket for sending
20     sock.connect(destAddr, destPort);
21
22     // Create a voting message (2nd param false = vote)
23     VoteMsg vote = new VoteMsg(false, false, candidate, 0);
24
25     // Change Text to Bin here for a different coding strategy
26     VoteMsgCoder coder = new VoteMsgTextCoder();
27
28     // Send request
29     byte[] encodedVote = coder.toWire(vote);
30     System.out.println("Sending Text-Encoded Request (" + encodedVote.length
31         + " bytes): ");
32     System.out.println(vote);
33     DatagramPacket message = new DatagramPacket(encodedVote, encodedVote.length);
34     sock.send(message);
35
36     // Receive response
37     message = new DatagramPacket(new byte[VoteMsgTextCoder.MAX_WIRE_LENGTH],
38         VoteMsgTextCoder.MAX_WIRE_LENGTH);
39     sock.receive(message);
40     encodedVote = Arrays.copyOfRange(message.getData(), 0, message.getLength());
41
42     System.out.println("Received Text-Encoded Response (" + encodedVote.length
43         + " bytes): ");
44     vote = coder.fromWire(encodedVote);
45     System.out.println(vote);
46   }
47 }

VoteClientUDP.java

  1. Set up DatagramSocket and connect: lines 10–20

    By calling connect(), we don’t have to 1) specify a remote address/port for each datagram we send and 2) test the source of any datagrams we receive.

  2. Create vote and coder: lines 22–26

    This time we use a text coder; however, we could easily change to a binary coder. Note that we don’t need a framer because UDP already preserves message boundaries for us as long as each send contains exactly one vote message.

  3. Send request to the server: lines 28–34

  4. Receive, decode, and print server response: lines 36–45

    When creating the DatagramPacket, we need to know the maximum message size to avoid datagram truncation. Of course, when we decode the datagram, we only use the actual bytes from the datagram so we use Arrays.copyOfRange() to copy the subsequence of the datagram backing array.
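The trimming step is worth seeing in isolation, because passing the whole 2000-byte buffer to the decoder would hand it trailing zero bytes along with the message. The sketch below avoids real network I/O: the hypothetical simulateReceive() method stands in for receive() by copying a message into the buffer and setting the packet length by hand, which is an assumption made so the example runs anywhere. The payload() helper also honors the packet's offset, which the listing above takes to be 0.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; no sockets are opened.
class DatagramTrimSketch {
  static final int MAX_WIRE_LENGTH = 2000;

  // Copy only the bytes that belong to the datagram, not the whole buffer.
  static byte[] payload(DatagramPacket packet) {
    int start = packet.getOffset();
    return Arrays.copyOfRange(packet.getData(), start, start + packet.getLength());
  }

  // Stand-in for receive(): fill the buffer and record the message length.
  static DatagramPacket simulateReceive(String text) {
    byte[] buffer = new byte[MAX_WIRE_LENGTH];
    DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
    byte[] msg = text.getBytes(StandardCharsets.US_ASCII);
    System.arraycopy(msg, 0, buffer, 0, msg.length);
    packet.setLength(msg.length); // receive() would set this for us
    return packet;
  }

  public static void main(String[] args) {
    DatagramPacket packet = simulateReceive("Voting v R 888 1");
    System.out.println("buffer:  " + packet.getData().length + " bytes");
    System.out.println("payload: " + payload(packet).length + " bytes");
  }
}
```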

Finally, here is the UDP voting server, which, again, is very similar to the TCP version.

VoteServerUDP.java

 0 import java.io.IOException;
 1 import java.net.DatagramPacket;
 2 import java.net.DatagramSocket;
 3 import java.util.Arrays;
 4
 5 public class VoteServerUDP {
 6
 7   public static void main(String[] args) throws IOException {
 8
 9     if (args.length != 1) { // Test for correct # of args
10       throw new IllegalArgumentException("Parameter(s): <Port>");
11     }
12
13     int port = Integer.parseInt(args[0]); // Receiving Port
14
15     DatagramSocket sock = new DatagramSocket(port); // Receive socket
16
17     byte[] inBuffer = new byte[VoteMsgTextCoder.MAX_WIRE_LENGTH];
18     // Change Text to Bin for a different coding approach
19     VoteMsgCoder coder = new VoteMsgTextCoder();
20     VoteService service = new VoteService();
21
22     while (true) {
23       DatagramPacket packet = new DatagramPacket(inBuffer, inBuffer.length);
24       sock.receive(packet);
25       byte[] encodedMsg = Arrays.copyOfRange(packet.getData(), 0, packet.getLength());
26       System.out.println('Handling request from ' + packet.getSocketAddress() + ' ('
27           + encodedMsg.length + ' bytes)'),
28
29       try {
30         VoteMsg msg = coder.fromWire(encodedMsg);
31         msg = service.handleRequest(msg);
32         packet.setData(coder.toWire(msg));
33         System.out.println("Sending response (" + packet.getLength() + " bytes):");
34         System.out.println(msg);
35         sock.send(packet);
36       } catch (IOException ioe) {
37         System.err.println("Parse error in message: " + ioe.getMessage());
38       }
39     }
40   }
41 }

VoteServerUDP.java

  1. Setup: lines 17–20

    Create reception buffer, coder, and vote service for server.

  2. Repeatedly accept and handle client vote messages: lines 22–39

    • Set up DatagramPacket to receive: line 23

      Reset the data area to the input buffer on each iteration.

    • Receive datagram, extract data: lines 24–25

      UDP does the framing for us!

    • Decode and handle request: lines 30–31

      The service returns a response to the message.

    • Encode and send response message: lines 32–35
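
The reason for pointing the packet at the input buffer on every iteration can be seen without any network traffic. The sketch below is an illustration only; the class PacketReuseDemo and its two methods are hypothetical names. It relies on the documented behavior of DatagramPacket.setData(byte[]), which replaces the packet's data area and sets its length to the length of the new array.

```java
import java.net.DatagramPacket;

public class PacketReuseDemo {

  // Length of the packet after its data area is replaced by a (shorter) reply
  static int lengthAfterReply(int bufferSize, int replySize) {
    byte[] inBuffer = new byte[bufferSize];
    DatagramPacket packet = new DatagramPacket(inBuffer, inBuffer.length);
    packet.setData(new byte[replySize]); // What a server does before sending a response
    return packet.getLength();
  }

  // Length of the packet once it is pointed back at the full input buffer
  static int lengthAfterReset(int bufferSize, int replySize) {
    byte[] inBuffer = new byte[bufferSize];
    DatagramPacket packet = new DatagramPacket(inBuffer, inBuffer.length);
    packet.setData(new byte[replySize]);
    packet.setData(inBuffer); // Restore the full-size reception area
    return packet.getLength();
  }

  public static void main(String[] args) {
    System.out.println("After reply: " + lengthAfterReply(2000, 5));
    System.out.println("After reset: " + lengthAfterReset(2000, 5));
  }
}
```

A packet left pointing at a short response array could hold no more than that many bytes on the next receive, which is why the server constructs a fresh DatagramPacket over inBuffer at the top of the loop.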

Wrapping Up

We have seen how primitive types can be represented as sequences of bytes for transmission “on the wire.” We have also considered some of the subtleties of encoding text strings, and some basic methods of framing and parsing messages. We saw examples of both text-oriented and binary-encoded protocols.

It is probably worth reiterating something we said in the Preface: this chapter will by no means make you an expert! That takes a great deal of experience. But the code from this chapter can be used as a starting point for further explorations.

Exercises

1. Positive integers larger than 2^31 − 1 (and less than 2^32 − 1) cannot be represented as ints in Java, yet they can be represented as 32-bit binary numbers. Write a method to write such an integer to a stream. It should take a long and an OutputStream as parameters.

2. Extend the DelimFramer class to handle arbitrary multiple-byte delimiters. Be sure your implementation is efficient.

3. Extend the DelimFramer to perform “byte stuffing,” so messages containing the delimiter can be transmitted. (See any decent networking text for the algorithm.)

4. Assuming that all byte values are equally likely, what is the probability that a message consisting of random bits will pass the “magic test” in VoteMsgBin? Suppose an ASCII-encoded text message is sent to a program expecting a binary-encoded voteMsg. Which characters would enable the message to pass the “magic test” if they are the first in the message?

5. The encodeIntBigEndian() method of BruteForceEncoding only works if several preconditions are met, such as 0 ≤ size ≤ 8. Modify the method to test for these preconditions and throw an exception if any are violated.

 



[1] Java includes a class ByteOrder to denote these two possibilities. It has two static fields containing the (only) instances: ByteOrder.BIG_ENDIAN and ByteOrder.LITTLE_ENDIAN. Chapter 5 contains further details about this class.
