To Multicast or Not to Multicast

An increasing number of vendors are releasing products based on IP multicasting. To understand the tradeoffs involved in these products, you need a basic understanding of how the TCP/IP protocol family works, and how multicasting fits into the bigger picture.[2] We won't discuss any particular JMS implementations, or suggest that one vendor might be better than another; our goal is to give you the tools that you need to ask intelligent questions, evaluate different products, and map out a deployment strategy.

TCP/IP

TCP/IP is the name for a family of protocols that includes TCP (Transmission Control Protocol), UDP (User Datagram Protocol), and IP (Internet Protocol). The protocols are layered: IP provides low-level services; both TCP and UDP sit "on top of" IP.

TCP is a reliable, connection-oriented protocol. A process wishing to establish communication with one or more processes across a network creates a connection to each of the other processes and sends and receives data using those connections. The network software, rather than the application, is responsible for making sure that all the data arrives, and that it arrives in the correct order. It takes care of acknowledging that data has been received, automatically discards duplicate data, and performs many other services for the application. If something happens with the connection, the process on either side of the connection will know almost immediately that the connection has been permanently broken.[3]

Most high-level network protocols (and most JMS implementations) are built on top of TCP, for obvious reasons: it's a lot easier to use a protocol that takes care of reliability for you. However, reliability comes with a cost: a lot of work is involved in setting up and tearing down connections, and additional overhead is required to acknowledge data that's sent and received. Therefore, TCP is slower than its unreliable relative, UDP.

UDP

UDP (User Datagram Protocol) is an unreliable protocol: you send data to a destination, but there's no guarantee that the data will arrive. If it doesn't arrive, you'll never find out; furthermore, the process receiving the data will never know that you sent anything.

This sounds like a bad basis for reliable software, but it really only means that applications using UDP have to take reliability into their own hands: they need to come up with their own mechanism for verifying that data was received, and for retransmitting data that went astray. In practice, applications that need reliability guarantees can either use TCP, or can incorporate software to build reliability on top of UDP. Most applications have taken the easier route, but a few important applications (like DNS and the early versions of NFS) make extensive use of UDP.
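The mechanism for "retransmitting data that went astray" usually amounts to sending, waiting for an acknowledgment, and sending again if none arrives. The following is only a minimal sketch of that idea, not any vendor's implementation. The class name ReliableSender and the sendAndAwaitAck callback are hypothetical; the callback stands in for "send one datagram and wait up to a timeout for the acknowledgment."

```java
import java.util.function.BooleanSupplier;

class ReliableSender {
    /**
     * Retries a send until it is acknowledged or the attempts run out.
     * sendAndAwaitAck is a hypothetical callback: it transmits the datagram
     * and returns true only if an acknowledgment came back in time.
     * Returns the attempt number that succeeded, or -1 if every attempt failed.
     */
    static int sendReliably(BooleanSupplier sendAndAwaitAck, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (sendAndAwaitAck.getAsBoolean()) {
                return attempt; // acknowledged, so stop retransmitting
            }
            // No acknowledgment: either the datagram or the ack was lost.
        }
        return -1; // give up and let the caller decide what to do
    }
}
```

Note that a lost acknowledgment triggers a retransmission of data that did arrive, which is one reason the receiving side must be prepared for duplicates.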

IP Multicast

The simplicity of UDP makes possible a kind of service that's completely different from anything in the TCP world. Because it is connection-oriented, TCP is fundamentally limited to point-to-point communications. UDP offers the notion of a "multicast," in which an application can send data to a group of recipients. Multicasting is based on a special class of addresses, known as Class D addresses.[4] Class D addresses are not assigned to individual hosts; they're assigned to multicast groups. Hosts can join and leave groups that they have an interest in. Data sent to a multicast address will only be received by the hosts in the multicast group. At least from the network's standpoint, multicast is much more efficient when you need to send a message to many recipients.
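In Java, joining a group looks roughly like the sketch below, which uses the standard java.net classes. The class name MulticastGroups and whatever group address and port you pass in are assumptions made for illustration; real values have to be coordinated with your network administrators.

```java
import java.io.IOException;
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.UnknownHostException;

class MulticastGroups {
    /** True if the IPv4 literal lies in the Class D range 224.0.0.0 through 239.255.255.255. */
    static boolean isClassD(String ipv4Literal) {
        try {
            // For a literal address, getByName only parses the text; it does no DNS lookup.
            InetAddress address = InetAddress.getByName(ipv4Literal);
            return address instanceof Inet4Address && address.isMulticastAddress();
        } catch (UnknownHostException e) {
            return false; // not a usable address at all
        }
    }

    /** Opens a socket on the given port and joins the group so that its datagrams are received. */
    static MulticastSocket join(String groupAddress, int port) throws IOException {
        if (!isClassD(groupAddress)) {
            throw new IllegalArgumentException(groupAddress + " is not a Class D address");
        }
        InetAddress group = InetAddress.getByName(groupAddress);
        MulticastSocket socket = new MulticastSocket(port);
        // A null interface defers to the socket's default multicast interface.
        socket.joinGroup(new InetSocketAddress(group, port), null);
        return socket;
    }
}
```

A host that calls join receives datagrams sent to the group until it leaves the group or closes the socket; the sender needs no list of recipients.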

Multicasting maps naturally into the sorts of things we want messaging systems to do. Many messaging products use multicasting for one-to-many pub/sub broadcast of messages. Most have built some level of reliability on top of UDP. If this issue is important to you, it would be in your interest to delve deeper and find out exactly what your JMS vendor has, or has not, implemented.

Multicast has its drawbacks as well. UDP traffic is usually not allowed through a firewall, so you may have to negotiate with your network administrators or find some workaround if you need to get multicast traffic through your company's firewalls. Furthermore, multicast relies heavily on special routing software. Most modern routers support multicast, but lots of old routers are still in service. Even if you have up-to-date routers within your corporate network, and your network administrators know how to configure multicast routing, there's still the Internet; multicasting does not realistically work across the Internet (see Section 7.2.4.3 later in this chapter). As a configuration and maintenance consideration, multicast addresses must be coordinated across the network to avoid collisions. These drawbacks are especially important if you are building an application that you want to sell to others, who in turn expect to deploy it easily.

Messaging Over IP Multicast

In the following section we will explore the tradeoffs of using messaging over an IP multicast architecture. It is important for you to understand the issues as you map out your deployment strategy.

Duplication, ordering, and reliability of messages

If a messaging vendor wishes to provide full reliability for IP multicast and UDP, it must build TCP-like semantics into the JMS provider layer to compensate for duplicate datagrams, out-of-order datagrams, and datagrams that could never possibly get to the intended destination. Either the JMS provider has to incur the overhead of detecting and compensating for duplicate datagrams, or the application needs to be tolerant of duplicate messages. If the duplication of datagrams is not dealt with at the JMS provider level, multicast delivery is really only viable for the DUPS_OK_ACKNOWLEDGE mode. No matter what, a messaging vendor has to implement the reliability necessary to ensure guaranteed ordering, since UDP doesn't ensure that packets are received in the same order that they are sent.
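One common way to compensate for both duplicates and misordering is to stamp each datagram with a sequence number and have the receiver release messages only in sequence. The sketch below illustrates the idea under that assumption; the class SequencedReceiver and its method are hypothetical and do not describe how any particular JMS provider works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SequencedReceiver {
    private long nextExpected = 1;                                  // next sequence number to deliver
    private final TreeMap<Long, String> pending = new TreeMap<>();  // early arrivals, sorted by sequence

    /** Accepts one datagram and returns whatever messages can now be delivered, in order. */
    List<String> onDatagram(long sequence, String payload) {
        List<String> deliverable = new ArrayList<>();
        if (sequence < nextExpected || pending.containsKey(sequence)) {
            return deliverable; // duplicate: already delivered or already buffered
        }
        pending.put(sequence, payload);
        // Release the contiguous run that starts at the expected sequence number.
        while (!pending.isEmpty() && pending.firstKey() == nextExpected) {
            deliverable.add(pending.pollFirstEntry().getValue());
            nextExpected++;
        }
        return deliverable;
    }
}
```

If datagram 2 arrives before datagram 1, nothing is delivered until 1 shows up, at which point both are released together. The buffering and bookkeeping are exactly the overhead that TCP would otherwise carry on the application's behalf.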

A messaging vendor should support some sort of error detection to know when a UDP datagram is lost. Ideally it should know that a client can't be reached due to a network boundary across an unsupported network router (see Section 7.2.4.3 later in this chapter). The JMS specification allows for a nondurable JMS subscriber to miss messages, but is intentionally vague about this since it is not a goal of the specification to impose an architecture on a JMS provider. However, for all practical purposes, nonguaranteed messaging means that messages may be lost, and that should mean they may only be lost once in a while. For both cases, some sort of acknowledgment semantics are required.
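Detecting a lost datagram generally relies on the same kind of sequence numbering: a receiver that sees number 4 right after number 1 can infer that 2 and 3 are missing and ask for them again. The following sketch shows only the detection step. GapDetector is a hypothetical name, and a real provider would typically wait briefly before requesting retransmission, since a "missing" datagram may simply be late.

```java
import java.util.ArrayList;
import java.util.List;

class GapDetector {
    private long highestSeen = 0; // highest sequence number received so far

    /** Returns the sequence numbers skipped between the previous high-water mark and this datagram. */
    List<Long> onDatagram(long sequence) {
        List<Long> missing = new ArrayList<>();
        for (long s = highestSeen + 1; s < sequence; s++) {
            missing.add(s); // candidates for a retransmission request
        }
        if (sequence > highestSeen) {
            highestSeen = sequence;
        }
        return missing;
    }
}
```

This approach cannot detect the loss of the most recent datagram until a later one arrives, which is why such schemes are often paired with periodic heartbeat messages from the publisher.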

Centralized and decentralized architectures

A TCP-based messaging system generally uses a hub-and-spoke architecture whereby a centralized message server, or cluster of message servers, communicates with JMS clients using TCP/IP, SSL, or HTTP connections. The centralized server is responsible for knowing who is publishing and who is subscribing at any given time. Message servers may operate in a cluster spread across multiple machines, but to the clients there only appears to be a single logical server. Message servers operating in a cluster can intelligently route messages to other servers. Clustering may provide load balancing, and may help to optimize network traffic by selectively filtering and routing only the messages that need to get to a particular node. The servers are also responsible for persistence of guaranteed messages, and for Access Control Lists (ACLs) that grant permissions to subscribers on a per-topic basis. The messages are only delivered to the subscribers that are interested in a particular topic, and only to those that have the permissions to get them. A centralized server also makes it easier to add subscribers: when a new subscriber comes online, only the message server needs to know about it.

At the same time, a centralized architecture may introduce a single point of failure: if the main server in a cluster (the server to which clients initially connect) goes down, the entire cluster may become unavailable. A JMS provider may solve this problem by distributing the connections across multiple servers in the cluster. If one server goes down, the other servers can continue to operate, thus minimizing the impact of the failure. Reconnect logic may also be built into the client, enabling it to find another server if its initial server goes down.
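Client-side reconnect logic can be as simple as walking a list of known servers until one accepts a connection. The sketch below assumes a hypothetical tryConnect callback that returns true when a connection to the given URL succeeds; the class name and the URL format are likewise illustrative, since each JMS provider has its own way of configuring this.

```java
import java.util.List;
import java.util.function.Predicate;

class FailoverConnector {
    /**
     * Tries each message server in order and returns the URL of the first one
     * that accepts a connection. tryConnect is a hypothetical callback that
     * attempts the connection and reports whether it succeeded.
     */
    static String connectToFirstAvailable(List<String> serverUrls, Predicate<String> tryConnect) {
        for (String url : serverUrls) {
            if (tryConnect.test(url)) {
                return url; // connected; stop searching
            }
            // This server is down or unreachable; fall through to the next one.
        }
        throw new IllegalStateException("No message server in the cluster is reachable");
    }
}
```

The same routine can be called again whenever an established connection breaks, so a client whose initial server goes down ends up on another member of the cluster.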

Multicasting implies a drastically different architecture, in which there usually is no centralized server. Because there is no central server, there is no single point of failure; each JMS client broadcasts directly to all other JMS clients. One consequence of this architecture is that every publisher and every subscriber may have local configuration information about every other JMS client on the system. This can be an extremely important consideration for deployment administration. In the absence of a higher-level administrative framework, local configurations have to be updated on every client whenever a new client or a new topic is added.

A decentralized architecture may also mean that the persistence mechanism for guaranteed messaging is pushed out to the client machines. No matter how efficient the storage algorithm, disk I/O is always going to be the biggest bottleneck. Choosing to use such an architecture would require that the client machines have disk storage that is both fast and large.

There is disagreement as to whether guaranteed messaging (storing persistent messages) benefits from a decentralized architecture. Proponents of a decentralized architecture argue that the I/O load is distributed among the clients and is therefore faster. On the other hand, client I/O is not nearly as reliable, nor is it as fast as a centralized server with a powerful disk system.

Network routers and firewalls

Although it is technically possible, a firewall administrator is unlikely to allow UDP traffic to pass through a firewall. Firewalls typically disallow all traffic except traffic to or from specific hosts using specific protocols, and UDP is rarely among the protocols that are allowed.

In recognition of the problems with IP multicast (lack of router support and firewall blocking), messaging vendors that use IP multicast provide software bridge processes to carry messaging traffic across routers and firewalls. The bridges may consist of one or more processes connected together by HTTP, SSL, or TCP/IP.

If you're considering a vendor that supports multicasting, it is worth considering what percentage of your message traffic is going through one of these bridges. If all of your messages are going through the firewall over an SSL or HTTP connection, there will be little point in using multicasting behind the firewall for performance reasons. If the routers in your deployment environment require that a number of TCP/IP-based bridges be put in place, the performance benefits of multicast are diminished, depending on how many of these you have to put in place and administer. The messaging system is only as fast as its slowest link.

If most of the message traffic is confined to your corporate LAN or a VPN and you have full control over it, IP multicasting is a very attractive option.

Some vendors support both centralized and decentralized architectures

In recognition of these issues, the vendors who support IP multicast also provide centralized servers using TCP/IP socket connections. This could mean you have two different architectures to configure and support: one configuration for the nonguaranteed one-to-many pub/sub multicast of messages within a subnet on your corporate LAN, and another for everything else. It is important to consider what it will mean to choose one of these architectures at deployment time, or how you will switch from one mode to the other after your application is deployed.

The Bottom Line

IP multicast has significant network throughput benefits in a one-to-many broadcast of information. A single multicast message to multiple recipients will always cause less network traffic than sending the message to each recipient via a TCP connection. A messaging vendor picks and chooses how much reliability to build on top of UDP based on the quality of service required for the message as defined by JMS.
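A rough back-of-the-envelope comparison shows where the savings come from. The sketch below counts only the payload bytes that leave the publisher for a single message. It deliberately ignores protocol headers, acknowledgments, and retransmissions, so the numbers are an illustration rather than a measurement; the class and method names are hypothetical.

```java
class FanOutTraffic {
    /** Payload bytes the sender puts on the wire when each subscriber gets its own TCP copy. */
    static long unicastBytes(long messageBytes, int subscribers) {
        return messageBytes * subscribers;
    }

    /** Payload bytes the sender puts on the wire for one multicast datagram, regardless of group size. */
    static long multicastBytes(long messageBytes, int subscribers) {
        return messageBytes; // the network, not the sender, replicates the datagram
    }
}
```

For a 1,000-byte message and 50 subscribers, the unicast case sends 50,000 bytes of payload from the publisher while the multicast case sends 1,000. Any reliability the vendor layers on top of UDP eats into that difference.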

However, the choice is not that simple when it is applied to a deployment environment in a messaging product. The performance advantages of IP multicasting are only viable in certain deployment environments. These advantages can diminish depending on the types of messages in your application, the networking hardware at your site, the deployment environment (intranet, extranet, Internet), and the complexity of administration.

Make sure to benchmark your application carefully before making a final decision, using the guidelines we discussed earlier in this chapter. You may be surprised at what you see. When a JMS provider is put under heavy stress with lots of clients, there are so many other factors involved that the speed at which network packets go across the wire is not usually a significant factor. You may see that one vendor's implementation of messaging over IP multicast will perform vastly differently from another's—even with the use of nonguaranteed messaging. You may even find that one vendor's TCP-based implementation performs better than another vendor's multicast implementation.



[2] This is not the place for a comprehensive discussion of TCP/IP networking. If you want detailed treatment of these protocols, see Internet Core Protocols, by Eric Hall (O'Reilly). If you're interested in network programming in Java, see Java Network Programming, by Elliotte Rusty Harold (O'Reilly).

[3] If a connection is not sending or receiving any data, it could take a while before the owning process is signaled about a problem, depending on the network settings.

[4] A Class D network address is one defined as having the range of 224.0.0.0 through 239.255.255.255. Class D network addresses are reserved for IP multicast.
