Chapter 18. Preparing and Deploying Server Clusters

Clustering technologies allow servers to be connected into multiple-server units called server clusters. Each computer connected in a server cluster is referred to as a node. Nodes work together, acting as a single unit, to provide high availability for business applications and other critical resources, such as Microsoft Internet Information Services (IIS), Microsoft SQL Server, or Microsoft Exchange Server. Clustering allows administrators to manage the cluster nodes as a single system rather than as individual systems. Clustering allows users to access cluster resources as a single system as well. In most cases, the user doesn't even know the resources are clustered.

Microsoft Windows Server 2003 supports three cluster technologies:

  • Network Load Balancing. Network Load Balancing provides failover support for Internet Protocol (IP)–based applications and services that require high scalability and availability. By using Network Load Balancing, organizations can build groups of clustered computers to support load balancing of Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Generic Routing Encapsulation (GRE) traffic requests. Front-end Web servers are ideal candidates for Network Load Balancing.

  • Component Load Balancing. Component Load Balancing provides dynamic load balancing of application components that use COM+. By using Component Load Balancing, COM+ components can be load balanced over multiple nodes to enhance the availability and scalability of software applications. Middle-tier application servers are ideal candidates for Component Load Balancing.

  • Server cluster. Server cluster provides failover support for applications and services that require high availability, scalability, and reliability. By using server clustering, organizations can make applications and data available on multiple servers linked together in a cluster configuration. Back-end applications and services, such as those provided by database servers, are ideal candidates for Server cluster.

These cluster technologies are discussed in this chapter so that you can plan for and implement your organization's high-availability needs.

Introducing Server Clustering

A server cluster is a group of two or more servers functioning together to provide essential applications or services seamlessly to enterprise clients. The servers are physically connected together by a network and might share storage devices. Server clusters are designed to protect against the following types of failure:

  • Application and service failure, which could be caused by application software or essential services becoming unavailable

  • System and hardware failure, which could be caused by problems with hardware components such as central processing units (CPUs), drives, memory, network adapters, and power supplies

  • Site failure, which could be caused by natural disaster, power outages, or connectivity outages

You can use cluster technologies to increase overall availability while minimizing single points of failure and reducing costs by using industry-standard hardware and software. Each cluster technology has a specific purpose and is designed to meet different requirements. Network Load Balancing is designed to address bottlenecks caused by Web services. Component Load Balancing is designed to address the unique scalability and availability needs of Web-based applications. Server cluster is designed to maintain data integrity and provide failover support.

The clustering technologies can be and often are combined to architect a comprehensive service offering. The most common scenario in which all three solutions are combined is a commercial Web site where the site's Web servers use Network Load Balancing, application servers use Component Load Balancing, and back-end database servers use Server cluster.

Benefits and Limitations of Clustering

A server cluster provides high availability by making application software and data available on several servers linked together in a cluster configuration. If a server stops functioning, a failover process can automatically shift the workload of the failed server to another server in the cluster. The failover process is designed to ensure continuous availability for critical applications and data.

Although clusters can be designed to handle failure, they are not fault tolerant with regard to user data. The cluster by itself doesn't guard against loss of a user's work. Typically, the recovery of lost work is handled by the application software, meaning the application software must be designed to recover the user's work or it must be designed in such a way that the user session state can be maintained in the event of failure.

Clusters help to resolve the need for high availability, high reliability, and high scalability. High availability refers to the ability to provide user access to an application or a service for a high percentage of its scheduled hours of operation while minimizing unscheduled outages. A cluster implementation is highly available if it meets the organization's scheduled uptime goals. Availability goals are achieved by reducing unplanned downtime and then working to improve total hours of operation for the related applications and services.
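Availability is commonly quantified as the percentage of scheduled time that an application or service is actually available. The minimal sketch below, written in Python with hypothetical figures (not taken from this chapter), shows the arithmetic behind an uptime goal:

    # Availability = (scheduled time - unplanned downtime) / scheduled time
    # The figures below are hypothetical and for illustration only.
    scheduled_hours = 24 * 365               # service scheduled around the clock
    unplanned_downtime_hours = 8.76          # total unscheduled outage for the year

    availability = (scheduled_hours - unplanned_downtime_hours) / scheduled_hours
    print(f"Availability: {availability:.3%}")   # -> Availability: 99.900%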

High reliability refers to the ability to reduce the frequency of system failure while attempting to provide fault tolerance in case of failure. A cluster implementation is highly reliable if it minimizes the number of single points of failure and reduces the risk that failure of a single component or system will result in the outage of all applications and services offered. Reliability goals are achieved by using redundant, fault-tolerant hardware components, application software, and systems.

High scalability refers to the ability to add resources and computers while attempting to improve performance. A cluster implementation is highly scalable if it can be scaled up and out. Individual systems can be scaled up by adding more resources such as CPUs, memory, and disks. The cluster implementation can be scaled out by adding more computers.

Tip: Design for availability

A well-designed cluster implementation uses redundant systems and components so that the failure of an individual server doesn't affect the availability of the related applications and services.

Although a well-designed solution can guard against application failure, system failure, and site failure, cluster technologies do have limitations.

Cluster technologies depend on compatible applications and services to operate properly. The software must respond appropriately when failure occurs. Cluster technology cannot protect against failures caused by viruses, software corruption, or human error. To protect against these types of problems, organizations need solid data protection and recovery plans.

Cluster Organization

Clusters are organized in loosely coupled groups often referred to as farms or packs. A farm is a group of servers that run similar services but don't typically share data. They are called a farm because they handle whatever requests are passed to them using identical copies of data stored locally. Because they use identical copies of data rather than sharing data, members of a farm operate autonomously and are also referred to as clones.

A pack is a group of servers that operate together and share partitioned data. They are called a pack because they work together to manage and maintain services. Because members of a pack share access to partitioned data, they have unique operating modes and usually access the shared data on disk drives to which all members of the pack are connected.

In most cases, Web and application services are organized as farms, while back-end databases and critical support services are organized as packs. Web servers running IIS and using Network Load Balancing are an example of a farm. In a Web farm, identical data is replicated to all servers in the farm and each server can handle any request that comes to it by using local copies of data. For example, you might have a group of five Web servers using Network Load Balancing, each with its own local copy of the Web site data.

Database servers running SQL Server and Server cluster with partitioned database views are an example of a pack. Here, members of the pack share access to the data and have a unique portion of data or logic that they handle rather than handling all data requests. For example, in a two-node SQL Server cluster, one database server might handle accounts that begin with the letters A through M and another database server might handle accounts that begin with the letters N through Z.
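The routing logic behind such a partitioned configuration can be expressed in a few lines. The following Python sketch is a conceptual illustration only; the node names and letter ranges are hypothetical, and SQL Server implements this through partitioned views rather than application code:

    # Route each account request to the pack member that owns its letter range.
    # Node names and partition ranges are hypothetical.
    PARTITIONS = {
        ("A", "M"): "sqlnode1",
        ("N", "Z"): "sqlnode2",
    }

    def owner_node(account_name: str) -> str:
        first = account_name[0].upper()
        for (low, high), node in PARTITIONS.items():
            if low <= first <= high:
                return node
        raise ValueError(f"No partition owns accounts starting with {first!r}")

    print(owner_node("Adams"))    # -> sqlnode1
    print(owner_node("Nguyen"))   # -> sqlnode2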

Servers that use clustering technologies are often organized using a three-tier structure. The tiers in the architecture are composed as follows:

  • Tier 1 includes the Web servers, which are also called front-end Web servers. Front-end Web servers typically use Network Load Balancing.

  • Tier 2 includes the application servers, which are often referred to as the middle-tier servers. Middle-tier servers typically use Component Load Balancing.

  • Tier 3 includes the database servers, file servers, and other critical support servers, which are often called back-end servers. Back-end servers typically use Server cluster.

As you set out to architect your cluster solution, you should try to organize servers according to the way they will be used and the applications they will be running. In most cases, Web servers, application servers, and database servers are all organized in different ways.

By using proper architecture, the servers in a particular tier can be scaled out or up as necessary to meet growing performance and throughput needs. When you are looking to scale out by adding servers to the cluster, the clustering technology and the server operating system used are both important:

  • All editions of Windows Server 2003 support up to 32-node Network Load Balancing clusters.

  • Enterprise Edition and Datacenter Edition support up to 8-node Component Load Balancing clusters.

  • Enterprise Edition and Datacenter Edition support Server cluster, allowing up to 8-node clusters.

When looking to scale up by adding CPUs and random access memory (RAM), the edition of the server operating system used is extremely important. In terms of both processor and memory capacity, Datacenter Edition is much more expandable. Standard Edition supports up to 4 processors and 4 gigabytes (GB) of RAM on 32-bit systems and 32 GB of RAM on 64-bit systems. Enterprise Edition supports up to 8 processors and 32 GB of RAM on 32-bit platforms and up to 64 GB of RAM on 64-bit platforms. Datacenter Edition supports up to 64 GB of RAM and 32 processors on 32-bit platforms and up to 512 GB of RAM and 128 processors on 64-bit platforms.

As you look at scalability requirements, keep in mind the real business needs of the organization. The goal should be to select the right edition of the Windows operating system to meet current and future needs. The number of servers needed depends on the anticipated server load as well as the size and types of requests the servers will handle. Processors and memory should be sized appropriately for the applications and services the servers will be running as well as the number of simultaneous user connections.

Cluster Operating Modes

For Network Load Balancing and Component Load Balancing, cluster nodes usually are identical copies of each other. Because of this, all members of the cluster can actively handle requests, and they can do so independently of each other. When members of a cluster share access to data, however, they have unique operating requirements, as is the case with Server cluster.

For Server cluster, nodes can be either active or passive. When a node is active, it is actively handling requests. When a node is passive, it is idle, on standby waiting for another node to fail. Multinode clusters can be configured by using different combinations of active and passive nodes.

When you are architecting multinode clusters, the decision as to whether nodes are configured as active or passive is extremely important. If an active node fails and a passive node is available, applications and services running on the failed node can be transferred to the passive node. Because the passive node has no current workload, the server should be able to assume the workload of the other server without any problems (provided all servers have the same hardware configuration). If all servers in a cluster are active and a node fails, the applications and services running on the failed node can be transferred to another active node. Unlike a passive node, however, an active server already has a processing load and must be able to handle the additional processing load of the failed server. If the server isn't sized to handle multiple workloads, it can fail as well.

In a multinode configuration where there is one passive node for each active node, the servers could be configured so that under average workload they use about 50 percent of processor and memory resources. In the four-node configuration depicted in Figure 18-1, in which failover goes from one active node to a specific passive node, this could mean two active nodes (A1 and A2) and two passive nodes (P1 and P2), each with four processors and 4 GB of RAM. Here, node A1 fails over to node P1, and node A2 fails over to node P2, with the extra capacity used to handle peak workloads.

Figure 18-1. Clustering can be implemented in many ways; these are examples

In a configuration in which there are more active nodes than passive nodes, the servers can be configured so that under average workload they use a proportional percentage of processor and memory resources. In the four-node configuration also depicted in Figure 18-1, in which nodes A, B, C, and D are configured as active and failover could go between nodes A and B or nodes C and D, this could mean configuring servers so that they use about 25 percent of processor and memory resources under an average workload. Here, node A could fail over to B (and vice versa), or node C could fail over to D (and vice versa). Because each server must be able to handle two workloads in case of a node failure, the processor and memory configuration would need to be at least doubled: instead of four processors and 4 GB of RAM, each server would use eight processors and 8 GB of RAM.
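The sizing rule behind both configurations reduces to simple arithmetic: any node that can receive a failed-over workload must be able to carry its own average load plus the failed node's load, with headroom left for peaks. A minimal sketch, assuming hypothetical utilization figures:

    # A node can absorb a partner's workload only if the combined average
    # utilization stays below 100 percent, minus headroom for peak loads.
    # Utilization figures are hypothetical and for illustration only.
    def can_absorb(own_avg_util: float, partner_avg_util: float,
                   peak_headroom: float = 0.2) -> bool:
        return own_avg_util + partner_avg_util <= 1.0 - peak_headroom

    print(can_absorb(0.0, 0.5))     # passive node absorbing an active load -> True
    print(can_absorb(0.25, 0.25))   # active/active pair at 25 percent each -> True
    print(can_absorb(0.5, 0.5))     # active/active pair at 50 percent each -> False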

When Server cluster has multiple active nodes, data must be shared between applications running on the clustered servers. In many cases, this is handled by using a shared-nothing database configuration. In a shared-nothing database configuration, the application is partitioned to access private database sections. This means that a particular node is configured with a specific view into the database that allows it to handle specific types of requests, such as account names that start with the letters A through F, and that it is the only node that can update the related section of the database (which eliminates the possibility of corruption from simultaneous writes by multiple nodes). Both Exchange Server 2003 and SQL Server 2000 support multiple active nodes and shared-nothing database configurations.

As you consider the impact of operating modes in the cluster architecture, you should look carefully at the business requirements and the expected server loads. By using Network Load Balancing and Component Load Balancing, all servers are active, and the architecture is scaled out by adding more servers, which typically are configured identically to the existing Network Load Balancing and Component Load Balancing nodes. By using Server cluster, nodes can be either active or passive, and the configuration of nodes depends on the operating mode (active or passive) as well as how failover is configured. A server that is designated to handle failover must be sized to handle the workload of the failed server as well as its own current workload (if any). Additionally, both average and peak workloads must be considered. Servers need additional capacity to handle peak loads.

Multisite Options for Clusters

Some large organizations build disaster recovery and increased availability into their infrastructure using multiple physical sites. Multisite architecture can be designed in many ways. In most cases, the architecture has a primary site and one or more remote sites. Figure 18-2 shows an example of a primary site and a remote site for a large commercial Web site.

Figure 18-2. Enterprise architecture for a large commercial Web site that has multiple physical locations

As shown in Figure 18-2, the architecture at the remote site mirrors that of the primary site. The level of integration for multiple sites and the level at which components are mirrored between sites depends on the business requirements. With a full implementation, the complete infrastructure of the primary site could be re-created at remote sites. This allows for a remote site to operate independently or to handle the full load of the primary site if necessary. Here, the design should incorporate real-time replication and synchronization for databases and applications. Real-time replication ensures a consistent state for data and application services between sites. If real-time updates are not possible, databases and applications should be replicated and synchronized as rapidly as possible.

With a partial implementation, only essential components are installed at remote sites, with the goal of handling overflow in peak periods, maintaining uptime on a limited basis if the primary site fails, or providing limited services on an ad hoc basis. One technique is to replicate static content on Web sites and read-only data from databases. This allows remote sites to handle requests for static content and other data that changes infrequently. Users could browse sites and access account information, product catalogs, and other services. If they must access dynamic content or modify information (add, change, delete), the sites' geographical load balancers could redirect the users to the primary site.
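The read/write split a geographical load balancer applies at a partial remote site can be sketched as follows. This is a conceptual illustration only; the site names and the request model are hypothetical, and real load balancers implement this kind of rule in their own configuration:

    # Partial remote site: serve static and read-only requests locally;
    # redirect anything that modifies data to the primary site.
    # Site names and the request model are hypothetical.
    READ_METHODS = {"GET", "HEAD"}

    def route_request(method: str) -> str:
        if method.upper() in READ_METHODS:
            return "remote-site"      # static content, catalogs, account lookups
        return "primary-site"         # adds, changes, and deletes

    print(route_request("GET"))       # -> remote-site
    print(route_request("POST"))      # -> primary-site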

Another partial implementation technique is to implement all layers of the infrastructure but with fewer redundancies in the architecture or to implement only core components, relying on the primary site to provide the full array of features. By using either technique, the design might need to incorporate near real-time replication and synchronization for databases and applications. This ensures a consistent state for data and application services.

A full or partial design could also use geographically dispersed clusters running Server cluster. Geographically dispersed clusters use virtual local area networks (VLANs) to connect storage area networks (SANs) over long distances. A VLAN connection with latency of 500 milliseconds or less is required; at higher latencies, cluster consistency cannot be reliably maintained. Geographically dispersed clusters are also referred to as stretched clusters.

For geographically dispersed clusters, Windows Server 2003 supports a majority node set quorum resource. Majority node clustering changes the way the cluster quorum resource is used to allow cluster servers to be geographically separated while maintaining consistency in the event of node failure. In a standard cluster configuration, the quorum resource writes information on all cluster database changes to the recovery logs, ensuring that the cluster configuration and state data can be recovered. The quorum resource resides on the shared disk drives and can be used to verify whether other nodes in the cluster are functioning.

In a majority node cluster configuration, the quorum resource is configured as a majority node set resource. This allows the quorum data, which includes cluster configuration changes and state information, to be stored on the system disk of each node in the cluster. Because the data is stored locally on each node even though the cluster is geographically dispersed, the cluster can be maintained in a consistent state. As the name implies, the majority of nodes must be available for this cluster configuration to operate normally. Should the cluster state become inconsistent, you can force the quorum to restore a consistent state. An algorithm also runs on the cluster nodes to help keep the cluster state consistent.
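The majority rule itself is straightforward to express. The Python sketch below shows the arithmetic only; the actual cluster service logic is considerably more involved:

    # A majority node set cluster operates normally only while more than half
    # of its configured nodes are available. Sketch of the arithmetic only.
    def has_quorum(total_nodes: int, available_nodes: int) -> bool:
        return available_nodes > total_nodes // 2

    print(has_quorum(3, 2))   # -> True: two of three nodes is a majority
    print(has_quorum(4, 2))   # -> False: two of four is not a majority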
