6.7. Achieving the Big Bang

So, how does Codexa's Big Bang Service scale to meet requirements that would bring most networked systems applications to their knees? Let's look at some of the issues in more detail.

6.7.1. Channel Neutrality

The human interfaces to the Codexa Service can include a graphical user interface, a standard Web browser, e-mail information distribution (for example, newsletters), and consumer devices, such as palm-tops, personal communication service (PCS)-based cellular phones and alphanumeric pagers. Such a channel-neutral system requires the flexibility to access the same business applications using a variety of front-end technologies. J2EE technology is a natural fit for this kind of architecture, because it is inherently layered. Channel-specific processes are handled at the Web or WAP server level, while core business logic and processes are executed in the application server and shared among channels.

To separate application and presentation logic, all Codexa GUIs adhere to the model-view-controller (MVC) pattern for data access and manipulation. The model is the XML-based data model residing on the application server. Depending on the delivery channel, the view is rendered through the Java Foundation Classes (JFC, commonly referred to as Swing), HTML, Wireless Markup Language (WML), or raw XML (for a system-to-system interface). The controller always resides on Codexa's servers as servlets, EJBs, or the KnowledgeMQ. When a user request comes in, the system uses Extensible Stylesheet Language Transformations (XSLT) to transform the XML document into the appropriate delivery format: HTML, WML (for WAP devices), or XML.
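The per-channel transformation step can be illustrated with the standard JAXP transformation API. The following is only a minimal sketch under stated assumptions: the `ChannelRenderer` class, the channel names, the `<quote>` document, and both stylesheets are hypothetical and are not taken from the Codexa code base.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: one XML model, one XSLT stylesheet per delivery channel.
class ChannelRenderer {
    private static final String XSL_HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

    // Illustrative stylesheets; a real system would load these from files.
    static final String HTML_XSL = XSL_HEAD
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/quote'>"
        + "<p><b><xsl:value-of select='symbol'/></b>: <xsl:value-of select='price'/></p>"
        + "</xsl:template></xsl:stylesheet>";

    static final String WML_XSL = XSL_HEAD
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/quote'>"
        + "<wml><card><p><xsl:value-of select='symbol'/>: "
        + "<xsl:value-of select='price'/></p></card></wml>"
        + "</xsl:template></xsl:stylesheet>";

    static final Map<String, String> STYLESHEETS = Map.of("html", HTML_XSL, "wml", WML_XSL);

    // Renders the shared XML model for the requested channel.
    static String render(String modelXml, String channel) {
        if (channel.equals("xml")) {
            return modelXml; // system-to-system interface: raw XML, no transformation
        }
        String xsl = STYLESHEETS.get(channel);
        if (xsl == null) {
            throw new IllegalArgumentException("Unknown channel: " + channel);
        }
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(modelXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed for channel " + channel, e);
        }
    }

    public static void main(String[] args) {
        String model = "<quote><symbol>ACME</symbol><price>12.5</price></quote>";
        System.out.println(render(model, "html"));
        System.out.println(render(model, "wml"));
        System.out.println(render(model, "xml"));
    }
}
```

Because the model never changes, supporting a new channel in a design like this amounts to adding one more stylesheet entry rather than touching business logic.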

6.7.2. Scalability

The Codexa Service must scale to support tremendous volumes of data and peak loads, because system response is most critical to users at the same times the system is most heavily used. At times of financial crisis, both client usage and incoming data levels peak. The Codexa Service can meet these demands because its structure, with a highly distributed implementation of the J2EE architecture and GemStone/J's multi-virtual machine configuration, allows the system to scale within platforms and across many platforms transparently.

Figure 6.7 shows Codexa's deployment architecture. Each of the systems depicted in the deployment architecture is designed to scale, based on load. Web services scale in the traditional Web-site fashion, through intelligently applied algorithms for distribution of requests. One of the common pitfalls of any balanced access to Web servers is that during a long transaction-oriented session, DNS must return the client to the same Web server. To overcome this obstacle, Codexa's production application server uses GemStone/J's distributed session tracking feature. In GemStone/J, client session state is automatically stored in the persistent object cache, so client requests can be assigned to different Web servers and servlet engines. When a request comes in, the new servlet engine retrieves the session state from the PCA using the client ID and then services the request, enabling true round-robin access for DNS.
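The idea of keeping session state out of the individual servlet engines can be sketched in plain Java. This is an illustration only: `SharedSessionStore` is an in-memory stand-in for the persistent cache, and none of the class or method names below come from GemStone/J, whose actual API is not shown here.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Assumption: an in-memory map stands in for the shared persistent cache.
class SharedSessionStore {
    private final Map<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    // Returns the session for a client ID, creating it on first use.
    Map<String, String> sessionFor(String clientId) {
        return sessions.computeIfAbsent(clientId, id -> new ConcurrentHashMap<>());
    }
}

// A servlet engine that keeps no session state of its own.
class ServletEngine {
    final String name;
    private final SharedSessionStore store;

    ServletEngine(String name, SharedSessionStore store) {
        this.name = name;
        this.store = store;
    }

    // Loads the session by client ID, updates it, and reports who served it.
    String addToWatchList(String clientId, String symbol) {
        Map<String, String> session = store.sessionFor(clientId);
        String updated = session.merge("watchList", symbol, (old, s) -> old + "," + s);
        return name + " -> " + updated;
    }
}

// Hands out engines in strict rotation, with no client affinity.
class RoundRobinDispatcher {
    private final List<ServletEngine> engines;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinDispatcher(List<ServletEngine> engines) {
        this.engines = engines;
    }

    ServletEngine pick() {
        return engines.get(Math.floorMod(next.getAndIncrement(), engines.size()));
    }

    public static void main(String[] args) {
        SharedSessionStore store = new SharedSessionStore();
        RoundRobinDispatcher dispatcher = new RoundRobinDispatcher(List.of(
            new ServletEngine("engine-A", store), new ServletEngine("engine-B", store)));
        System.out.println(dispatcher.pick().addToWatchList("client-42", "IBM"));
        System.out.println(dispatcher.pick().addToWatchList("client-42", "SUNW"));
    }
}
```

In this sketch the second request lands on a different engine yet still sees the state written by the first, which is the property that removes the need for sticky routing.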

Figure 6.7. The Codexa Service Deployment Architecture


To achieve optimal per-process performance, Codexa takes advantage of GemStone/J's multi-virtual machine architecture. Processing and resource utilization vary greatly among virtual machines in any truly distributed system, and Codexa's needs are no different. Each component within the Codexa Service leverages a VM configuration specific to its needs. For example, a traditional “client” VM that a user would leverage is throughput-intensive, yet not CPU-intensive. A system CORBA VM, such as the one used by the knowledge filter, requires greater CPU capacity per system.

The Codexa Service makes the most of its hardware by running multiple, specially tuned virtual machines in each application server to handle different kinds of processes. GemStone/J virtual machines have a number of configuration options: optimum/maximum number of clients, Java heap size, lifespan before they are recycled, code to initialize a given virtual machine's resources, and so forth. The Codexa Service therefore tunes one or more virtual machines to the needs of each process type, and the GemStone/J Activator assigns each incoming request to an appropriate virtual machine based on its configuration and actual current workload. The throughput-intensive “client” virtual machine is configured for a large client load with a larger memory allocation, while the CPU-intensive system CORBA virtual machine used by the knowledge filter is configured for fewer client processes within a single virtual machine. The system can activate more virtual machines on demand to handle possible spikes in throughput. Codexa typically runs dozens of virtual machines at a time on each application server.
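The assignment policy described here (match the process type, prefer the least-loaded virtual machine, start another one when all are full) can be sketched as follows. The classes `VmProfile`, `PooledVm`, and `ActivatorSketch`, and the numbers used in `main`, are hypothetical; the real GemStone/J Activator is configured through its own tools, which are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical per-process-type tuning profile.
class VmProfile {
    final String processType;
    final int maxClients;
    final int heapMb;

    VmProfile(String processType, int maxClients, int heapMb) {
        this.processType = processType;
        this.maxClients = maxClients;
        this.heapMb = heapMb;
    }
}

// One running virtual machine and its current workload.
class PooledVm {
    final VmProfile profile;
    int activeClients;

    PooledVm(VmProfile profile) {
        this.profile = profile;
    }

    boolean hasCapacity() {
        return activeClients < profile.maxClients;
    }

    double load() {
        return (double) activeClients / profile.maxClients;
    }
}

// Assigns each request to a suitable VM, starting a new one on demand.
class ActivatorSketch {
    private final Map<String, VmProfile> profiles;
    private final List<PooledVm> pool = new ArrayList<>();

    ActivatorSketch(Map<String, VmProfile> profiles) {
        this.profiles = profiles;
    }

    PooledVm assign(String processType) {
        VmProfile profile = profiles.get(processType);
        if (profile == null) {
            throw new IllegalArgumentException("No profile for " + processType);
        }
        // Prefer the least-loaded VM of the right type that still has room.
        PooledVm vm = pool.stream()
            .filter(v -> v.profile == profile && v.hasCapacity())
            .min(Comparator.comparingDouble(PooledVm::load))
            .orElseGet(() -> {
                PooledVm fresh = new PooledVm(profile); // spike: activate another VM
                pool.add(fresh);
                return fresh;
            });
        vm.activeClients++;
        return vm;
    }

    int poolSize() {
        return pool.size();
    }

    public static void main(String[] args) {
        // Illustrative values only: many clients and a large heap for "client" VMs,
        // few clients per VM for the CPU-intensive "corba" VMs.
        ActivatorSketch activator = new ActivatorSketch(Map.of(
            "client", new VmProfile("client", 200, 1024),
            "corba", new VmProfile("corba", 10, 256)));
        activator.assign("client");
        activator.assign("corba");
        System.out.println("VMs running: " + activator.poolSize());
    }
}
```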

The other major issue with scalability is database access. Codexa's data stores are huge: the system stores several gigabytes of new data every day, and a terabyte of disk can be dedicated to persistent object storage alone. To speed access, domain data is divided among several GemStone/J databases, one each for Wilshire 5000 company data, securities information, and earnings information. The Data Services' RDBMS is Sybase ASE 12.0, which has a number of mechanisms in place to support high availability and fault tolerance.

Finally, in order to achieve scalability in the Java platform, processes must be able to run in a federated fashion in multiple virtual machines. The management of the federation of processes is the responsibility of the application server. Codexa leverages GemStone/J's PCA again to create a distributed object network that will enable not only federation of virtual machines but also horizontal scaling of the Application Services systems.

6.7.3. Security

Codexa requires air-tight security, because the search and reporting parameters its financial analyst clients set in the system are vital intellectual property that must not be visible to anyone else. Furthermore, Codexa has access to a number of on-line information providers for data used in its analyses. Some of this information can be provided to clients only in accordance with the clients' subscription agreements with the information provider. So, for example, some Codexa clients may be authorized to see only summary data, while others are also permitted to view some of the underlying details, drawn from various sources, that led to that aggregated data. Codexa keeps up-to-date security information about which clients are authorized to see information from which providers and enforces those agreements on the providers' behalf.
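A per-provider entitlement check of this kind might look like the sketch below. It assumes just two access levels, with detail access implying summary access; that ordering, the `EntitlementRegistry` class, and the client and provider IDs are illustrative assumptions rather than a description of Codexa's actual security model.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Assumption: two levels, declared in increasing order of privilege.
enum AccessLevel { SUMMARY, DETAIL }

// Hypothetical registry of what each client may see from each provider.
class EntitlementRegistry {
    // clientId -> providerId -> highest level granted
    private final Map<String, Map<String, AccessLevel>> grants = new HashMap<>();

    void grant(String clientId, String providerId, AccessLevel level) {
        grants.computeIfAbsent(clientId, id -> new HashMap<>()).put(providerId, level);
    }

    // True only if the client's subscription covers the requested level.
    boolean mayView(String clientId, String providerId, AccessLevel requested) {
        AccessLevel granted = grants
            .getOrDefault(clientId, Collections.emptyMap())
            .get(providerId);
        return granted != null && granted.compareTo(requested) >= 0;
    }

    public static void main(String[] args) {
        EntitlementRegistry registry = new EntitlementRegistry();
        registry.grant("fund-a", "provider-x", AccessLevel.SUMMARY);
        registry.grant("fund-b", "provider-x", AccessLevel.DETAIL);
        System.out.println(registry.mayView("fund-a", "provider-x", AccessLevel.DETAIL));
        System.out.println(registry.mayView("fund-b", "provider-x", AccessLevel.DETAIL));
    }
}
```

Denying by default (an unknown client or provider yields `false`) keeps the check conservative when subscription data is missing.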

A combination of the Java Security Architecture (JSA), Java Cryptography Architecture (JCA), and Intel's Common Data Security Architecture (CDSA) enables Codexa to provide very high security for object access, within individual virtual machines and across its distributed system. The CDSA defines the core issues surrounding secure distributed systems: data and user authentication. The JSA addresses core issues surrounding security within a given virtual machine. The JCA addresses core issues surrounding public key infrastructure (PKI)-related technologies. The CDSA describes a modular, layered architecture (see Figure 6.8) that enables security infrastructure at any level.

Figure 6.8. Codexa's Modular, Layered Security Architecture


This component-based security architecture enables the extensibility that is necessary as the system's security constraints evolve. Each of the base modules can exist on a per-machine basis, as well as in a federated model. The per-machine basis of the model can be enabled through off-the-shelf implementations of common security services. A federated common security services manager (CSSM) provides a programmatic interface that supports the JSA, the JCA, and the Java Authentication and Authorization Service (JAAS).

Security for the Codexa Service deployment architecture (Figure 6.7) is maintained through three “zones of trust.” The first is the militarized zone of the Internet, which is protected through standard Internet security, such as SSL. The second is the demilitarized zone of Web services, which is protected by internal network partitioning and private address masking. The third is the production zone, where user authentication and authorization services control user access to methods and data.

6.7.4. Very High Availability

Financial institutions rely on the Codexa application to provide time-sensitive decision-support data whenever and wherever they demand it. The Codexa application, therefore, requires very high up-time. It must handle peak loads, and it must be able to incorporate improvements and upgrades without taking the system offline. J2EE supports this because it is dynamic and layered, and GemStone/J takes advantage of J2EE's architecture to provide precision failover of system resources and online deployment, upgrades, and maintenance.

6.7.5. Precision Failover

Often, availability problems occur at the lowest levels of a system's computing resources. Many availability solutions, however, must shut down the entire system to repair a minor problem. Codexa manages this issue by taking advantage of GemStone/J's Precision Failover technology, which monitors and handles recovery for critical software processes and reinitiates the process as necessary, without system down-time.

Perhaps most critical for Codexa is the GemStone/J “buddy system.” The active Global Name Service (GNS), a JNDI service, monitors the other processes running in the system, including the Activator, the persistent cache's Repository Name Service (RNS), and the PCA manager process. If any of these processes becomes unresponsive, the GNS will stop and restart it. The active GNS in the primary server also tracks the GemStone/J processes in secondary machines. A “buddy” process monitors the active GNS. If the GNS fails, the buddy will shut it down and initiate a restart. Since many of these are separate processes, a failure of a component should not interrupt the active requests in the system.
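The two-level monitoring arrangement can be modeled in a few lines of Java. This is a simplified, synchronous sketch: the class names are hypothetical, "processes" are plain objects with a responsiveness flag, and a real monitor would poll on a timer rather than being called by hand.

```java
import java.util.List;

// A monitored process, reduced to a responsiveness flag and a restart counter.
class MonitoredProcess {
    final String name;
    private boolean responsive = true;
    private int restarts;

    MonitoredProcess(String name) {
        this.name = name;
    }

    boolean isResponsive() {
        return responsive;
    }

    int restartCount() {
        return restarts;
    }

    void hang() {
        responsive = false; // simulates the process becoming unresponsive
    }

    void restart() {
        responsive = true; // stop and start again
        restarts++;
    }
}

// Stands in for the active name service that watches the other processes.
class NameServiceMonitor extends MonitoredProcess {
    private final List<MonitoredProcess> watched;

    NameServiceMonitor(List<MonitoredProcess> watched) {
        super("GNS");
        this.watched = watched;
    }

    // Restarts every watched process that no longer responds.
    void checkWatched() {
        for (MonitoredProcess process : watched) {
            if (!process.isResponsive()) {
                process.restart();
            }
        }
    }
}

// The buddy watches the monitor itself, so the monitor is not a single point of failure.
class BuddyProcess {
    private final NameServiceMonitor gns;

    BuddyProcess(NameServiceMonitor gns) {
        this.gns = gns;
    }

    void check() {
        if (!gns.isResponsive()) {
            gns.restart();
        }
    }

    public static void main(String[] args) {
        MonitoredProcess activator = new MonitoredProcess("Activator");
        NameServiceMonitor gns = new NameServiceMonitor(List.of(activator));
        BuddyProcess buddy = new BuddyProcess(gns);
        activator.hang();
        gns.checkWatched();
        gns.hang();
        buddy.check();
        System.out.println(activator.name + " restarts: " + activator.restartCount());
        System.out.println(gns.name + " restarts: " + gns.restartCount());
    }
}
```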

6.7.6. Transparent Client Session State Persistence

By storing client session state in the GemStone/J persistent cache, Codexa achieves increased availability, as well as scalability. If a virtual machine in the system fails, the Activator automatically reroutes client requests to another pooled virtual machine or starts a new virtual machine, if necessary. If the replacement virtual machine is on the same machine, it may be able to continue processing objects already in the shared object cache. If a servlet engine virtual machine fails, the Web-server adapter can reroute requests to another servlet engine virtual machine, which then retrieves session state from the persistent object cache using the client ID.
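The rerouting behavior can be sketched as a router that skips failed virtual machines and starts a replacement when none is left, while per-client state lives outside any single VM. The names `WorkerVm` and `FailoverRouter` are hypothetical, and the in-memory map is only a stand-in for the persistent cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A pooled virtual machine, reduced to a name and a liveness flag.
class WorkerVm {
    final String name;
    boolean alive = true;

    WorkerVm(String name) {
        this.name = name;
    }
}

class FailoverRouter {
    // Assumption: this map stands in for session state held in the persistent cache.
    private final Map<String, Integer> requestCounts = new ConcurrentHashMap<>();
    private final List<WorkerVm> pool = new ArrayList<>();

    FailoverRouter(int initialVms) {
        for (int i = 0; i < initialVms; i++) {
            startVm();
        }
    }

    private WorkerVm startVm() {
        WorkerVm vm = new WorkerVm("vm-" + (pool.size() + 1));
        pool.add(vm);
        return vm;
    }

    // Simulates the failure of a named VM.
    void fail(String vmName) {
        for (WorkerVm vm : pool) {
            if (vm.name.equals(vmName)) {
                vm.alive = false;
            }
        }
    }

    // Routes to the first live VM, starting a new one if every pooled VM has failed.
    String handle(String clientId) {
        WorkerVm vm = pool.stream()
            .filter(v -> v.alive)
            .findFirst()
            .orElseGet(this::startVm);
        int count = requestCounts.merge(clientId, 1, Integer::sum);
        return vm.name + " served request " + count + " for " + clientId;
    }

    public static void main(String[] args) {
        FailoverRouter router = new FailoverRouter(1);
        System.out.println(router.handle("c1"));
        router.fail("vm-1");
        System.out.println(router.handle("c1"));
    }
}
```

The request counter continues from where it left off after the failure because it was never stored in the failed VM, which is the point the paragraph above makes about session state.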

6.7.7. Lifecycle Management and Availability

To maintain full-time availability, the Codexa application must be functional even during system expansion, reconfiguration and tuning, and application deployments and updates. GemStone/J enables Codexa system administrators to dynamically maintain and upgrade the application without taking the Web site, application, application server, or hardware server offline. The application can be dynamically configured for scalability by adding, removing, or reconfiguring virtual machines or other software components, or by reconfiguring for performance tuning.

Comprehensive APIs within GemStone/J also give the Codexa application dynamic control of all parts of a system. These tools include both command-line controls and a graphical UI, and configuration changes can be saved automatically without restarting the system.

New versions of the Codexa application can be deployed with the same name while clients are still executing the prior version of the application. New clients are automatically given the new version of the application. This is enabled through the use of CORBA and virtual machine management.
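The version-pinning behavior described above can be sketched independently of CORBA: each client session is bound to whichever version was current when the session began, and a deployment changes only what later sessions receive. The `VersionedDeployment` class and the version strings are hypothetical, and the sketch does not show how GemStone/J implements this.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of hot deployment with per-session version pinning.
class VersionedDeployment {
    private volatile String currentVersion;
    private final Map<String, String> sessionVersion = new ConcurrentHashMap<>();

    VersionedDeployment(String initialVersion) {
        this.currentVersion = initialVersion;
    }

    // Deploys a new version under the same application name; no restart needed.
    void deploy(String newVersion) {
        currentVersion = newVersion;
    }

    // Existing sessions keep their version; new sessions get the current one.
    String versionFor(String clientId) {
        return sessionVersion.computeIfAbsent(clientId, id -> currentVersion);
    }

    // Ending a session releases its pin, so the next session starts on the latest version.
    void endSession(String clientId) {
        sessionVersion.remove(clientId);
    }

    public static void main(String[] args) {
        VersionedDeployment app = new VersionedDeployment("1.0");
        System.out.println("alice: " + app.versionFor("alice"));
        app.deploy("1.1");
        System.out.println("alice: " + app.versionFor("alice"));
        System.out.println("bob: " + app.versionFor("bob"));
    }
}
```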

6.7.8. Extensibility

Extensibility is a key requirement and capability of the Codexa Service, as evidenced by this quote from an early engineering specification: “At first glance, Codexa's Data Services may appear quite complex, with the interdependencies among components not immediately clear. The apparent complexity only increases when one realizes that Codexa's set of core technologies must support not only Codexa's requirements, but those of its clients.” Fortunately, the J2EE platform provides a set of standards that make extensibility both possible and practical.

The J2EE architecture includes JNDI, EJB, JSP, JMS, JTS, the Connector Architecture (which Codexa intends to embrace when it is finalized), CORBA, JDBC, XML, and Java Servlets technology. Through strict adherence to these standards and by exposing only the J2EE platform's standard APIs, Codexa has created a distributed information system infrastructure for which customers can write their own modules without having to learn proprietary APIs. And this applies to all third-party tools the system uses.

Going forward, Codexa will be able to apply a large body of working, production-tested components to the information synthesis needs of a broad array of financial and nonfinancial professionals. And as other divisions of Codexa appear to address new financial sectors, the systems that are developed will integrate seamlessly to share data with the currently running version of Codexa's Data Services. This paradigm affords huge savings through code reuse and makes the data itself far more reusable and extensible.
