Chapter 8. Event-Driven Agility

Enterprises are not only driven by requests for services but also by events that occur both within the enterprise and in the business environment in which the enterprise functions. In this chapter, we focus on how events drive change in the enterprise. As depicted in Figure 8.1, the enterprise must recognize relevant events, analyze the threat or opportunity to determine a resolution, and implement appropriate changes to its business operations to maintain or improve its role in the ecosystem.

Figure 8.1. Event-Driven Agility.

Not all events drive change. Many events occur in the normal course of business, such as receipt of a customer order, a payment becoming overdue, detection of a defect, failure of a machine, or the end of a shift. Disruptive events, however, are those that suggest that a change has occurred in the enterprise ecosystem that impacts the ability of the enterprise to achieve optimal value now or, particularly, in the future.

Some disruptive events, such as departure of a skilled employee, storm damage to a production facility, or a shortage of a critical production resource, have only a temporary or limited effect. These are part of the spectrum of disruptive events, but of greater concern are those disruptive events that signal the need for a permanent change in the operation of the enterprise, such as introduction of a new product by a competitor or a new government regulation.

There has been considerable industry attention and confusion related to event-driven architecture (EDA). For some, EDA means systems that process business transactions as they occur in contrast to batch processing. To others it means initiating business processes based on events. From a technology perspective, it may be viewed as a publish-and-subscribe integration model in which event notices are forwarded to applications or processes based on subscriptions rather than request-response exchanges or explicit addressing by an originator. To still others, it means sensing relevant business events and resolving associated challenges and opportunities.
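
The publish-and-subscribe view can be illustrated with a minimal in-memory sketch. The EventBus class, the topic names, and the notice contents are hypothetical and chosen only for illustration; an actual integration would rely on messaging middleware rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical in-memory bus: publishers never address subscribers directly."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A subscription registers interest in a topic, not in a sender.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, notice: dict) -> int:
        # Forward the notice to every current subscriber; return how many received it.
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(notice)
        return len(handlers)


received: List[dict] = []
bus = EventBus()
bus.subscribe("regulation.new", received.append)
delivered = bus.publish("regulation.new", {"jurisdiction": "EU", "summary": "new labeling rule"})
ignored = bus.publish("competitor.product", {"summary": "no subscriber yet"})
```

In this sketch the originator of a notice knows only the topic. The second notice reaches no one because no subscription exists for its topic, which is the contrast with explicit addressing described above.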

We avoid reference to EDA and focus on this last perspective: sensing and responding to disruptive events. Such events can be resolved without information technology support, but in today's world a timely response requires automation.

Response to disruptive events is essential to enterprise agility. Events drive enterprise actions to resolve changing business circumstances that may not otherwise be adequately recognized and resolved within mainstream operations. As the enterprise extends its business processes to respond to various exceptions, some events are anticipated in the normal operation of the business. For example, an enterprise normally anticipates that customers change their minds and unauthorized persons attempt to access protected resources. So some formerly disruptive events become routine events, resolved through well-defined processes. Disruptive events remain those that require some analysis and management planning or decision making. Management planning and decision making are the sources of transformation—adapting the enterprise to address business exceptions, challenges, and opportunities.

SOA can make an enterprise more flexible, accountable, and efficient, but it does not necessarily make it agile. The agile enterprise must be responsive to disruptive events. At the same time, an enterprise that is responsive to disruptive events is not agile if the responses are not timely and effective. SOA enables changes to be implemented more quickly and efficiently because (1) capabilities are consolidated and consistent, so changes can be more easily defined and deployed, (2) service units can change their capabilities with minimal impact on other service units, (3) automated business processes can be more quickly changed, and (4) the impact of changes on related products, services, and capabilities can be more easily understood and optimized from an enterprise perspective.

In this chapter we consider how disruptive events drive agility. In Chapter 9 we see how change is managed through governance.

Event Resolution Business Framework

Event-driven agility, in a sense, requires a second-order enterprise architecture because it requires an architecture for changing the enterprise architecture. Increasingly, management attention must turn from managing resources for performance (a control focus) to managing how the enterprise must change (an adaptation focus).

We begin with the premise that the enterprise is currently designed to operate in the current state of the world. The enterprise must adapt when there are relevant changes to the state of the world, including changes that occur within the enterprise itself—changes to the enterprise ecosystem.

A disruptive event may be a discrete change of state, such as a discovery, a new regulation, a natural disaster, a new product announcement, or a major sale. An event may also be the occurrence of a variance of a business variable outside a normal range or exceeding an expected rate of change. Such events may be based on business variables such as market price, inventory level, customer complaints, or economic indicators.
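
The second kind of event, a variance in a business variable, can be sketched as a simple check on the latest reading. The function name, the range bounds, and the max_step parameter are assumptions made for illustration; real thresholds would come from the managers responsible for the variable.

```python
from typing import List, Optional


def detect_variance(values: List[float], low: float, high: float,
                    max_step: float) -> Optional[str]:
    """Classify the latest reading of a business variable, or return None.

    low and high bound the normal range; max_step is the largest expected
    change between consecutive readings. All three are assumed inputs.
    """
    if not values:
        return None
    latest = values[-1]
    if latest < low or latest > high:
        return "out_of_range"
    if len(values) >= 2 and abs(latest - values[-2]) > max_step:
        return "rate_of_change"
    return None
```

For a hypothetical inventory level with a normal range of 80 to 130 and an expected step of at most 10, the readings [100, 102, 118] stay in range but are flagged for rate of change.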

Figure 8.2 depicts an event resolution business framework—a management framework in which the enterprise should resolve issues raised by disruptive events. The framework assumes that an affected manager becomes aware of a disruptive event. Later we consider how managers can become aware of disruptive events.

Figure 8.2. Event Resolution Framework.

A manager at any level may become aware of a relevant event. If the event can be resolved at a lower level, within the scope of specific service units, the implementation may be delegated to those service units. If the resolution requires more extensive change, the event resolution may be escalated. The planning horizon is longer and the solutions more significant when it is necessary to resolve the event at higher levels. Management controls, business processes that initiate change, and service unit manager incentives must be appropriately applied to achieve a balance between local initiative and enterprise optimization.

By clarifying the allocation of responsibility for enterprise capabilities, SOA helps frame the responsibility for resolution of disruptive events. Transformation, when required, involves changing one or more service unit capabilities or the way service units are used. In the following subsections, we consider the roles of service unit manager, line-of-business manager, and executive staff within this framework.

Service Unit Manager

For a service unit manager, a disruptive event is one that either interferes with the efficient and responsive operation of the service unit or that may put the service unit at a competitive disadvantage—that is, it might not perform as well as similar service units in competing enterprises or as well as it has in the past.

Some events have an effect specifically on the operating activities of a particular service unit. Resolutions of these events can be implemented immediately by the service unit manager unless they require substantial investment or will adversely affect service cost, quality, or timeliness. In a service-oriented architecture, the approach to implementation of service operations is internal to the service unit, as long as it does not adversely affect service users or services used. If changes to internal operations affect the cost, quality, or timeliness of the service, the service unit manager might need to negotiate with service users to justify the change. Changes that do not affect the interface and do not increase cost or degrade the level of service should not be a concern to service users.

When faced with a disruptive event, the affected service unit manager must consider solutions and determine whether there is an operational solution, one that can be implemented internally. The operational solution could be, for example, to hire a new employee, reallocate or train personnel, change internal business processes, acquire new equipment or use another service. The service unit manager's responsibility is to optimize the operation of the service unit, so in general terms resolving disruptive events is his or her responsibility.

In some cases, there may be a need to change the service interface. Figure 8.3 depicts relationships between service users and service providers. Service unit X is both a user of service unit Y and a provider to service units A, B, and C. A change to the service interface of service unit X may require changes to all the users of service unit X, here represented by service units A, B, and C. If an event impacts the competitive position of service unit X, it also affects the competitive position of service units A, B, and C, because they bear the cost of service unit X and depend on its results in the delivery of their own services. Consequently, the solution should be the result of collaboration between the service unit manager of service unit X and the managers of the user service units A, B, and C. Consideration of changes must include adverse effects on service units up the request chain as well as on affected products.

Figure 8.3. Change Propagation.

Note that enterprise governance should require that changes to service unit interfaces be approved at an enterprise level to ensure that the solution is optimal for the enterprise, particularly for future needs that might not be represented by current service users.

A service unit may require changes to a service it uses. In the diagram, service unit X may need changes to service unit Y. If so, the service unit X manager should work with the manager of service unit Y to develop the changes. However, service unit Y may have other users, such as service units M and N, that are not apparent to service unit X. Thus service unit X may need to engage both service unit Y and all the users of service unit Y to accomplish the change. The change should occur easily if all users see a net benefit; otherwise, either the users agree that it has value to the enterprise or the decision must be made at a higher level in the organization. The decision includes consideration of the cost of change, as well as any increase in operating cost, as compared to the business value of the change.

If a change to a service provider increases the cost or degrades the performance of a user, that user is in turn accountable to its users. In the diagram, suppose service unit Y makes a change to improve the quality of its product, but this causes a cost increase. This cost increase is incurred by service unit X along with the other services that use service unit Y. This affects the obligation of service unit X to its users, A, B, and C. This effect propagates up the chain of users until it becomes evident in one or more value chains or otherwise affects enterprise performance. At that level, the impact on the enterprise and the ultimate customer can be evaluated.
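
The propagation described here amounts to a walk up the chain of users. The following sketch encodes the relationships described for Figure 8.3 (X, M, and N use Y; A, B, and C use X); the dictionary layout and function name are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# provider -> direct users, following the relationships described for Figure 8.3
USERS_OF: Dict[str, List[str]] = {
    "Y": ["X", "M", "N"],
    "X": ["A", "B", "C"],
}


def affected_users(changed_unit: str, users_of: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk up the request chain from the changed service unit."""
    affected: Set[str] = set()
    queue = deque([changed_unit])
    while queue:
        provider = queue.popleft()
        for user in users_of.get(provider, []):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return affected
```

A change to service unit Y reaches X, M, and N directly and A, B, and C indirectly, whereas a change confined to service unit X reaches only A, B, and C.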

Even if a service unit manager makes a change that reduces cost over time, there may be an investment, and thus an increase in costs, incurred in the short term. It would not be desirable for all improvements to be impeded by opposition to cost increases by users. This should be addressed with an appropriate funding mechanism. For example, the cost of change might be recovered over time, so improvements that would be recovered within a certain number of years would be authorized and amortized for cost recovery. The service unit management chain has primary responsibility for making such changes. Changes that would increase the unit cost of services to users should be approved by the service user managers and/or the affected line of business managers. The enterprise must establish appropriate procedures to ensure an appropriate level of budgeting, approval, and concurrence by affected managers.

For example, a machine repair service unit may determine that an investment in a diagnostic tool would reduce the cost of repairs. If the return on investment is acceptable, the cost of the tool can be prorated, reflecting the return on investment so that there is no net cost increase to service users. This would not affect the service unit interface. On the other hand, shifting from a failure-response mode of machine repair to a preventive maintenance mode requires a different relationship with service users and thus a change to the service interface—which may put an additional burden on service users while reducing the impact of failures on the operations of the service users. This also requires a change in the cost model. This can be resolved through collaboration with service users but should still require enterprise-level approval of the interface change.
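
The funding mechanism can be made concrete with simple payback arithmetic. This is only a sketch: it ignores the time value of money, uses straight-line recovery, and all monetary figures mentioned afterward are hypothetical.

```python
def payback_years(investment: float, annual_savings: float) -> float:
    """Simple payback period; ignores the time value of money (an assumption)."""
    if annual_savings <= 0:
        return float("inf")
    return investment / annual_savings


def annual_recovery_charge(investment: float, recovery_years: int) -> float:
    """Straight-line amortization of the investment over the recovery period."""
    return investment / recovery_years


def is_authorized(investment: float, annual_savings: float, max_years: float) -> bool:
    """Authorize improvements whose cost is recovered within the allowed period."""
    return payback_years(investment, annual_savings) <= max_years
```

For a hypothetical diagnostic tool costing 60,000 that saves 25,000 per year, the payback period is 2.4 years. Under a three-year limit the investment is authorized, and a three-year recovery charge of 20,000 per year stays below the annual savings, so service users see no net cost increase.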

For a particular event, there may be no solution that can be implemented within a single service or through a collaboration among the service manager and service users. This may be because the disruptive event has long-term consequences or significantly impacts the enterprise product or service. The resolution of these disruptive events must be escalated to the line-of-business manager or managers. In some cases the disruptive event may come to the attention of a service unit manager but not have a direct bearing on his or her operation. Notices of these events should be posted for distribution to more appropriate recipients. Posting should involve a formal mechanism for reporting event details and classification of the event for distribution. If appropriate recipients have not been predefined, the event notice must be escalated up the management chain.
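
A formal posting mechanism could look roughly like the following sketch. The EventNotice fields, the category names, the distribution list, and the management chain are all hypothetical; the point is only that a classified notice goes to predefined recipients when they exist and otherwise moves one level up the management chain.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EventNotice:
    category: str    # classification used for distribution
    details: str
    posted_by: str   # service unit or manager posting the notice


# Hypothetical distribution list and management chain, for illustration only.
RECIPIENTS: Dict[str, List[str]] = {
    "supplier.disruption": ["procurement", "production-planning"],
}
REPORTS_TO: Dict[str, str] = {
    "machine-repair": "plant-operations",
    "plant-operations": "lob-manager",
}


def route(notice: EventNotice) -> List[str]:
    """Distribute to predefined recipients; otherwise escalate one level up."""
    recipients = RECIPIENTS.get(notice.category)
    if recipients:
        return recipients
    superior: Optional[str] = REPORTS_TO.get(notice.posted_by)
    return [superior] if superior else []
```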

Line-of-Business Manager

The line-of-business (LOB) manager has a broader perspective on needs for change. The LOB manager is concerned about the competitive development and delivery of the products or services he or she manages and thus can assess the implications of the disruptive event in a market context. All LOB managers should be able to view the delivery of customer value in the context of a product life cycle for their line of business.

As with the service unit manager discussed previously, the LOB manager may be able to work with one or more service units to resolve events of limited impact. These are essentially operational adjustments.

Events that indicate a change in market demand or an opportunity for competitive advantage should be primarily directed to LOB managers. They must translate a change in market demand to a change in sales forecasts, and then, using their value chain, they must determine the implications to the services used to deliver their product or service. This may have a significant impact on the workload of service providers, but it might not require any change in functionality.

Some events call for significant changes to the product or service or need to be coordinated across a number of services that are only indirectly related. For example, a new product technology may require changes in product engineering activities, production activities, field service activities, and supply chain relationships. The design and implementation of these changes requires cross-organizational coordination and control. Management of cross-enterprise change is discussed in detail in the next chapter on governance.

The cost of such changes must nevertheless be determined and considered in the decision to change. Change implementation may be owned by the LOB manager but managed and performed by transformation service units. The affected provider service unit managers have the primary responsibility for change implementation. If the change adversely affects their other users, the impact on those users is part of the cost of change and could be an increased burden on other product lines. Unless the affected product lines agree, the issue should be escalated to the executive staff level in the organization.

Value chain relationships in a service-oriented architecture make it possible to determine the full cost of change as well as the full cost of a product, including the indirect impact on related products and services. Each product is the result of contributions of value and cost from the services used to develop and deliver the product. Each service must report its true cost, including the cost incurred in using other services and the recovery of costs for improvements.
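
The roll-up of true cost through the services used can be sketched recursively. The service names and per-unit figures are hypothetical, and the sketch assumes one unit of each used service per unit delivered and no circular dependencies.

```python
from typing import Dict, List

# Hypothetical per-unit figures: a service's own cost, its improvement
# recovery charge, and the services it uses.
OWN_COST: Dict[str, float] = {"assembly": 40.0, "machining": 25.0, "repair": 5.0}
RECOVERY: Dict[str, float] = {"assembly": 0.0, "machining": 2.0, "repair": 1.0}
USES: Dict[str, List[str]] = {"assembly": ["machining"], "machining": ["repair"], "repair": []}


def true_cost(service: str) -> float:
    """Own cost plus improvement recovery plus the true cost of each service used.

    Assumes one unit of each used service per unit delivered and no cycles.
    """
    used = sum(true_cost(provider) for provider in USES.get(service, []))
    return OWN_COST[service] + RECOVERY[service] + used
```

With these figures the repair service reports 6.0, machining reports 33.0, and assembly reports 73.0, so a cost change in repair is visible at every level that depends on it.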

In some cases a disruptive event has effects that reach beyond the responsibility of the LOB manager. This may be an opportunity for a new line of business, a severe competitive disadvantage, a need for substantial realignment of business operations, a need for a merger or acquisition, or a technology change that exceeds a threshold for investment in new capabilities. These disruptive events should be escalated to the executive staff.

Executive Staff

The term executive staff refers to the enterprise's top management team and the staff that supports them in managing enterprise strategic planning, business design, and decision making.

The executive staff should be aware of any disruptive events that can cause significant and sustained change in operating costs, personnel, investment, and supply chain relationships. To ensure that operations are optimized at an enterprise level, the executive staff should be informed of transformation initiatives that require new capabilities or make significant changes to service unit capabilities or interfaces. This coordination and optimization of change is a management responsibility addressed in the next chapter.

Though the executive staff may be aware of the effects of disruptive events and the changes being undertaken, the implementation of many such changes can be delegated to the LOB managers or the individual service managers, as long as the solutions are not suboptimal from an enterprise perspective.

At the same time, the executive staff should be sensitive to patterns of disruptive events that suggest more fundamental or pervasive problems.

In Chapter 9 on governance, we examine in more detail the enterprise-level impact of disruptive events and the potential planning, decision-making, and enterprise transformation actions that may result.

Origins of Events

The enterprise must sense relevant changes in the enterprise ecosystem to drive appropriate changes in the enterprise. The enterprise cannot respond to events if it is not aware of them. Here we consider the origins of events, to provide perspective on the broad range of events that may be of interest and to stimulate thinking about how these events might be detected. Later we will consider analyzing and responding to events.

It should also be noted that event detection may still be sufficient even if it does not capture the specific events of interest but instead captures events that suggest the likelihood that an event of interest has occurred or will occur. For example, a property and casualty insurance company can infer from reports of an approaching hurricane that it will be getting damage claims and may want to suspend issue of new policies in the area until after the storm has passed. A news report of a death or serious injury attributed to a product defect could suggest the occurrence of an engineering or production problem; it could also be an indicator of an impending sales slump.

Business Environment Events

Business environment events are the most difficult events to capture because they occur outside the control of the enterprise. Today, much relevant information exists on the World Wide Web. The occurrence of events of interest may be evident directly from certain Websites or Web services, but for other events it may be necessary to refer to Websites that reflect causal or consequential events. For example, an increase in the price of crude oil will be of interest as a causal event if the enterprise is affected by the consequential increase in fuel prices. News feeds may provide information on causal events, such as a hurricane, from which consequential events, such as an increase in property damage claims or a supplier shutdown, can be anticipated.

Though many events in the business environment are the root cause of changes, many may be precipitating events that have no immediate impact until the emergence of a new market that reflects changes in attitudes, applications, or synergy with other external changes. This is particularly true of new technology. The “invention” of the World Wide Web did not transform our view of the world for several years. Nevertheless, it is important to be aware of root-cause events and consider their strategic consequences so that the enterprise can be prepared for a timely response.

Some origins of events are highlighted here:

  • Customers. Customers are of interest with respect to the business they may bring to or take away from the enterprise. A change in a customer credit rating may represent an increased or decreased ability to buy product. A change in customer satisfaction may also suggest a likelihood of increased or decreased business. Customer satisfaction may require a periodic survey or personal contact to determine whether there has been a change.
  • Supply chain. The supply chain affects the capability of the enterprise to deliver value to customers. Changes in vendor product quality, price, or timeliness are of concern. A disruption or potential disruption of business operations of a supplier could stop or impair operation of the enterprise. Similarly, if the supply chain depends on transportation or communication carriers, disruptions of these services could stop or impair enterprise operations. In addition, changes in price or availability of raw materials used by suppliers or carriers could have a significant impact on the enterprise. For example, the price of crude oil and limitations on refinery capacity have resulted in significant increases in the cost of fuel, affecting transportation costs and indirectly affecting the costs of other products and services.
  • Economy. A wealth of economic data is available on the Web. The current values of these variables may be of interest, but changes in these indicators are what matter here. The indicators include stock prices, consumer confidence, interest rates, balance of trade, unemployment, and currency values. It may not be necessary to monitor all such indicators but rather those that indicate when a significant change has occurred. Even if all such indicators were monitored, it is likely that further investigation would be required to identify the root cause and implications of a change.
  • Competitors. Actions by competitors that might gain competitive advantage are certainly events of interest. There are a wide variety of possible actions, but most reduce to new products or product improvements, pricing changes, marketing campaigns, joint ventures, or mergers and acquisitions. Most if not all of these are revealed in news releases. Though the news releases may be readily available, it could require human interpretation to determine the exact nature of the event.
  • Political. New regulations are an increasing concern for business, particularly in a world market with many political jurisdictions. Issues raised in political campaigns can influence the marketplace. Military conflicts, regime changes, and boycotts can affect markets, suppliers, and enterprise operations in affected countries.
  • Social. Fads can very quickly create new markets or shift market demand. Civil disturbances, particularly terrorist attacks or threats, can distract consumer attention and change patterns of behavior that can affect demand for products and services.
  • Nature. Natural disasters such as storms, droughts, floods, earthquakes, and volcanic eruptions can have significant effects on local markets, and they may affect the ability of the enterprise's local operations to function. The risk of spread of disease has been heightened by a shrinking world. Individuals can carry infectious diseases around the globe overnight. Concerns about a bird flu pandemic have faded, but such risks remain. A pandemic could have a major impact, not only on the marketplace, but on the ability of the enterprise to function.
  • Technical. Technical discoveries and inventions are the root cause of many changes in the way of doing business and in the products and services delivered. Patents should be noted as potential indications of initiatives by competitors that may result in competitive disadvantage to the enterprise. Scientific discoveries may take much longer to affect business operations and markets, so they probably affect strategic planning or research and advanced development activities. They may become tactical issues when they are reflected in announcements of new products, materials, methods, and tools.

Operational Events

Operational events are those that occur within the enterprise. Often these are consequential events resulting from an external event that has affected the operating capability or marketplace. Several categories of disruptive operational events are described briefly here:

  • Order volume. Significant changes in the volume or content of customer orders received should receive attention.
  • Customer delivery times. If times from receipt of an order until customer delivery change significantly, this can be cause for concern. Note that the average may remain stable while selected orders experience significant delays.
  • Service response time. An event may be triggered when the response time of individual internal services varies beyond an accepted threshold or when level-of-service commitments are violated.
  • Inventory levels. Inventory levels that are unacceptably high or low should be identified.
  • Defect rates. An increase in defect rates should be cause for concern. A decrease in defect rates may suggest an opportunity to sustain a lower rate.
  • Operating costs. Significant changes in operating costs of internal services or products should be reported.
  • Process variables. There may be other, process-specific variables that should trigger events if they vary beyond defined thresholds.
  • Profit margin. Profit margin is certainly a dependent variable. With an SOA, it should be possible to monitor profit more closely. Events might be generated when profits fall below an acceptable margin or when there is significant variance. Note that for some products or services, the configuration of a particular delivery may have a significant impact on profit for that delivery. For example, the sales mix may have a significant impact on profit for automobiles, which have a high profit margin on a fully loaded model and a minimal margin on a base model. Exceptional (high or low) profit on individual deliveries may be worthy of consideration.
  • Employee turnover. Employee turnover could put the reliable operation of the enterprise at risk. Changes in the rate of turnover should be monitored. This may require consideration of a variety of categories such as enterprise total separations, those for particular job or skill categories, and those for particular organizations or service units. The loss of a key employee should merit special attention.
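
The note under customer delivery times, that the average may remain stable while selected orders are significantly delayed, is easy to overlook when only averages are monitored. The following sketch uses invented delivery times in days to show an unchanged mean alongside one badly delayed order; the function name and the 10-day limit are assumptions.

```python
from statistics import mean
from typing import List


def delayed_orders(delivery_days: List[float], limit_days: float) -> List[float]:
    """Return the individual delivery times that exceed the limit."""
    return [d for d in delivery_days if d > limit_days]


# Hypothetical delivery times in days for ten orders in each month.
last_month = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
this_month = [3, 3, 3, 3, 3, 3, 3, 3, 3, 23]

same_average = mean(last_month) == mean(this_month)
late = delayed_orders(this_month, limit_days=10)
```

Both months average 5 days, yet this month contains one order delivered in 23 days. Monitoring individual orders against a limit catches what the average hides.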

Innovation Events

Innovation within the enterprise can create significant opportunities to improve profit or gain competitive advantage, but there is no benefit if the innovations are not given appropriate attention.

Filing a patent should be a clear indication of innovation that could benefit the enterprise. Innovations that are incorporated in products are unlikely to go unnoticed, but some product innovation opportunities do not get sufficient attention to be developed into products.

Many innovations may occur within operating activities such as new methods or tools. If the benefit can only be realized by the particular operating activity, there may be no need for further attention. However, there could be side effects or external markets for some innovations, and these events should be escalated for attention with a broader perspective.

Enterprise Change Events

Improvements to services can be made within a service unit or in collaboration with related service units. This could result in suboptimal solutions if they do not receive tactical or strategic planning attention. Consequently, the initiation of efforts to develop changes may be significant events. Action on these events may be triggered by requests to authorize funding for such efforts. Similarly, investments in new tooling, training, or software may be events that should trigger consideration at a tactical or strategic level. Many enterprise change events can be identified in automated business processes, but there may be many others that should be posted by people as they recognize the emergence of problems or opportunities in current enterprise operations. Posting should formally capture the event information along with attributes that support appropriate distribution of notices.

Identification of Events of Interest

Obviously, the enterprise is not interested in every event happening anywhere—this would be overwhelming. It is essential that we determine what notifications are needed and how they can be captured.

Some event notices are duplications produced by different observers. Some represent different events resulting from the same root cause. Some event notices are simply ignored if the frequency becomes overwhelming. Consequently, we need to analyze what events are really needed and which event notices should be shared for different purposes. For example, an increase in the price of crude oil increases costs of materials and transportation and may result in a reduction in demand for certain consumer products or services. These consequences may be relevant to a number of different enterprise activities.
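
Duplicate notices from different observers can be consolidated before distribution. In this sketch two notices are treated as the same event when their category, subject, and date match; that key and the field names are assumptions, and a real system would need a more careful notion of identity.

```python
from typing import Dict, List, Tuple


def consolidate(notices: List[dict]) -> List[dict]:
    """Merge notices that describe the same event, keeping every observer.

    Two notices are treated as duplicates when category, subject, and date
    match; that key is an assumption made for illustration.
    """
    merged: Dict[Tuple[str, str, str], dict] = {}
    for n in notices:
        key = (n["category"], n["subject"], n["date"])
        if key not in merged:
            merged[key] = {**n, "observers": [n["observer"]]}
        elif n["observer"] not in merged[key]["observers"]:
            merged[key]["observers"].append(n["observer"])
    return list(merged.values())


notices = [
    {"category": "price", "subject": "crude oil", "date": "2024-03-01", "observer": "purchasing"},
    {"category": "price", "subject": "crude oil", "date": "2024-03-01", "observer": "logistics"},
    {"category": "demand", "subject": "consumer products", "date": "2024-03-02", "observer": "sales"},
]
unique = consolidate(notices)
```

The two crude oil notices collapse into one that lists both observers, while the demand notice, a different event from the same root cause, is kept separately.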

Relevant Events

There are two complementary approaches to analysis of event notice requirements: anticipated events and broken assumptions. Each service manager, LOB manager, and business executive should consider both approaches in the context of their scope of responsibility and planning horizon. The nature of all events and their potential areas of impact on the enterprise should be captured in a repository:

  • Anticipated events. Here the focus is on specific, identifiable events that require attention. Either these events are expected to occur and require attention, or they are unlikely to occur but the consequences are such that if they occur they require immediate attention.
  • Broken assumptions. The second approach is to identify business assumptions that are the basis for current operations but could become invalid. This might include assumptions like “there will be timely delivery of inventory replenishment,” “our operations can accommodate vacations and absences,” or “our operations will not be affected by a tornado.” The next step is to identify the events that could break these assumptions.

Some assumptions are shared by many managers and some are unique. Managers should have access to each other's assumptions so they don't duplicate effort. At the same time, they can each contribute their own perspectives to a body of business assumptions.

The events repository should include specification of the entities and the associated attributes and relationships that could change if the assumption were broken.
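
One possible shape for such a repository is sketched below. The class names, fields, and sample entries are hypothetical; the sketch only shows assumptions recorded with the events that could break them and the entities that would be affected, indexed so that an incoming event can be matched to the assumptions it invalidates.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assumption:
    statement: str
    owner: str                                   # manager who recorded it
    breaking_events: List[str] = field(default_factory=list)
    affected_entities: List[str] = field(default_factory=list)


class EventRepository:
    """Hypothetical shared repository indexed by breaking event."""

    def __init__(self) -> None:
        self._by_event: Dict[str, List[Assumption]] = {}

    def add(self, assumption: Assumption) -> None:
        for event in assumption.breaking_events:
            self._by_event.setdefault(event, []).append(assumption)

    def broken_by(self, event: str) -> List[Assumption]:
        return list(self._by_event.get(event, []))


repo = EventRepository()
repo.add(Assumption(
    statement="there will be timely delivery of inventory replenishment",
    owner="production service unit manager",
    breaking_events=["supplier strike", "carrier disruption"],
    affected_entities=["inventory level", "production schedule"],
))
```

Because the repository is shared, a second manager who depends on the same assumption can look it up by event rather than recording it again.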

Risk Threshold

Though there are many events that could affect the success of the enterprise, it may still not be practical to capture and process all such events. Figure 8.4 depicts a process for assessment of the level of interest. Both the potential business impact and the cost of monitoring an event must be considered. Essentially, the cost of capture and analysis must be balanced against the risk to the business.

Figure 8.4. Level of Interest Assessment.

The determination of tolerance for risk weighs the business impact against the cost of responding quickly to the event. In most cases, the potential loss to be considered is in the additional time it takes to become aware of the event and respond if there is no active monitoring. Whether or not the event is monitored, if it occurs there will be some unavoidable consequences. If the probability of occurrence of the event is high and the consequences of delay are significant, this may justify early detection and response.
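
This balance can be approximated with a simple expected-value comparison. The model below is an assumption made for illustration and is not a rendering of Figure 8.4: it estimates the yearly loss avoided by earlier awareness and compares it with the yearly cost of monitoring. All parameter names and figures are hypothetical.

```python
def expected_delay_loss(annual_probability: float, loss_per_day_of_delay: float,
                        days_saved_by_monitoring: float) -> float:
    """Expected yearly loss avoided by becoming aware of the event earlier."""
    return annual_probability * loss_per_day_of_delay * days_saved_by_monitoring


def worth_monitoring(annual_probability: float, loss_per_day_of_delay: float,
                     days_saved_by_monitoring: float,
                     annual_monitoring_cost: float) -> bool:
    """True when the avoided loss exceeds the cost of capture and analysis."""
    avoided = expected_delay_loss(annual_probability, loss_per_day_of_delay,
                                  days_saved_by_monitoring)
    return avoided > annual_monitoring_cost
```

An event with a 50 percent yearly probability, a loss of 10,000 per day of delay, and four days saved by monitoring avoids an expected 20,000 per year, which justifies a monitoring cost of 5,000. The same event at a 1 percent probability does not.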

The event specification repository should include, for future reference, the assessment of impact and the estimated cost of monitoring. Note that it may still be appropriate to define a response to the event, even though it may be recognized through less formal means.

Sources of Event Notices

The agile enterprise needs to tap into many sources of event notices, both internal and external. Linking to event sources is an ongoing activity since sources of events change and new events will emerge as the ecosystem changes.

External Events

For external events, it may be very difficult to get event notices directly from the source. For example, competitors are not going to provide event notices that enable the enterprise to monitor their activities, or if they did, the events might not be a true representation of what is happening. So we need to look for consequential events—events that occur as a result of the root-cause event. For competitor events, we might look at product announcements or patent applications, which are much less likely to be misrepresented and more likely to be accessible.

There is a wealth of information on the Internet, but the holders of the information may not be prepared to generate event notices. Some might be willing to do so for a fee. Some data, such as stock trading prices, may be available as continuous updates, but it could be necessary to periodically poll various sources of data and watch for changes of state or significant trends.

Internal Events

Internal events are easier because the enterprise potentially has control over the sources. Some events can be generated by a business process, such as a patent application, a project approval, a budget overrun, an inventory shortage, or delayed orders. Other events require monitoring variables over time to identify trends. With internal systems, the sources are more reliable, but the mechanism for recognition of trends may be much the same as for monitoring trends on the Internet.

Some capabilities are already available in software products. Business activity monitoring (BAM) captures data from business processes for monitoring and analysis of exceptions and trends. Data warehouse systems and analytical processes are designed to support recognition of trends and correlation of events. Event notices might be automatically generated from some changes, but less obvious events require humans to realize insights and post event notices.

Business rules can be an important source of events, particularly exceptions. Violation of a business rule or the need to get extra approvals could be important for monitoring regulatory compliance as well as the need to modify business processes to resolve the exception in a more appropriate way.

Some event notices require employee initiative to identify a threat or opportunity and post an appropriate event. The enterprise must establish appropriate incentives.

Complex Event Processing

Complex event processing (CEP) is a technology for inferring events from other events and the surrounding circumstances. A CEP service is both a subscriber and publisher of events. For example, the National Association of Securities Dealers (NASD) monitors news feeds to analyze the relationship of company news events to stock trades, to identify potential insider trading and fraud.

Inference of an event relies on timely and accurate information on related events and circumstances. If the conclusion is to be based on the occurrence of two independent sources of event notices, there must be allowance for different delays in the delivery of the event notices. If the inference depends on related circumstances, there must be accurate and up-to-date information about those circumstances. The inferencing mechanism and event specifications must take these timeliness and accuracy factors into consideration. It is likely that the result of the inference can only be a probability that a particular event has occurred. It may be appropriate to publish an event notice only when the probability exceeds a particular threshold.
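As a minimal sketch of publishing only above a probability threshold, the following combines notices from independent sources that fall within an allowed timing window. The combination rule (a "noisy-OR", which assumes the sources are independent), the `Notice` fields, and the confidence values are all assumptions made for illustration; a CEP product would supply its own inference mechanism.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass(frozen=True)
class Notice:
    source: str
    event_time: float   # when the observed event happened, in seconds
    confidence: float   # assumed probability that the notice reflects the underlying event


def inferred_probability(notices: Sequence[Notice], reference_time: float,
                         max_delay: float) -> float:
    """Noisy-OR over the notices that fall inside the allowed timing window."""
    in_window = [n for n in notices
                 if abs(n.event_time - reference_time) <= max_delay]
    return 1.0 - prod(1.0 - n.confidence for n in in_window)


def should_publish(notices: Sequence[Notice], reference_time: float,
                   max_delay: float, threshold: float) -> bool:
    return inferred_probability(notices, reference_time, max_delay) >= threshold


notices = [
    Notice("news feed", 100, 0.6),
    Notice("trade monitor", 130, 0.7),
    Notice("rumor site", 900, 0.5),   # too far from the reference time to count
]
print(should_publish(notices, reference_time=100, max_delay=60, threshold=0.8))
```

Either notice alone would fall short of the 0.8 threshold; only the two corroborating notices together exceed it.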

CEP technology is still evolving. An approach is to capture and retain a sequence of events for a period of time or a number of events for each event source or stream. Thus the sequences of events can be viewed as relational tables. An SQL-like query can join entries from multiple tables to find combinations of events that would suggest the occurrence of an underlying event of interest. This allows corresponding event notices to be considered together, even though they may have been received at different times. These queries can potentially include information about related circumstances. With special tools, queries can be implemented such that they are applied continuously as event notices are received.
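The idea of treating retained event streams as relational tables can be sketched with an ordinary in-memory SQLite database. The table names, columns, and sample rows below are hypothetical, and the query runs on demand rather than continuously as a CEP tool would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news   (company TEXT, ts INTEGER, headline TEXT);
    CREATE TABLE trades (company TEXT, ts INTEGER, volume INTEGER);
""")
conn.executemany("INSERT INTO news VALUES (?, ?, ?)", [
    ("ACME", 10000, "Merger announced"),
    ("GLOBEX", 20000, "Earnings restated"),
])
conn.executemany("INSERT INTO trades VALUES (?, ?, ?)", [
    ("ACME", 2000, 60000),    # too long before the news
    ("ACME", 9000, 50000),    # large trade shortly before the news
    ("ACME", 10500, 70000),   # after the news
    ("GLOBEX", 19900, 100),   # too small to be of interest
])


def trades_before_news(conn, window_seconds, min_volume):
    """Join the two event tables to find large trades shortly before related news."""
    return conn.execute("""
        SELECT t.company, t.ts, n.ts, t.volume
        FROM trades AS t
        JOIN news AS n ON n.company = t.company
        WHERE t.ts < n.ts
          AND n.ts - t.ts <= ?
          AND t.volume >= ?
        ORDER BY t.ts
    """, (window_seconds, min_volume)).fetchall()


matches = trades_before_news(conn, window_seconds=3600, min_volume=10000)
print(matches)
```

Because both streams are retained, the join can pair a trade with a news item that arrived later, which is the point of holding the event sequences for a period of time.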

At this point it would appear that CEP is primarily applicable to specific areas of concern such as fraud detection or security threats where there is a fairly focused domain of expertise and relevant events and the value derived from the inferred events is high. It essentially performs as a real-time expert system.

Internal systems are more controlled, and it may be effective to infer underlying events more directly, possibly adjusting for differences in timing. For example, warranty claims might be correlated with production events if the event notices are aligned based on the production date of the product. In this case, the receipt of warranty claims might trigger efforts to prevent or mitigate the consequences of similar production events. This, of course, requires a long history of production events.
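A small sketch of the alignment described above: warranty claims are grouped by the production date of the claimed product and matched against production events recorded for the same date. The data layout and sample records are assumptions for illustration.

```python
from collections import Counter
from datetime import date


def claims_per_production_event(claims, production_events):
    """Align claims with production events by production date.

    claims: iterable of (claim_date, production_date) pairs
    production_events: dict mapping production_date -> event description
    Returns {production_date: (description, claim_count)} for dates with claims.
    """
    counts = Counter(production_date for _claim_date, production_date in claims)
    return {
        day: (description, counts[day])
        for day, description in production_events.items()
        if counts[day] > 0
    }


production_events = {
    date(2023, 3, 1): "new adhesive supplier",
    date(2023, 3, 8): "line 2 recalibrated",
}
claims = [
    (date(2023, 9, 14), date(2023, 3, 1)),
    (date(2023, 10, 2), date(2023, 3, 1)),
    (date(2023, 10, 5), date(2023, 2, 20)),  # no production event recorded that day
]
aligned = claims_per_production_event(claims, production_events)
print(aligned)
```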

Look-Back

Events may be captured and stored in a data warehouse; many enterprises already have such data warehouses for certain categories of events. Data mining is applied to data warehouse records to discover patterns and relationships that occur over time. The analysis is not in real time, so a data warehouse would not be considered an event publisher. Analysts could still submit discovery of certain trends or inferred events to a notification system, to be distributed to event resolution processes.

Emerging trends in CEP go beyond the inference of an event by the occurrence of a combination of related events. First, a broader spectrum of events can be captured and retained for future reference. When an event of concern occurs or is inferred, engaging in look-back at preceding events helps put the event of concern into context, to both understand the full nature of the event and discover potential causation.

By analogy, suppose a person is discovered murdered, but there is no source of information about events preceding the murder—no information about how the victim got where he is, no telephone calls, no witnesses, no information on where his acquaintances were at the time of the killing. In this situation it is very difficult to identify the perpetrator. These events are key to discovering the context of the murder and, so, the murderer.

Beyond the potential to look back, the precursor event patterns can be studied to discover patterns that might enable another unfortunate incident to be anticipated and prevented. The focus of attention can then shift to analysis of risk patterns to react earlier, to either prevent or respond more quickly to mitigate the effects of the undesirable event.

Verification and Consolidation of Event Notices

Besides inferring underlying events, there is a need to correlate events from multiple sources to either confirm the occurrence of an event or eliminate redundant reporting of an event.

Event notices from some sources may be unreliable. For example, event notices derived from news feeds may be the result of misinterpretation. There is often a need to confirm the event from another source. The speed of reporting may vary significantly, from minutes to days or longer. It may be appropriate to separately report such event notices, with an indication of the tentative nature of the notice (that is, having business metadata that reflect the quality of the event notice). The recipients of these notices would need to act accordingly.

Some events have many observers. These people may observe the event in different ways, characterize it differently, and publish event notices through different channels. It would not be desirable or appropriate to initiate an independent event resolution process for every redundant event notice. It may be necessary to leave the resolution of these redundancies to the subscribers. In some cases, the subscription criteria may limit reporting to certain event notices, reducing the number of redundant notices received. However, there is a risk that some events are not reported through all the possible channels, and when we ignore some notices, some events may be overlooked. Another approach is to notify event observers when resolution is identified (that is, a resolution event) so that they need not give further attention to the event notices they have received.
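One way a subscriber might resolve such redundancies is to treat notices that share a key and arrive within a time window as reports of the same event. The sketch below assumes that a suitable key (here an event type and subject) can be derived from each notice, which is often the hard part in practice; the class name and window are hypothetical.

```python
class NoticeConsolidator:
    """Treats notices with the same key inside a time window as one event."""

    def __init__(self, window: float) -> None:
        self.window = window
        self._first_seen = {}   # key -> time of the notice that opened the group

    def is_new_event(self, key, timestamp: float) -> bool:
        first = self._first_seen.get(key)
        if first is not None and timestamp - first <= self.window:
            return False        # redundant report of an event already in resolution
        self._first_seen[key] = timestamp
        return True             # start a resolution process for this notice


consolidator = NoticeConsolidator(window=3600)
results = [
    consolidator.is_new_event(("plant_fire", "Plant 7"), 0),      # first report
    consolidator.is_new_event(("plant_fire", "Plant 7"), 600),    # second observer
    consolidator.is_new_event(("plant_fire", "Plant 9"), 700),    # different subject
    consolidator.is_new_event(("plant_fire", "Plant 7"), 4000),   # outside the window
]
print(results)
```

Only the notices for which `is_new_event` returns `True` would initiate a resolution process; the others could be attached to the process already under way.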

CEP systems may provide mechanisms for resolving these issues; however, the resolution may differ depending on the action to be initiated. The subscribers to such events should provide appropriate criteria for analysis and filtering of events.

Event Notification Infrastructure

The event notification infrastructure provides automated support for recognition, filtering, publication, and distribution of event notices. The Enterprise Intelligence service unit (see Chapter 9) is responsible for identifying the business requirements for capture of events and initiation of specific business processes, whereas the Information Technology organization is responsible for the technical infrastructure for event processing.

Surrogate Publishers

As noted earlier, many sources of events do not publish event notices. Instead, surrogate publishers are needed to determine when events have occurred and to publish notices. Three kinds of surrogates are described here:

  • Polling. Some sources make their current state available but do not report changes. It is necessary to periodically poll these sources to monitor their state.
  • Data stream analysis. This form of surrogate publisher analyzes a data stream. Threshold values or patterns may be reported in different notices so that subscribers can selectively monitor different events in the same stream.
  • News analysis. News feeds can be analyzed to identify relevant events. The NASD news feed analysis, discussed earlier, identifies securities trades and related news events.

For each of these mechanisms, the nature of the change must be considered:

  • If an event is a simple state change, the publisher must keep track of the current state and send a notice whenever the state changes.
  • If an event is defined as a variance outside a specified limit, the publisher must compare the state variable to the threshold and send a notice when the threshold is exceeded. The publisher must still retain the new value of the variable so that it does not continue to send event notices because the threshold remains exceeded.
  • If an event is defined as crossing a rate-of-change threshold, the publisher must retain the state over a period of time and determine whether the threshold rate of change has been exceeded. Here, as in the case of variance outside a specified limit, event notices should be repeated only if a specified period of time has passed since the last notice was published.
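The three cases above can be sketched as small detectors that a surrogate publisher might run against polled values. The class names, the `observe` interface, and the use of plain numeric timestamps are assumptions for illustration; each `observe` call returns `True` when a notice should be published.

```python
from collections import deque


class StateChangeDetector:
    """Simple state change: remember the current state, report any change."""

    def __init__(self):
        self._known = False
        self._state = None

    def observe(self, value) -> bool:
        changed = self._known and value != self._state
        self._known, self._state = True, value
        return changed


class ThresholdDetector:
    """Variance outside a limit: report the crossing, then repeat only
    after repeat_after time units while the limit remains exceeded."""

    def __init__(self, limit, repeat_after):
        self.limit, self.repeat_after = limit, repeat_after
        self._exceeded = False
        self._last_notice = None

    def observe(self, t, value) -> bool:
        exceeded = value > self.limit
        crossed = exceeded and not self._exceeded
        due = (exceeded and self._last_notice is not None
               and t - self._last_notice >= self.repeat_after)
        self._exceeded = exceeded
        if crossed or due:
            self._last_notice = t
            return True
        return False


class RateOfChangeDetector:
    """Rate-of-change threshold: retain samples over a period and compare
    the average rate across that period with max_rate."""

    def __init__(self, max_rate, period, repeat_after):
        self.max_rate, self.period, self.repeat_after = max_rate, period, repeat_after
        self._samples = deque()
        self._last_notice = None

    def observe(self, t, value) -> bool:
        self._samples.append((t, value))
        while self._samples[0][0] < t - self.period:
            self._samples.popleft()            # drop samples older than the period
        t0, v0 = self._samples[0]
        if t == t0 or abs((value - v0) / (t - t0)) <= self.max_rate:
            return False
        if self._last_notice is not None and t - self._last_notice < self.repeat_after:
            return False                       # too soon to repeat the notice
        self._last_notice = t
        return True


state = StateChangeDetector()
state_results = [state.observe(v) for v in ["open", "open", "closed"]]

level = ThresholdDetector(limit=100, repeat_after=10)
level_results = [level.observe(t, v) for t, v in
                 [(0, 90), (1, 120), (2, 130), (12, 125), (13, 80), (14, 110)]]

rate = RateOfChangeDetector(max_rate=1.0, period=10, repeat_after=5)
rate_results = [rate.observe(t, v) for t, v in
                [(0, 0), (1, 0.5), (2, 5), (3, 9), (8, 30)]]
```

In this sketch the threshold detector retains whether the limit was exceeded rather than the value itself, which is sufficient to suppress repeated notices while the variable stays out of range.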

The publication of event notices is further constrained by subscription specifications.

Publish-and-Subscribe Facility

The core of an event notification infrastructure is a publish-and-subscribe facility, which filters and delivers event notices. Typically, as depicted in Figure 8.5, publishers send event notices to an event broker, which forwards the notices to subscribers who have expressed interest in certain types of events. Events may be associated with topics and have attributes that describe the nature and context of the events. Subscribers may specify constraints on the event notices they want to receive. This may take the form of a topic designation and rules that filter events based on event notice attributes.

Figure 8.5. Brokered Notification.

Note that a subscriber may subscribe to multiple categories of events, a publisher may publish events of interest on multiple topics, and each event notice may be forwarded to multiple subscribers.

Publishers may publish event notices even though there are no subscribers, and subscriber constraints may filter out many of the event notices. There is a possibility that many event notices are generated for which there is no interest, resulting in unnecessary network activity.
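A brokered facility of this kind can be sketched in a few lines. This is an in-process illustration only, not the API of any messaging product; the class `EventBroker`, the dictionary-based notices, and the topic names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict

Notice = Dict[str, Any]


class EventBroker:
    """Forwards notices by topic, applying each subscriber's constraint."""

    def __init__(self) -> None:
        self._subscriptions = defaultdict(list)   # topic -> [(constraint, handler)]
        self.undelivered = 0                       # notices nobody wanted

    def subscribe(self, topic: str, handler: Callable[[Notice], None],
                  constraint: Callable[[Notice], bool] = lambda notice: True) -> None:
        self._subscriptions[topic].append((constraint, handler))

    def publish(self, topic: str, notice: Notice) -> int:
        delivered = 0
        for constraint, handler in self._subscriptions.get(topic, []):
            if constraint(notice):                 # filter on notice attributes
                handler(notice)
                delivered += 1
        if delivered == 0:
            self.undelivered += 1
        return delivered


broker = EventBroker()
subscriber_l, subscriber_n = [], []
broker.subscribe("inventory", subscriber_l.append,
                 constraint=lambda notice: notice["shortfall"] > 100)
broker.subscribe("inventory", subscriber_n.append)

broker.publish("inventory", {"item": "bolt", "shortfall": 500})   # both subscribers
broker.publish("inventory", {"item": "nut", "shortfall": 10})     # only subscriber N
broker.publish("pricing", {"item": "bolt", "change": -0.05})      # no subscribers
```

The `undelivered` counter makes visible the cost noted above: the publisher still generated and sent the pricing notice even though no subscriber wanted it.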

There are a number of products that implement this event broker capability. Some of them implement the Java Message Service (JMS) specification from the Java Community Process (JCP), which includes the publish-and-subscribe capability.

More recently, the Organization for the Advancement of Structured Information Standards (OASIS) has adopted the WS-Notification family of specifications. Under WS-Notification, a subscriber can request notices directly from a publisher. WS-Notification does not preclude the use of a broker.

Figure 8.6 depicts a nonbrokered notification topology. An event directory identifies sources of events. Publishers need to register with the directory, and a subscriber then uses the directory to identify sources of events of interest and subscribes directly to those sources. A subscriber can define restrictions on the events of interest through specification of a constraint that operates on the attributes of the event notice.

Figure 8.6. Networked Notification.

This removes the event notification broker as a potential bottleneck. Before the general availability of Internet technology, a broker was necessary to eliminate a multitude of point-to-point connections; now with point-to-point connectivity and standard exchange protocols, a broker no longer simplifies the network. Note that Publisher C sends notices directly to Subscribers L and N, whereas in the brokered notification, Publisher C would send a single notice and the broker would forward notices to Subscribers L and N. Subscription requests should be recoverable so that if the publisher fails or is shut down, notices will resume when the publisher returns to operation.

WS-Notification can enable publishers to avoid generating unwanted event notices—no subscribers, no notices. The burden is that each publisher must be able to turn the notification mechanism on or off.
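The nonbrokered arrangement, including the "no subscribers, no notices" behavior, can be sketched as follows. This is plain Python for illustration and does not reproduce the WS-Notification message exchanges; `EventDirectory`, `DirectPublisher`, and their methods are hypothetical names.

```python
class EventDirectory:
    """Lets subscribers find publishers; it never handles notices itself."""

    def __init__(self):
        self._sources = {}

    def register(self, topic, publisher):
        self._sources.setdefault(topic, []).append(publisher)

    def lookup(self, topic):
        return list(self._sources.get(topic, []))


class DirectPublisher:
    """Sends notices straight to its own subscribers."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []      # (constraint, handler) pairs

    @property
    def active(self):
        # Notification is switched on only while someone is subscribed.
        return bool(self._subscribers)

    def subscribe(self, handler, constraint=lambda notice: True):
        self._subscribers.append((constraint, handler))

    def emit(self, notice):
        if not self.active:
            return 0                # no subscribers, no notices
        sent = 0
        for constraint, handler in self._subscribers:
            if constraint(notice):
                handler(notice)
                sent += 1
        return sent


directory = EventDirectory()
publisher_c = DirectPublisher("Publisher C")
directory.register("stock_price", publisher_c)

before = publisher_c.emit({"symbol": "ACME", "price": 12.5})   # nobody listening yet

subscriber_l, subscriber_n = [], []
for source in directory.lookup("stock_price"):
    source.subscribe(subscriber_l.append)
    source.subscribe(subscriber_n.append, constraint=lambda n: n["price"] > 20)

after = publisher_c.emit({"symbol": "ACME", "price": 25.0})    # sent to both directly
```

Here the publisher, not a broker, holds the subscription list, so each publisher carries the burden of tracking its subscribers and applying their constraints.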

The absence of a notification broker makes management of event notification totally decentralized. It may be preferable to at least use a notification broker between internal subscribers and external providers to monitor the activity and contractual compliance for purchased services. A broker also provides a central point of control for directing notices to subscribers so that publishers can be replaced when necessary without searching out all subscribers. As an alternative, the event directory could function as a broker of offers and requests, whereas the publishers each send event notices directly to subscribers.

Event Resolution Processes

A service-oriented architecture enables analysis of events to be decentralized, leveraging the local knowledge of service unit personnel, knowledge specific to lines of business, or executive-level knowledge of the enterprise. Every manager receives events that could affect the service unit, line of business, or executive staff activity he or she manages. Each service unit should have defined processes for responding to disruptive events.

Event resolution specifications should be reviewed from an enterprise perspective to ensure that they are directed to appropriate service unit(s). This analysis of events is different from an analysis of services or application requirements. The issue is, “when this event happens, what should be the result of the enterprise response?” The result should be appropriate for the enterprise, whereas if the event is only routed to a service unit that is directly affected, the result may be suboptimal for the enterprise. This analysis must be the basis for routing events for resolution.

A straightforward event resolution process receives an event and initiates activities to address the concerns. From a process-modeling perspective, an event notice can be viewed as a “request” for a process. We can take two alternative views. First, we define a continuously running process that is waiting to receive event notices. When an event occurs, the process determines what action to take and may invoke other processes to take that action. Alternatively, we can define a particular event as the start of a process. Different events may start the same process, or specialized processes can be initiated for different categories of events.
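Both views can be sketched with a small dispatcher: a loop that receives event notices, and a registry that defines which event types start which processes. The names here are hypothetical, the "processes" are ordinary functions, and the loop drains a queue and returns rather than waiting indefinitely so that the example terminates.

```python
from queue import Empty, Queue
from typing import Callable, Dict, List

Notice = Dict[str, str]
Process = Callable[[Notice], str]


class ResolutionDispatcher:
    def __init__(self) -> None:
        self._starters: Dict[str, List[Process]] = {}
        self.unassigned: List[Notice] = []     # notices with no defined process

    def starts(self, event_type: str, process: Process) -> None:
        """Declare that an event of this type starts the given process."""
        self._starters.setdefault(event_type, []).append(process)

    def run(self, inbox: Queue) -> List[str]:
        """Receive notices until the inbox is empty and start processes."""
        started = []
        while True:
            try:
                notice = inbox.get_nowait()
            except Empty:
                return started
            processes = self._starters.get(notice["type"], [])
            if not processes:
                self.unassigned.append(notice)  # keep it visible for follow-up
            for process in processes:
                started.append(process(notice))


def review_pricing(notice: Notice) -> str:
    return f"pricing review for {notice['subject']}"


dispatcher = ResolutionDispatcher()
# Different events may start the same process.
dispatcher.starts("competitor_product", review_pricing)
dispatcher.starts("competitor_price_cut", review_pricing)

inbox: Queue = Queue()
inbox.put({"type": "competitor_product", "subject": "Model X"})
inbox.put({"type": "competitor_price_cut", "subject": "Model Y"})
inbox.put({"type": "plant_flood", "subject": "Plant 7"})

started = dispatcher.run(inbox)
```

Recording notices that match no process in `unassigned` is one simple way to keep an unanticipated event from being silently dropped.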

The processes that are initiated could be manual or automated. In general, we would expect that automated processes should be initiated even if the analysis, planning, decision-making, and response activities are manual. An automated process can at least provide assurance that the event notice does not fall through a crack.

These processes should be designed to deal with the various ambiguities, redundancies, and credibility of the event notices they receive. Generally speaking, the recipient of an event notice should perform some form of correlation and filtering of events to avoid unjustified or duplicated resolution activities.

It may be appropriate, given a certain magnitude of consequences and business risk, to initiate a resolution process immediately, even with an event notice of questionable credibility. However, such a process should be designed to take into consideration later event notices or the absence of supporting event notices at certain points in the subsequent activities. These considerations cannot be programmed into a CEP but must be part of the business logic of the resolution process.

Though each service unit and management chain has responsibility for resolving events that are relevant to its capabilities, there must be enterprise coordination to ensure that relevant events are recognized and responsibility for every event notice is defined. This is an appropriate responsibility for the Enterprise Intelligence service unit.

In the next chapter we will examine the role of Enterprise Intelligence in the broader context of enterprise governance, and we will see how responses to disruptive events fit into the overall governance of the enterprise, to achieve agility and promote competitive advantage.
