Chapter 1. Introduction

The CERT Resilience Management Model (CERT-RMM) is the result of many years of research and development committed to helping organizations meet the challenge of managing operational risk and resilience in a complex world. It embodies the process management premise that “the quality of a system or product is highly influenced by the quality of the process used to develop and maintain it” by defining quality as the extent to which an organization controls its ability to operate in a mission-driven, complex risk environment [CMMI Product Team 2006].

CERT-RMM brings several innovative and advantageous concepts to the management of operational resilience:

• First, it seeks to holistically improve risk and resilience management through purposeful and practical convergence of the disciplines of security management, business continuity management, and aspects of IT operations management (the convergence advantage).

• Second, it elevates these disciplines to a process approach, which enables the application of process improvement innovations and provides a useful basis for metrics and measurement. It also provides a practical organizing and integrating framework for the vast array of practices in place in most organizations (the process advantage).

• Finally, it provides a foundation for process institutionalization and organizational process maturity—concepts that are important for sustaining any process but are absolutely critical for processes that operate in complex environments, typically during times of stress (the maturity advantage).

CERT-RMM v1.1 comprises 26 process areas that cover four areas of operational resilience management: Enterprise Management, Engineering, Operations, and Process Management. The practices contained in these process areas are codified from a management perspective; that is, the practices focus on the activities that an organization performs to actively direct, control, and manage operational resilience in an environment of uncertainty, complexity, and risk. For example, the model does not prescribe specifically how an organization should secure information; instead, it focuses on the equally important processes of identifying high-value information assets, making decisions about the levels needed to protect and sustain these assets, implementing strategies to achieve these levels, and maintaining these levels throughout the life cycle of the assets during stable times and, more important, during times of stress. In essence, the managerial focus supports the specific actions taken to secure information by making them more effective and more efficient.

1.1 The Influence of Process Improvement and Capability Maturity Models

Throughout its history, the Software Engineering Institute (SEI) has directed its research efforts toward helping organizations to develop and maintain quality products and services, primarily in the software and systems engineering and acquisition processes. Proven success in these disciplines has expanded opportunities to extend process improvement knowledge to other areas such as the quality of service delivery (as codified in the CMMI for Services model) and to cyber security and resilience management (CERT-RMM).

The SEI’s research in product and service quality reinforces three critical dimensions on which organizations typically focus: people, procedures and methods, and tools and equipment [CMMI Product Team 2006]. However, processes link these dimensions together and provide a conduit for achieving the organization’s mission and goals across all organizational levels. Figure 1.1 illustrates these three critical dimensions.

Figure 1.1. The Three Critical Dimensions

image

Traditionally, the disciplines concerned with managing operational risk have taken a technology-centric view of improvement. That is, of the three critical dimensions, organizations often look to technology—in the form of software-based tools and hardware—to fix security problems, to enable continuity, or even to improve IT operations and service delivery. Technology can be very effective in managing risk, but technology cannot always substitute for skilled people and resources, procedures and methods that define and connect tasks and activities, and processes to provide structure and stability toward the achievement of common objectives and goals. In our experience, organizations often ask for the one or two technological advances that will keep their data secure or improve the way they handle incidents, while failing to recognize that the lack of defined processes and process management diminishes their overall capability for managing operational resilience. Most organizations are already technology-savvy when it comes to security and continuity, but the way they manage these disciplines is immature. In fact, incidents such as security breaches often can be traced back to poorly designed and managed processes at the enterprise and operational levels, not technology failures. Consider the following: Your organization probably has numerous firewall devices deployed across its networks. But what kinds of traffic are these firewalls filtering? What rulesets are being used? Do these rulesets reflect management’s resilience objectives and the needs for protecting and sustaining the assets with firewalls? Who sets and manages the rulesets? Under whose direction? All of these questions typify the need to augment technology with process so that the technology supports and enforces strategic objectives.
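The firewall questions above can be made concrete by recording, for each ruleset, who manages it, who directed it, and which resilience objective it serves. The following is a minimal sketch and not part of CERT-RMM; the `FirewallRuleset` record, its field names, and the sample device names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FirewallRuleset:
    """Hypothetical record tying a deployed ruleset to its process context."""
    device: str
    owner: Optional[str] = None                 # who sets and manages the ruleset
    approved_by: Optional[str] = None           # under whose direction
    resilience_objective: Optional[str] = None  # management objective it supports


def governance_gaps(ruleset: FirewallRuleset) -> List[str]:
    """Return the process questions that remain unanswered for a ruleset."""
    fields = ("owner", "approved_by", "resilience_objective")
    return [name for name in fields if not getattr(ruleset, name)]


# Illustrative data only: one ruleset with full context, one without.
rulesets = [
    FirewallRuleset("fw-edge-01", owner="network-ops", approved_by="ciso",
                    resilience_objective="protect customer records"),
    FirewallRuleset("fw-lab-02", owner="network-ops"),
]

gaps_by_device = {r.device: governance_gaps(r) for r in rulesets}
```

A device with an empty gap list has technology that is connected to an owner, a direction, and an objective; a device with gaps is deployed technology that no defined process yet accounts for.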

In addition to being technology-focused, many organizations are practice-focused. They look for a representative set of practices to solve their unique operational resilience management challenges and end up with a complex array of practices sourced from many different bodies of knowledge. The effectiveness of these practices is measured by whether they are used or “sanctioned” by an industry or satisfy a compliance requirement instead of by how effective they are in helping the organization reduce exposure or improve predictability in managing impact. The practices are not the problem; organizations go wrong in assuming that practices alone will bring about a sustainable capability for managing resilience in a complex environment.

Further damage is done by practice-based assessments or evaluations. Simply verifying the existence of a practice sourced from a body of knowledge does not provide for an adequate characterization of the organization’s ability to sustain that practice over the long term, particularly when the risk environment changes or when disruption occurs. This can be done only by examining the degree to which the organization embeds the practice in its culture, is able and committed to performing the practice, can control the practice and ensure that the practice is effective through measurement and analysis, and can prove the practice is performed according to established procedures and processes. In short, practices are made better by the degree to which they have been institutionalized through processes.

1.2 The Evolution of CERT-RMM

The CERT Resilience Management Model is the result of an evolutionary development path that incorporates concepts from other CERT tools, techniques, methods, and activities.

In 1999, CERT officially released the Operationally Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE) method for information security risk management. OCTAVE provided a new way to look at information security risk from an operational perspective and asserted that business people are in the best position to identify and analyze security risk. This effectively repositioned IT’s role in security risk assessment and placed the responsibility closer to the operations activity in the organization [Alberts 1999].

In October 2003, a group of 20 IT and security professionals from financial, IT, and security services, defense organizations, and the SEI met at the SEI to begin to build an executive-level community of practice for IT operations and security. The desired outcome for this Best in Class Security and Operations Roundtable (BIC-SORT) was to better capture and articulate the relevant bodies of knowledge that enable and accelerate IT operational and security process improvement. The bodies of knowledge identified included IT and information security governance, audit, risk management, IT operations, security, project management, and process management (including benchmarking), as depicted in Figure 1.2.

Figure 1.2. Bodies of Knowledge Related to Security Process Improvement

image

In Figure 1.2, the upper four capabilities (white text) include processes that provide oversight and top-level management. Governance and audit serve as enablers and accelerators. Risk management informs decisions and choices. Strategy serves as the explicit link to business drivers to ensure that value is being delivered. The lower four capabilities (black text) include processes that provide detailed management and execution in accordance with the policies, procedures, and guidelines established by higher-level management. We observed that these capabilities were all connected in high-performing IT operations and security organizations.

Workshop topics and results included defining what it means to be best in class, areas of pain and promise (potential solutions), how to use improvement frameworks and models in this domain, the applicability of Six Sigma, and emerging frameworks for enterprise security management (precursors of CERT-RMM) [Allen 2004].

In December 2004, CERT released a technical note entitled Managing for Enterprise Security that described security as a process reliant on many organizational capabilities. In essence, the security challenge was characterized as a business problem owned by everyone in the organization, not just IT [Caralli 2004]. This technical note also introduced operational resilience as the objective of security activities and began to describe the convergence between security management, business continuity management, and IT operations management as essential for managing operational risk.

In March 2005, CERT hosted a meeting with representatives of the Financial Services Technology Consortium (FSTC). At the time of this meeting, FSTC’s Business Continuity Standing Committee was actively organizing a project to explore the development of a reference model to measure and manage operational resilience capability. Although our approaches to operational resilience had different starting points (security versus business continuity), our efforts were clearly focused on solving the same problem: How can an organization predictably and systematically control operational resilience through activities such as security and business continuity?

In April 2006, CERT introduced the concept of a process improvement model for operational resilience in the technical report Sustaining Operational Resiliency: A Process Improvement Approach to Security Management [Caralli 2006]. This technical report defined fundamental resilience and process improvement concepts and detailed candidate focus areas (called “capability areas”) that could be included in an eventual model. This document was the foundation for developing the first instantiation of the model.

In May 2007, as a result of work with FSTC, CERT published an initial framework for managing operational resilience in the technical report Introducing the CERT Resiliency Engineering Framework: Improving the Security and Sustainability Processes [Caralli 2007]. In this document, the initial outline for a process improvement model for managing operational resilience was published.

In March 2008, a preview version of a process improvement model for managing operational resilience was released by CERT under the title CERT Resiliency Engineering Framework, v0.95R [REF Team 2008a]. This model included an articulation of 21 “capability areas” that described high-level processes and practices for managing operational resilience and, more significantly, provided an initial set of elaborated generic goals and practices that defined capability levels for each capability area.

In early 2009, the name of the model was changed to the CERT Resilience Management Model to reflect the managerial nature of the processes and to properly position the “engineering” aspects of the model. Common CMMI-related taxonomy was applied (including the use of the term process areas), and generic goals and practices were expanded with more specific elaborations in each process area. CERT began releasing CERT-RMM process areas individually in 2009, leading up to the “official” release of v1.0 of the model in a technical report published in 2010. The model continues to be available by process area at www.cert.org/resilience.

The publication of this book marks the official release of CERT-RMM v1.1. Version 1.1 includes minor changes to process areas resulting from field use and piloting of the model. In addition, version 1.1 introduces the concept of the operational resilience management system, which broadly defines the organization’s collective capability and mechanism for managing operational resilience. More about the operational resilience management system can be found in Section 2.2.

CERT-RMM

CERT-RMM draws upon and is influenced by many bodies of knowledge and models. Figure 1.3 illustrates these relationships. (See Tables 1.1 and 1.2 for details about the connections between CERT-RMM and CMMI models.)

Figure 1.3. CERT-RMM Influences

image

Table 1.1. Process Areas in CERT-RMM and CMMI Models

image

Table 1.2. Other Connections Between CERT-RMM and the CMMI Models

image

At the descriptive level of the model, the process areas in CERT-RMM have been either developed specifically for the model or sourced from existing CMMI models and modified to be used in the context of operational resilience management. CERT-RMM also draws upon concepts and codes of practice from other security, business continuity, and IT operations models, particularly at the typical work products and subpractices level. This allows users of these codes of practice to incorporate model-based process improvement without significantly altering their installed base of practices. The CERT Resiliency Engineering Framework: Code of Practice Crosswalk, Preview Version, v0.95R [REF Team 2008b] details the relationships between common codes of practice and the specific practices in the CERT-RMM process areas. The Crosswalk is periodically updated to incorporate new and updated codes of practice as necessary. The Crosswalk can be found at www.cert.org/resilience.

Familiarity with common codes of practice or CMMI models is not required to comprehend or use CERT-RMM. However, familiarity with these practices and models will aid in understanding and adoption.

As a descriptive model, CERT-RMM focuses at the process description level but doesn’t necessarily address how an organization would achieve the intent and purpose of the description through deployed practices. However, the subpractices contained in each CERT-RMM process area describe actions that an organization might take to implement a process, and these subpractices can be directly linked to one or more tactical practices used by the organization. Thus, the range of material in each CERT-RMM process area spans from highly descriptive processes to more prescriptive subpractices.

In terms of scope, CERT-RMM covers the activities required to establish, deliver, and manage operational resilience activities in order to ensure the resilience of services. A resilient service is one that can meet its mission whenever necessary, even under degraded circumstances. Services are broadly defined in CERT-RMM. At a simple level, a service is a helpful activity that brings about some intended result. People and technology can perform services; for example, people can deliver mail, and so can an email application. A service can also produce a tangible product.

From an organizational perspective, services can provide internal benefits (such as paying employees) or have an external focus (such as delivering newspapers). Any service in the organization that is of value to meeting the organization’s mission should be made resilient.

Services rely on assets to achieve their missions. In CERT-RMM, assets are limited to people, information, technology, and facilities. A service that produces a product may also rely on raw materials, but these assets are outside of the immediate scope of CERT-RMM. However, the use of CERT-RMM in a production environment is not precluded, since people, information, technology, and facilities are a critical part of delivering a product, and their operational resilience can be managed through the practices in CERT-RMM.
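The relationship between a service and its assets can be sketched as a small data model. This is an illustrative sketch under stated assumptions, not a structure defined by CERT-RMM: the class names, the `kind` strings, and the newspaper example's dependencies are hypothetical. Only the four asset types themselves come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AssetType(Enum):
    """The four asset types that the text places in scope."""
    PEOPLE = "people"
    INFORMATION = "information"
    TECHNOLOGY = "technology"
    FACILITIES = "facilities"


@dataclass
class Dependency:
    """Something a service relies on (hypothetical structure)."""
    name: str
    kind: str  # an AssetType value, or something else such as "raw_material"


@dataclass
class Service:
    name: str
    dependencies: List[Dependency] = field(default_factory=list)


IN_SCOPE_KINDS = {asset_type.value for asset_type in AssetType}


def in_scope_assets(service: Service) -> List[Dependency]:
    """Keep only the dependencies that fall within the four asset types."""
    return [d for d in service.dependencies if d.kind in IN_SCOPE_KINDS]


# Illustrative service: delivering newspapers relies on a raw material too.
delivery = Service("newspaper delivery", [
    Dependency("delivery drivers", "people"),
    Dependency("subscriber list", "information"),
    Dependency("routing system", "technology"),
    Dependency("distribution depot", "facilities"),
    Dependency("newsprint", "raw_material"),
])

managed = [d.name for d in in_scope_assets(delivery)]
```

Here `managed` contains the four dependencies whose resilience could be addressed through the model, while the raw material is filtered out.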

CERT-RMM does not cover the activities required to establish, deliver, and manage services. In other words, CERT-RMM does not address the development of a service from requirements or the establishment of a service management system. These activities are covered in the CMMI for Services model (CMMI-SVC) [CMMI Product Team 2009]. However, to the extent that the “management” of the service requires a strong resilience consideration, CERT-RMM can be used with CMMI-SVC to extend the definition of high-quality service delivery to include resilience as an attribute of quality.

CERT-RMM contains practices that cover enterprise management, resilience engineering, operations management, process management, and other supporting processes for ensuring active management of operational resilience. The “enterprise” orientation of CERT-RMM does not mean that it is an enterprise-focused model or that it must be adopted at an enterprise level; on the contrary, CERT-RMM is focused on the operations level of the organization, where services are typically executed. Enterprise aspects of CERT-RMM describe how horizontal functions of the organization, such as managing people, training, financial resource management, and risk management, affect operations. For example, if an organization is generally poor at risk management, the effects typically manifest at an operational level in poor risk identification, prioritization, and mitigation, misalignment with risk appetite and tolerances, and diminished service resilience.

CERT-RMM was developed to be scalable across organizations in various industries, regardless of their size. Every organization has an operational component and executes services that require a degree of operational resilience commensurate with achieving the mission. Although CERT-RMM was constructed in the financial services industry, it is already being piloted and used in other industrial sectors and government organizations, both large and small.

Finally, understanding the process improvement focus of CERT-RMM can be tricky. An example from software engineering is a useful place to start. In the CMMI for Development model (CMMI-DEV), the focus of improvement is software engineering activities performed by a “project” [CMMI Product Team 2006]. In CERT-RMM, the focus of improvement is operational resilience management activities to achieve service resilience as performed by an “organizational unit.” This concept can become quite recursive (but no less effective) if the “organizational unit” happens to be a unit of the organization that has primary responsibility for operational resilience management “services,” such as the information security department or a business continuity team. In this context, the operational resilience management activities are also the services of the organizational unit.

1.3 CERT-RMM and CMMI Models

CMMI v1.2 includes three integrated models: CMMI for Development, CMMI for Acquisition, and the newly released CMMI for Services. The CMMI Framework provides a common structure for CMMI models, training, and appraisal components. CMMI for Development and CMMI for Acquisition are early life-cycle models in that they address software and system processes through the implementation phase but do not specifically address these assets in operation. The CMMI for Services model addresses not only the development of services and a service management system but also the operational aspects of service delivery.

CERT-RMM is primarily an operations-focused model, but it reaches back into the development phase of the life cycle for assets such as software and systems to ensure consideration of early life-cycle quality requirements for protecting and sustaining these assets once they become operational. Like CMMI for Services, CERT-RMM also explicitly addresses developmental aspects of services and assets by promoting a requirements-driven, engineering-based approach to developing and implementing resilience strategies that become part of the “DNA” of these assets in an operational environment.

Because of the broad nature of CERT-RMM, emphasis on using CMMI model structural elements was prioritized over explicit consideration of integration with existing CMMI models. That is, while CERT-RMM could be seen as defining an “operations” constellation in CMMI, this was not an early objective of CERT-RMM research and development. Instead, the architects and developers of CERT-RMM focused on the core processes for managing operational resilience, integrating CMMI model elements to the extent possible. Thus, because the model structures are similar, CMMI users will be able to easily navigate CERT-RMM.

Table 1.1 provides a summary of the process area connections between CERT-RMM and the CMMI models. Table 1.2 summarizes other CMMI model and CERT-RMM similarities. Future versions of CERT-RMM will attempt to smooth out significant differences in the models and incorporate more CMMI elements where necessary.

1.4 Why CERT-RMM Is Not a Capability Maturity Model

The development of maturity models in the security, continuity, IT operations, and resilience space is increasing dramatically. This is not surprising, since models like CMMI have proven their ability to transform the way that organizations and industries work. Unfortunately, not all maturity models contain the rigor of models like CMMI, nor do they accurately deploy many of the maturity model constructs used successfully by CMMI. It is important to have some basic knowledge about the construction of maturity models in order to understand what differentiates CERT-RMM and why the differences ultimately matter.

In its simplest form, a maturity model is an organized way to convey a path of experience, wisdom, perfection, or acculturation. The subject of a maturity model can be objects or things, ways of doing something, characteristics of something, practices, or processes. For example, a simple maturity model could define a path of successively improved tools for doing math: using fingers, using an abacus, using an adding machine, using a slide rule, using a computer, or using a hand-held calculator. Thus, a hand-held calculator may be viewed as a more mature tool than a slide rule.

A capability maturity model (in the likeness of CMMI) is a much more complex instrument, with several distinguishing features. One of these features is that the maturity dimension in the model is a characterization of the maturity of processes. Thus, what is conveyed in a capability maturity model is the degree to which processes are institutionalized and the degree to which the organization demonstrates process maturity.

As you will learn in Chapter 5, these concepts correlate to the description of the “levels” in CMMI. For example, at the “defined” level, the characteristics of a defined process (governed, staffed with trained personnel, measured, etc.) are applied to a software or systems engineering process. The same holds at the “managed” level, where the characteristics of a managed process are applied to software or systems engineering processes. Unfortunately, many so-called maturity models that claim to be based on CMMI attempt to use CMMI maturity level descriptions yet do not have a process orientation.

Another feature of CMMI—as implied by its name—is that there are really two maturity dimensions in the model. The capability dimension describes the degree to which a process has been institutionalized. Institutionalized processes are more likely to be retained during times of stress. They apply to an individual process area, such as incident management and control. On the other hand, the maturity dimension is described in maturity levels, which define levels of organizational maturity that are achieved through raising the capability of a set of process areas in a manner prescribed by the model.
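The difference between the two dimensions can be sketched in a few lines. This is a simplified illustration, not the actual CMMI staging rules: the `staging` table, its level numbers, and the process area names other than incident management and control are assumptions made for the example. Capability is recorded per process area, and a maturity level is reached only when every process area the (hypothetical) staging prescribes for that level and all lower levels meets its required capability.

```python
from typing import Dict, Mapping

# Hypothetical staging: maturity level -> required capability per process area.
staging: Dict[int, Dict[str, int]] = {
    2: {"incident_management_and_control": 2, "example_process_area": 2},
    3: {"incident_management_and_control": 3, "example_process_area": 3,
        "risk_management": 3},
}


def maturity_level(capability: Mapping[str, int],
                   staging: Mapping[int, Mapping[str, int]]) -> int:
    """Return the highest maturity level whose prescribed set is satisfied.

    Levels are checked in ascending order and evaluation stops at the first
    level that is not met, so a higher level cannot be claimed by skipping
    a lower one. Level 1 is treated as the starting point.
    """
    achieved = 1
    for level in sorted(staging):
        required = staging[level]
        if all(capability.get(area, 0) >= need for area, need in required.items()):
            achieved = level
        else:
            break
    return achieved


# Capability dimension: one level per individual process area.
capability = {
    "incident_management_and_control": 3,
    "example_process_area": 2,
    "risk_management": 3,
}

# Maturity dimension: a single organizational result derived from the set.
organizational_maturity = maturity_level(capability, staging)
```

In this example two process areas have capability level 3, yet the organizational result is held at level 2 because one prescribed process area has not reached the level 3 requirement. That is the sense in which maturity depends on a set of process areas rather than on any single one.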

From the start, the focus in developing CERT-RMM was to describe operational resilience management from a process perspective, which would allow for the application of process improvement tools and techniques and provide a foundational platform for better and more sophisticated measurement methodologies and techniques. The ultimate goal in CERT-RMM is to ensure that operational resilience processes produce intended results (such as improved ability to manage incidents or an accurate asset inventory), and as the processes are improved, so are the results and the benefits to the organization. Because CERT-RMM is a process-focused model at its core, it was perfectly suited for the application of CMMI’s capability dimension. Thus, the model contained in this book constitutes a maturity model that has a capability dimension. However, this is not the same as a capability maturity model, since CERT-RMM does not yet provide an organizational expression of maturity. Describing organizational maturity for managing operational resilience by defining a prescriptive path through the model (i.e., by providing an order by which process areas should be addressed) requires additional study and research. Even so, all indications from early model use, benchmarking, and piloting are that a capability maturity model for operational resilience management founded on CERT-RMM is achievable in the future.
