The purpose of Incident Management and Control is to establish processes to identify and analyze events, detect incidents, and determine an appropriate organizational response.
Throughout an organization’s operational environment, disruptions occur on a regular basis. They may occur as the result of intentional actions against the organization, such as a denial-of-service attack or the proliferation of a computer virus, or because of actions over which the organization has no control, such as a flood or earthquake. Disruptive events can be innocuous and go unnoticed by the organization or, at the other extreme, they can significantly impact operational capacities that affect the organization’s ability to carry out its goals and objectives.
To manage operational resilience, an organization must become adept at preventing disruptions whenever possible and ensuring continuity of operations when a disruption occurs. However, because not all disruptions can be prevented, the organization must have the capability to identify events that can affect its operations and to respond appropriately. This requires the organization to have processes to recognize potential disruptions, analyze them, and determine how (or if) and when to respond.
The Incident Management and Control process area focuses the organization’s attention on the life cycle of an incident, from event detection to analysis to response. The organization establishes the incident management plan and program and assigns appropriate resources. Event detection and reporting capabilities are established, and the organization sets criteria to determine when events become incidents that demand its attention. Events are triaged and analyzed, and incidents are validated. Supporting activities such as communication, logging and tracking events and incidents, and preserving event and incident evidence are defined and established. Most important, the organization performs post-incident review to determine what can be learned from incident management and applied to improve both its strategies for protecting and sustaining services and assets and the incident management process and life cycle itself.
Incident management begins with event identification, triage, and analysis. An event can be one or more minor occurrences that affect organizational assets and have the potential to disrupt operations. An event may not require a formal response from the organization; it may be an isolated issue or problem that can be fixed immediately or very soon and does not pose organizational harm. For example, a user may report that a workstation stopped operating properly after the user opened an email attachment. This “event” may be an isolated problem or an operator error that requires attention but may not require an organizational response.
Other events (or series of events) require the organization to take notice. Upon triage and analysis, these events may be declared as “incidents” by the organization. An incident is an event (or series of events) of higher magnitude that significantly affects organizational assets and associated services and requires the organization to respond in some way to prevent or limit organizational impact. For example, several customers may independently report that they are unable to place orders via the internet (events). The problem is determined to be caused by a denial-of-service attack targeting the web portal (incident). In this case, the organization must be able to recognize, analyze, and manage the incident successfully. When an organization is dealing with an incident whose impact on the organization is rapidly escalating or immediate, the incident is deemed a “crisis.” A crisis requires immediate organizational action because the effect of the incident is already being felt by the organization and must be limited or contained.
Incidents affect the productivity of the organization’s assets and, in turn, associated services. Because assets span physical and electronic forms, incidents can be either cyber or physical in nature, depending on the target of the incident. Incidents that affect the people and facilities assets are typically physical in nature. In the case of information and technology assets, incidents can be cyber (such as unauthorized access to electronic information or to technology components) or physical (such as unauthorized access to paper or other media on which information assets are stored or to technology assets that are physically accessible).
Operational resilience is predicated on the organization’s ability to identify disruptive events, prevent them where possible, and respond to them when the organization is impacted. The extent to which the organization performs event management must be commensurate with the desired level of operational resilience that it needs to achieve its mission.
Incident management is a broad organizational function. It includes many types of activities that traverse the enterprise and require varying skill sets. To provide effective coverage of these activities, the Incident Management and Control process area has five goals that address
• incident planning and assignment of resources
• event and incident identification and reporting
• incident declaration and analysis
• incident response and recovery
• incident learning and knowledge management
Developing, testing, and implementing service continuity plans are addressed in the Service Continuity process area.
Reporting incidents according to applicable laws, rules, and regulations is addressed in the Compliance process area.
The processes for identifying and detecting events that could become incidents are addressed in the Monitoring process area.
Managing risks to organizational assets that arise from incidents is addressed in the Risk Management process area.
The organizational process for identifying, analyzing, responding to, and learning from incidents is established.
Incident management is a risk management activity that is foundational to managing the security and resilience of an organization’s high-value assets and services. The organization must establish processes for identifying, analyzing, responding to, and learning from incidents to prevent the consequences of unanticipated risks and to manage these consequences when realized. The incident management process is also a source of knowledge that can be used by the organization to continually improve continuity plans and practices and strategies for protecting and sustaining services and assets.
The establishment of incident management and control processes begins with the planning for incident management and the identification and assignment of resources to carry out the plans.
Planning is performed for developing and implementing the organization’s incident management and control process.
Because each organization is unique, it must develop an incident management plan and process that fit its organizational and strategic drivers, business objectives, critical success factors, and general risk environment. These factors should determine the organization’s baseline philosophy regarding identification, analysis, and response to incidents and should be reflected in the organization’s plan for carrying out these activities. Specifically, the organization must plan for how it will
• identify events and incidents (e.g., through a service desk or problem management reporting activity, or through monitoring)
• analyze these events and incidents and determine an appropriate response
• respond to incidents (e.g., a local response or a coordinated enterprise response)
• structure and staff the plan (by assigning individuals or groups to specific roles or by creating a specialized incident response team such as a computer security incident response team [CSIRT] or similar group)
The organization should develop and document its plan for incident management and outline the specific objectives of the plan. The plan should reflect the organization’s philosophy of incident management and response. The objectives of the plan should be translated into specific actions and assigned to individuals or groups to be performed when necessary.
Typical work products
The incident management plan should address at a minimum
• the organization’s philosophy on incident management
• the structure of the incident management process
• the requirements and objectives of the incident management process relative to managing operational resilience
• a description of how the organization will identify incidents, analyze them, and respond to them
• the roles and responsibilities necessary to carry out the plan
• applicable training needs and requirements
• resources that will be required to meet the objectives of the plan (See IMC:SG1.SP2.)
• relevant costs and budgets associated with incident management activities
Documented commitments by those responsible for implementing and supporting the plan (particularly the commitment of higher-level managers) are essential for plan effectiveness.
Staff are identified and assigned to the incident management plan.
The incident management plan must be staffed to ensure the plan’s objectives can be carried out when necessary. The organization must identify the staff necessary to achieve the plan’s objectives and ensure that staff are assigned and aware of their roles and responsibilities with respect to satisfying these objectives. Staff should be provided sufficient autonomy and authority to carry out their duties as required by the plan. The organization must determine the types of training needed for those involved in the incident management process and provide training that is commensurate with incident management responsibilities and accountabilities.
Typical work products
Subpractices
Skill or staff gaps for each role and responsibility should be identified and resolved.
The training of staff who are vital to the management of operational resilience is covered in the Organizational Training and Awareness process area.
A process for detecting, reporting, triaging, and analyzing events is established and maintained.
Incidents originate as organizational events. The organization must be able to monitor and identify events as they occur, as well as to determine when an event or a series of events constitutes an incident that requires a coordinated and planned response. Failure to properly identify events in a timely manner can shift the organization’s resilience management burden from prevention to reactive management of organizational impact, which is much more costly.
In order to apply incident management processes, the organization must have a foundational structure for event detection, reporting, logging, and tracking, and for collecting and storing event evidence. Because incidents originate as one or more events, foundational processes related to event detection and reporting also support incident reporting, logging, and tracking.
Events are detected and reported.
The monitoring, identification, and reporting of events are the foundation for incident identification and commence the incident life cycle. Events potentially affect the productivity of organizational assets and, in turn, associated services. These events must be captured and analyzed so that the organization can determine whether an event will become (or has become) an incident that requires organizational action. The extent to which an organization can identify events improves its ability to manage and control incidents and their potential effects.
At a minimum, the organization should identify the most effective methods for event detection and provide a process for reporting events so that they can be triaged, analyzed, and addressed. Staff should be assigned to the task of monitoring various organizational processes (both technical and non-technical) to identify and report events. The organization’s service desk is typically the front line for collecting event data and for commencing the incident management process.
The processes that the organization uses to monitor the resilience of assets and to identify anomalies or problems (such as events) are addressed in the Monitoring process area.
Typical work products
Subpractices
Ensure that those assigned the responsibility for event detection and reporting understand this responsibility and have committed to performing it.
Events are logged and tracked from inception to disposition.
The organization should have a formal process for logging events as they are identified and for tracking them through the incident life cycle. Logging and tracking ensure that the event is properly progressing through the incident life cycle and, most important, is closed when an appropriate response and post-incident review have been completed. Logging and tracking facilitate event triage and analysis activities, provide the ability to quickly obtain a status of the event and the organization’s disposition, provide the basis for conversion from event to incident declaration, and may be useful in post-incident review processes when trending and root-cause analysis are performed. Logging and tracking may also support forensic activities and in some cases may be required by law enforcement. In essence, logging and tracking create an incident knowledgebase of both events and incidents to which the organization has been subjected. (Refer to IMC:SG5 for post-incident review practices.)
The organization must decide the degree to which the logging and tracking process is formalized. Logging and tracking should allow for the possibility that some events will go on to be declared as incidents, and as a result, additional information will be collected as the incident proceeds through incident handling and response activities. Basic information about events (and incidents) should include
• a unique organization-derived identifier (such as an event or incident number)
• a brief description of the event (type of event)
• an event category (based on categories predefined by the organization such as “denial of service,” “virus intrusion,” or “physical access violation”)
• the organizational assets, services, and organizational units that are affected by the event (including the seriousness of the organizational consequences)
• a brief description of how the event was identified and reported, and by whom, and other relevant details as necessary (application system, network segment, operating system, etc.)
• if the event was determined to be an incident (refer to IMC:SG2.SP4 and IMC:SG3), the individuals or teams to whom the incident was assigned for containment, analysis, and response (typically referred to as the “incident owner”)
• costs associated with the event or incident
• relevant dates (such as when the event or incident was detected or occurred, when the event or incident was closed, and, if applicable, when the post-incident review was performed)
• the actions taken in response to the event or incident
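The basic information listed above could be captured in a simple record structure. The following is a minimal sketch only; the class name, field names, and sample values are hypothetical and are not prescribed by this process area.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class EventRecord:
    """One entry in the incident knowledgebase (all names are illustrative)."""
    event_id: str                       # unique organization-derived identifier
    description: str                    # brief description (type of event)
    category: str                       # predefined category, e.g. "denial of service"
    affected_assets: List[str]          # assets, services, or organizational units
    reported_by: str                    # who identified and reported the event
    detected_at: datetime               # when the event was detected
    is_incident: bool = False           # set once the event is declared an incident
    incident_owner: Optional[str] = None
    cost: float = 0.0                   # costs associated with the event or incident
    closed_at: Optional[datetime] = None
    reviewed_at: Optional[datetime] = None   # post-incident review date, if any
    actions_taken: List[str] = field(default_factory=list)

    def declare_incident(self, owner: str) -> None:
        """Mark the event as an incident and assign an incident owner."""
        self.is_incident = True
        self.incident_owner = owner


# Hypothetical sample data for illustration.
record = EventRecord(
    event_id="EVT-0001",
    description="Customers unable to place orders via the web portal",
    category="denial of service",
    affected_assets=["web portal", "order service"],
    reported_by="service desk",
    detected_at=datetime(2024, 1, 15, 9, 30),
)
record.declare_incident(owner="incident response team")
record.actions_taken.append("Rate limiting enabled on the portal")
```

Keeping incident-specific fields optional reflects the point that an event record gains information only if the event goes on to be declared an incident.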
Subpractices
Guidelines and standards for the consistent documentation of events should be developed and communicated to all who are involved in the reporting and logging processes.
Refer to IMC:SG2.SP4 and IMC:SG3 for a description of events that are determined to be incidents.
The status of events should be checked regularly to ensure that they are moving through the organization’s established incident management process and are not stalled or awaiting activity. Events that need additional attention should be identified and resolved.
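A periodic status check of this kind might be sketched as follows. The three-day idle threshold, the dictionary keys, and the sample events are assumptions made for illustration; an organization would substitute its own tracking data and thresholds.

```python
from datetime import datetime, timedelta


def find_stalled_events(events, now, max_idle=timedelta(days=3)):
    """Return IDs of events that are not closed and have had no activity within max_idle."""
    return [
        e["id"]
        for e in events
        if e["status"] != "closed" and now - e["last_activity"] > max_idle
    ]


# Hypothetical tracking data.
now = datetime(2024, 1, 20, 12, 0)
events = [
    {"id": "EVT-1", "status": "open", "last_activity": datetime(2024, 1, 10, 8, 0)},
    {"id": "EVT-2", "status": "open", "last_activity": datetime(2024, 1, 19, 8, 0)},
    {"id": "EVT-3", "status": "closed", "last_activity": datetime(2024, 1, 1, 8, 0)},
]
stalled = find_stalled_events(events, now)
```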
The process for collecting, documenting, and preserving event evidence is established and managed.
An event may become an organizational incident that has the potential to be a violation of local, state, or federal rules, laws, and regulations. This is often not known early in the investigation of an event, so the organization must be vigilant in ensuring that all event and incident evidence is handled properly in case an eventual legal issue, civil or criminal, is raised.
To properly collect, document, and preserve evidence, the organization must have processes for these activities, and the processes must be known to all staff who are involved in any aspect of the incident life cycle. Staff must be trained in proper identification and handling of evidence, ensuring that the integrity of the evidence is not altered. Because it is unpredictable whether an event or incident will result in legal action, an organization must also consider early involvement of legal and possibly law enforcement staff in the incident identification and analysis process to avoid problems with evidence retention, destruction, and tampering.
Subpractices
Because there may be compliance issues related to the collection and preservation of incident data, this practice must be considered in the context of the organization’s compliance program. (This is addressed in the Compliance process area.)
Rules, laws, regulations, and policies may require specific documentation for forensic purposes. These specific requirements must be included in the organization’s logging and tracking process as described in IMC:SG2.SP2. Some information about events may be confidential or sensitive, so the organization must be careful to appropriately limit access to event information to only those who need to know about it.
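One common technique for demonstrating that evidence has not been altered is to record a cryptographic digest when the evidence is collected and to recompute it later. This process area does not prescribe that technique; the sketch below is an assumption-based illustration using Python's standard hashlib module, and the function names and sample log line are hypothetical.

```python
import hashlib


def fingerprint(evidence: bytes) -> str:
    """Return the SHA-256 hex digest of the evidence contents."""
    return hashlib.sha256(evidence).hexdigest()


def is_unaltered(evidence: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the digest recorded at collection time."""
    return fingerprint(evidence) == recorded_digest


# Hypothetical evidence: a single log line captured during an investigation.
original = b"2024-01-15 09:30 login failure for user jdoe"
recorded = fingerprint(original)        # stored alongside the event record
tampered = original + b" (edited)"
```

A digest shows only whether content has changed; handling procedures and access restrictions are still needed to show who held the evidence and when.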
Events are analyzed and triaged to support event resolution and incident declaration.
The triage of event reports is an analysis activity that helps the organization to gather additional information for event resolution and to assist in incident declaration, handling, and response. Triage consists of categorizing, correlating, prioritizing, and analyzing events. Through triage, the organization determines the type and extent of an event (e.g., physical versus technical), whether the event correlates to other events (to determine if they are symptomatic of a larger issue, problem, or incident), and in what order events should be addressed or assigned for incident declaration, handling, and response. Triage also helps the organization to determine if the event needs to be escalated to other organizational or external staff (outside of the incident management staff) for additional analysis and resolution.
Some events will never proceed to incident declaration; the organization determines these events to be inconsequential. For events that the organization deems to be of low priority, impact, or consequence, the triage process results in closure of the event, and no further actions are performed.
Events that exit the triage process warranting additional attention may be referred to additional analysis processes for resolution or declared as an incident and subsequently referred to incident response processes for resolution. These events may be declared as incidents during triage, through further event analysis, through the application of incident declaration criteria, or during the development of response strategies, depending on the organization’s incident criteria, the nature and timing of the event(s), and the consequences of the event that the organization is currently experiencing or that are imminent. (Incident declaration and analysis are addressed in IMC:SG3.)
Typical work products
Subpractices
Events may be prioritized based on event knowledge, the results of categorization and correlation analysis, incident declaration criteria (refer to IMC:SG3.SP1), and experience with past declared incidents.
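Correlation and prioritization can be illustrated with a small sketch that groups event reports sharing a category and affected asset and then reviews the largest groups first. The grouping key and the "largest group first" ordering are assumptions chosen for illustration, not rules taken from this process area.

```python
from collections import defaultdict


def correlate(events):
    """Group event IDs by (category, asset) so related reports surface together."""
    groups = defaultdict(list)
    for e in events:
        groups[(e["category"], e["asset"])].append(e["id"])
    return dict(groups)


def prioritize(groups):
    """Order groups so that the largest clusters of reports are reviewed first."""
    return sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)


# Hypothetical reports: three customers cannot place orders; one door alarm.
reports = [
    {"id": "EVT-10", "category": "denial of service", "asset": "web portal"},
    {"id": "EVT-11", "category": "physical access violation", "asset": "data center"},
    {"id": "EVT-12", "category": "denial of service", "asset": "web portal"},
    {"id": "EVT-13", "category": "denial of service", "asset": "web portal"},
]
ranked = prioritize(correlate(reports))
```

Here the three independent portal reports fall into one group, which suggests they may be symptoms of a single larger problem.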
Possible dispositions for event reports include
• closed
• referred for further analysis
• referred to organizational unit or line of business for disposition
• declared as incident and referred to incident handling and response process
Events that have been declared as incidents as a result of the triage process should be appropriately designated in the incident knowledgebase.
Events that have not been closed or that do not have a disposition should be reprioritized and analyzed for resolution.
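The four dispositions listed above could be modeled as an enumeration with a simple decision rule. The impact score, the thresholds, and the function name in this sketch are hypothetical; actual triage decisions depend on the organization's own criteria and judgment.

```python
from enum import Enum


class Disposition(Enum):
    CLOSED = "closed"
    FURTHER_ANALYSIS = "referred for further analysis"
    REFERRED_TO_UNIT = "referred to organizational unit or line of business"
    DECLARED_INCIDENT = "declared as incident"


def triage(impact, correlated_events, owned_by_unit=False):
    """Assign a disposition; impact is a hypothetical score from 0 (none) to 10."""
    if impact >= 7 or correlated_events >= 3:
        return Disposition.DECLARED_INCIDENT
    if owned_by_unit:
        return Disposition.REFERRED_TO_UNIT
    if impact >= 3:
        return Disposition.FURTHER_ANALYSIS
    return Disposition.CLOSED
```

For example, `triage(impact=2, correlated_events=5)` yields a declared incident because of the number of correlated reports, even though the impact of each individual event is low.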
Incidents are declared and analyzed to support response planning.
Incident declaration defines the point at which the organization has established that an incident has occurred, is occurring, or is imminent and will have to be handled and responded to.
Transition from event detection to incident declaration can be immediate, particularly when it is clear to the organization that there are significant effects on organizational assets and associated services and a response is required to limit these effects and their impact. In such cases, little additional review and analysis is required before declaration. In other cases, incident declaration requires more thoughtful analysis; thus, the organization may need to use predefined criteria developed from experience to help guide incident declaration.
Once an incident has been declared, the organization must perform additional analysis to develop and implement an appropriate action plan for handling and response. This action plan may represent a routine activity (such as asking users to stop opening email messages containing greeting card announcements) or a specifically designed response that is unique to the incident and requires significant levels of organizational coordination and logistical support.
The development of the organization’s response to an incident is addressed in IMC:SG4.
Criteria for declaring incidents are defined and maintained.
Each organization has many unique factors that must be considered in determining when to declare an incident. Through experience, an organization may have a baseline set of events that define standard incidents, such as a virus outbreak, unauthorized access to a user account, or a denial-of-service attack. However, in reality, incident declaration may occur on an event-by-event basis.
To guide the organization in determining when to declare an incident (particularly if incident declaration is not immediately apparent), the organization must define incident declaration criteria.
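Declaration criteria can be thought of as a set of named tests applied to an event. The sketch below is illustrative only: the standard incident categories echo the examples given above, while the other criteria, the field names, and the threshold of three reports are assumptions.

```python
# Categories treated as standard incidents (taken from the examples in the text).
STANDARD_INCIDENTS = {"virus outbreak", "unauthorized account access", "denial of service"}

# Hypothetical criteria: each name maps to a test applied to an event dictionary.
CRITERIA = {
    "standard incident category": lambda e: e["category"] in STANDARD_INCIDENTS,
    "high-value asset affected": lambda e: e["high_value_asset"],
    "multiple correlated reports": lambda e: e["related_reports"] >= 3,
}


def matched_criteria(event, criteria=CRITERIA):
    """Return the names of all declaration criteria that the event meets."""
    return [name for name, test in criteria.items() if test(event)]


event = {"category": "denial of service", "high_value_asset": True, "related_reports": 1}
reasons = matched_criteria(event)   # a non-empty list supports declaring an incident
```

Returning the names of the matched criteria, rather than a bare yes or no, leaves a record of why an incident was declared.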
Typical work products
Subpractices
Incidents are analyzed to support the development of an appropriate incident response.
Incident analysis is primarily focused on helping the organization to determine an appropriate response to a declared incident by examining its underlying causes and actions and the effects of the underlying event(s) that have already been detected by the organization. Analysis is performed to further understand the incident, to develop and implement action to contain its impact, and to recover from any resulting damage. Incident analysis may be informed by the correlation and prioritization activities performed in event triage.
Incident analysis requires skills from across the organization. Depending on the nature of the incident, analysis may involve asset owners, information technology staff, physical security staff, auditors, and legal staff, as well as external stakeholders such as vendors and suppliers, law enforcement staff, and vulnerability clearinghouses. Incident analysis may involve staff to whom the incident has been escalated or assigned (including the incident owner). (Incident escalation is addressed in IMC:SG4.SP1.)
Incident analysis should be focused on properly defining the underlying problem, condition, or issue and on helping the organization to prepare the most appropriate and timely response to the incident. It should also help the organization to determine whether the incident has legal ramifications. Analysis activities must feed the organization’s evidence collection process in case of future legal actions (see IMC:SG2.SP3) as well as the post-incident review processes (see IMC:SG5) for process improvement.
Typical work products
Subpractices
Provide appropriate levels of training for incident management staff on analysis tools and techniques.
Open event reports may correlate to the incident under analysis and provide additional information that is useful in developing an appropriate response. Reviewing documentation on previously declared incidents may inform the development of a response action plan, particularly if significant organizational (and external) coordination is required.
Ensure that analysis is appropriately documented on the incident analysis report and in the incident knowledgebase and made available for use in evidence collection, response development, and post-incident review.
The process for responding to and recovering from incidents is established.
The nature of a declared incident is that the organization has already incurred some effect, however limited, that requires the organization to act. Responding to and recovering from an incident often requires two primary actions from the organization:
• immediate limitation or containment of the scope and impact of the incident
• the development and implementation of an appropriate response to stop the ongoing or future effect of the incident, repair any remaining damage, and restore organizational assets and services to the state in which they existed prior to the disruption
Responding and recovering may also require a carefully coordinated and executed collaboration between organizational units and external entities (such as emergency providers) and a plan for handling incident logistics, particularly if the incident is significant or catastrophic. The logistics of these coordinating activities can often be planned in advance; however, execution may occur on demand or spontaneously.
In addition, to avoid reputation damage, the organization must also craft and implement a communications process that facilitates collaboration and logistical execution and keeps stakeholders aware of the incident’s evolution and resolution. (Refer to the Communications process area for more information about developing, deploying, and managing internal and external communications in support of a declared incident.)
Incidents are escalated to the appropriate stakeholders for input and resolution.
Incidents that the organization has declared and that require an organizational response must be escalated to those stakeholders that can implement, manage, and bring to closure an appropriate and timely solution. These stakeholders are typically internal to the organization (such as a standing incident response team or an incident-specific team) but could be external in the form of contractors or other suppliers. (Refer to the External Dependencies Management process area for information about managing relationships with external entities.) The organization must establish processes to ensure that incidents are referred to the appropriate stakeholders because failure to do so will impede the organization’s response and may increase the level to which the organization is impacted.
Because communication is a vital tool in incident escalation, the organization’s incident communications plan must be developed, implemented, and tested in order to support effective escalation (see IMC:SG4.SP3). (Communications activities in support of operational resilience, including when dealing with an incident, are described in the Communications process area.)
Typical work products
Subpractices
Incident escalation criteria should provide guidance on when escalation is appropriate and necessary, and the level of escalation required.
Incident escalation procedures should consider the type and extent of the incident and the appropriate stakeholders.
Ensure that stakeholders such as the service desk are included in the escalation process.
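Escalation procedures that consider the type and extent of the incident can be sketched as a lookup from incident type and severity to the stakeholders to be notified. The matrix contents, severity labels, and default list below are hypothetical; the service desk is included in every entry to reflect the guidance above.

```python
# Hypothetical escalation matrix: (incident type, severity) -> stakeholders.
ESCALATION_MATRIX = {
    ("cyber", "low"): ["service desk"],
    ("cyber", "high"): ["service desk", "incident response team", "legal"],
    ("physical", "low"): ["service desk", "facilities"],
    ("physical", "high"): ["service desk", "physical security", "law enforcement liaison"],
}

# Fallback used when an incident does not match any predefined entry.
DEFAULT_TARGETS = ["service desk", "incident response team"]


def escalation_targets(incident_type, severity):
    """Return the stakeholders to notify, falling back to a default list."""
    return ESCALATION_MATRIX.get((incident_type, severity), DEFAULT_TARGETS)
```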
A response to a declared incident is developed and implemented to prevent or limit organizational impact.
Responding to an organizational incident is often dependent on proper advance planning by the organization in establishing, defining, and staffing an incident management capability. In addition, the organization typically has service continuity plans that can be executed in parallel if an incident has resulted in drastically affected operations. (Developing, testing, and implementing service continuity plans are addressed in the Service Continuity process area.)
Responding to an incident describes the actions the organization takes to prevent or contain the impact of an incident on the organization while it is occurring or shortly after it has occurred. The range, scope, and breadth of the organizational response will vary widely depending on the nature of the incident. Incident response may be as simple as notifying users to avoid opening a specific type of email message or as complicated as having to implement service continuity plans that require relocation of services and operations to an off-site provider. The broad range of potential incidents requires the organization to have a broad range of capability in incident response.
The organization’s response to an incident must be founded on a well-structured incident response capability and plan (as developed in IMC:SG1.SP1). Depending on the organization, the actions related to incident response can include
• containing damage (e.g., by taking hardware or systems offline or by locking down a facility)
• collecting evidence (including logs and audit trails)
• interviewing relevant staff (those who are involved in reporting or analyzing the incident and those who are affected by it)
• communicating to stakeholders, including asset owners and incident owners
• developing and implementing corrective actions and controls
• implementing continuity and restoration plans or other emergency actions (See the Service Continuity process area for more information about continuity planning and response.)
The organization must consider the best response structure for its unique organizational structure and context. For some organizations, it makes sense to establish one or more permanent teams that are responsible for repeatable capabilities to respond to a broad range of incidents, supplementing the response with subject matter experts where necessary. In other cases, an organization may establish a virtual “team” of individuals who may be quickly called upon to perform specific duties to respond to an incident. In addition, the organization may have standardized responses for certain types of incidents (such as denial-of-service attacks) that have been developed through lessons learned. Some of these responses might be reflected in standard service continuity plans that the organization has already developed.
In responding to any incident, the organization must consider who is responsible for coordinating the overall response and ensure that those who must be involved in the response have been notified. Responders must update the incident knowledgebase to detail and document the steps taken to contain and repair incident damage so that this information can be used in root-cause analysis and problem diagnosis for future incidents. (See IMC:SG2.SP4 and IMC:SG3.SP2 for more details.) In addition, the organization must ensure that actions taken to contain or repair incident damage are performed in a way that introduces no additional vulnerabilities and limits the effect on day-to-day operations.
Typical work products
Subpractices
The incident response strategy and plan should address at a minimum
• the essential activities (administrative, technical, and physical) that are required to contain or limit damage and provide service continuity
• existing continuity of operations and restoration plans in the organization’s plan inventory
• the resources and skills required to perform the incident response strategy and plan
• coordination activities with other internal staff and external agencies that must be performed to implement the strategy
• the levels of authority and access needed by responders to carry out the strategy and plan
• objectives for measuring when the strategy and plan are successful
• the estimated cost of implementing the strategy and plan
• the essential activities necessary to restore services to normal operation (recovery), the resources involved in these activities, and their estimated cost
• legal and regulatory obligations that must be met by the strategy
• standardized responses for certain types of incidents
A plan for the communication of incidents to relevant stakeholders and a process for managing ongoing incident communications are established.
Miscommunications or inaccurate information about organizational incidents can have dire effects that far exceed the potential damage caused by an incident itself. As a result, the organization must proactively manage communications when incidents are detected and throughout their life cycle. This requires the organization to develop a communications plan that can be readily implemented to manage communications to internal and external stakeholders on a regular basis and as needed. This plan should provide relevant information to these entities and control or limit the degree to which misinformation and conjecture can develop. It must also consider the needs of a wide range of stakeholders that have a vested interest in obtaining information about organizational incidents in a controlled and regular manner.
The basic structure of the plan may be static, but the plan should be flexible to address a broad range of incident types, stakeholders, and corresponding communications needs. In addition, the organization should consider developing partnerships with external stakeholders so that a coordinated communications strategy can be developed and implemented when incidents affect the organization’s external operational environment as well.
Typical work products
Subpractices
The incident communications plan should address at a minimum
• the stakeholders with which communications about incidents are required
• the types of media by which communications will be handled
• the various message types and level of communications appropriate to various stakeholders (For example, incident communications may be vastly different for incident responders than for those who may simply need to know.)
• special controls over communications (such as encryption or secured communications) that are appropriate for some stakeholders
• the roles and responsibilities necessary to carry out the plan
• the frequency and timing of communications
• internal and external resources that are involved in supporting the communications process
Ensure that these staff members have the appropriate level of training and skills necessary to execute and support the plan.
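The minimum elements of the incident communications plan listed above can be sketched as a small data structure. The following is a minimal illustration only: the class name AudienceRule, the function build_messages, and every field name are hypothetical and are not defined by this process area. It assumes two levels of detail ("full" for incident responders and "summary" for those who simply need to know).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudienceRule:
    """One row of a hypothetical incident communications plan."""
    stakeholder: str      # who must receive communications
    media: str            # how the message is delivered
    detail: str           # "full" for responders, "summary" for need-to-know audiences
    secured: bool         # whether special controls such as encryption apply
    interval_hours: int   # how often updates are sent
    sender_role: str      # role responsible for sending the message


def build_messages(rules, summary, full_report):
    """Produce one message per stakeholder, tailored to that stakeholder's rule."""
    messages = []
    for rule in rules:
        # Responders receive the full report; everyone else receives the summary.
        body = full_report if rule.detail == "full" else summary
        messages.append({
            "to": rule.stakeholder,
            "via": rule.media,
            "body": body,
            "secured": rule.secured,
            "from_role": rule.sender_role,
            "next_update_hours": rule.interval_hours,
        })
    return messages
```

Keeping the rules as data rather than as logic is one way to keep the basic structure of the plan static while still adapting it to different incident types and stakeholders.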
Incidents are closed after relevant actions have been taken by the organization.
(Closure of an incident can be performed only after post-incident review practices have been completed. Practices in IMC:SG5 must be completed before IMC:SG4.SP4 can be accomplished.)
Incident closure refers to the retirement of an incident that has been responded to (i.e., there are no further actions required and the organization is satisfied with the result) and for which the organization has performed a formal post-incident review (see IMC:SG5). The organization must have a process for formal closure of incidents (including the practices in IMC:SG5) which results in formally logging a status of “closed” in the incident knowledgebase.
A “closed” status indicates to all relevant stakeholders that no further actions are required or outstanding for the incident. It also provides notification to those affected by the incident that it has been addressed and that they should not be subject to continuing effects.
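The closure conditions described above can be expressed as a simple gate. The sketch below is illustrative only and assumes a hypothetical Incident record with two flags (response complete and post-incident review complete). The names Incident, ClosureError, and close_incident are assumptions, not part of this process area. The authorization check reflects the guidance in this practice that only authorized staff should close an incident.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident record held in the incident knowledgebase."""
    incident_id: str
    owner: str
    status: str = "open"
    response_complete: bool = False
    post_incident_review_complete: bool = False
    closed_by: Optional[str] = None


class ClosureError(Exception):
    """Raised when an incident does not yet meet the closure conditions."""


def close_incident(incident, requested_by, authorized_staff):
    """Log a status of "closed" only when every closure condition holds."""
    if requested_by not in authorized_staff:
        raise ClosureError(f"{requested_by} is not authorized to close incidents")
    if not incident.response_complete:
        raise ClosureError("response actions are still outstanding")
    if not incident.post_incident_review_complete:
        raise ClosureError("post-incident review (IMC:SG5) has not been completed")
    incident.status = "closed"
    incident.closed_by = requested_by
    return incident
```

Raising an error instead of silently ignoring the request leaves a visible reason why an incident remains open.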
Typical work products
Subpractices
The criteria for incident closure vary by organization, but closure generally occurs after post-incident review has been completed. Some organizations may also establish concrete closure rules.
Typically, this action will be the responsibility of the incident owner or incident manager. Only authorized staff should be permitted to close an incident.
Incidents that appear to be open for an extended period of time may not have followed the organization’s incident management process or may not have been formally closed. The status of incidents in the incident knowledgebase should be reviewed regularly to determine if open incidents should be closed or need additional action.
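The regular status review described above can be supported by a simple query over incident records. The following is a minimal sketch under stated assumptions: each record is a dictionary with hypothetical keys "id", "status", and "opened_at", and the 30-day threshold is an arbitrary example value, not a recommendation from this process area.

```python
from datetime import datetime, timedelta


def stale_open_incidents(incidents, now, max_open=timedelta(days=30)):
    """Return IDs of incidents still open longer than max_open, oldest first."""
    stale = [
        (record["opened_at"], record["id"])
        for record in incidents
        if record["status"] != "closed" and now - record["opened_at"] > max_open
    ]
    # Sorting the (opened_at, id) pairs puts the longest-open incidents first.
    return [incident_id for _, incident_id in sorted(stale)]
```

The resulting list is only a starting point for review: each flagged incident still needs a decision on whether it should be closed or requires additional action.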
Lessons learned from identifying, analyzing, and responding to incidents are translated into actions to improve strategies for protecting and sustaining services and assets.
One of the most important aspects of incident management and control is the ability to understand why an incident occurred and what can be done by the organization to prevent it in the future. From a risk management standpoint, using incident lessons learned to improve controls and protection strategies and to optimize these strategies with continuity planning and response effectively shifts the organization’s attention from a response mode to a preventive mode.
Incident learning involves a post-incident review by relevant stakeholders, an active link to the organization’s problem management process, and a formal translation of lessons learned to improve strategies for protecting and sustaining services and assets.
The practices of this goal should be considered as part of the closure activity as described in IMC:SG4.SP4.
Post-incident review is performed to determine underlying causes.
Post-incident review is a formal part of the incident closure process. The organization conducts a formal examination of the causes of the incident and the ways in which the organization responded to it, as well as the administrative, technical, and physical control weaknesses that may have allowed the incident to occur.
To be effective, post-incident review requires the input of all relevant stakeholders in the incident management process. This includes those who
• reported the incident
• detected the incident
• triaged and analyzed the incident
• responded to the incident
• were affected by the incident
• had the incident communicated to them
Post-incident review should include a significant root-cause analysis process. The organization should employ commonly available techniques (such as cause-and-effect diagrams) to perform root-cause analysis as a means of potentially preventing future incidents of similar type and impact. (Root-cause analysis is also useful for linking to the organization’s problem management process, as detailed in IMC:SG5.SP2.) The review should also consider other processes that may have caused or aided the incident, particularly change management and configuration management.
Typical work products
Subpractices
These tools and techniques may include cause-and-effect diagrams, interrelationship diagrams, and causal factor tree analysis.
This report should detail the organization’s recommendations for improvement in administrative, technical, and physical controls, as well as improvements to the incident management process.
A link between incident handling and the organization’s problem management process is established.
Problem management is the process that an organization uses to identify recurring problems, examine root causes, and develop solutions for these problems to prevent future, similar incidents. There is a strong link between incident management and control and problem management in that incident management is often the process where symptoms of a larger problem are first presented. The organization’s problem management process and system (which are beyond the scope of CERT-RMM) need information from post-incident review to be effective; likewise, incident management may rely on information from problem management to diagnose and respond to incidents, particularly if no action has been taken to resolve identified problems or root causes. Formal linkages between problem management and incident management strengthen the organization’s overall ability to prevent incidents and minimize costly and reactive response activities.
Typical work products
Subpractices
The organization’s incident knowledgebase can serve as a central repository that links the incident management and problem management processes so that duplicative effort in documenting issues and problems can be avoided.
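One way to picture the knowledgebase as a link between the two processes is a store that records the root cause found in each post-incident review and reports causes that recur. This sketch is illustrative only: the class IncidentKnowledgeBase, its methods, and the threshold of two incidents are assumptions and do not describe any particular product or the organization's actual problem management system.

```python
from collections import defaultdict


class IncidentKnowledgeBase:
    """Hypothetical central store shared by incident and problem management."""

    def __init__(self):
        # incident_id -> root cause recorded at post-incident review
        self._root_causes = {}

    def record_root_cause(self, incident_id, root_cause):
        """Record (or correct) the root cause identified for an incident."""
        self._root_causes[incident_id] = root_cause

    def recurring_causes(self, min_incidents=2):
        """Group incidents by root cause and keep causes seen at least min_incidents times."""
        grouped = defaultdict(list)
        for incident_id, cause in self._root_causes.items():
            grouped[cause].append(incident_id)
        return {
            cause: ids for cause, ids in grouped.items() if len(ids) >= min_incidents
        }
```

Because both processes read from the same records, a recurring cause is documented once and can be handed to problem management without re-entering the incident details.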
The lessons learned from incident management are analyzed and translated into service and asset protection and continuity strategies.
The costs associated with incident detection and response are an investment for the organization only to the extent that what is learned in these processes can be used by the organization to make it more efficient and effective in dealing with future events and in enhancing its approach to resilience. Lessons learned in incident management should serve as a benchmark for determining the validity and effectiveness of the organization’s current strategies for protecting and sustaining assets. In addition, lessons learned should provide valuable information for continuous improvement of the incident management process.
Typical work products
Subpractices
• protection strategies and controls for assets involved in the incident
• continuity plans and strategies for sustaining assets involved in the incident
• information security and other organizational policies that need to reflect new standards, procedures, and guidelines based on what is learned in the incident handling
• training for staff on information security, business continuity, and IT operations
Refer to the Generic Goals and Practices document in Appendix A for general guidance that applies to all process areas. This section provides elaborations relative to the application of the Generic Goals and Practices to the Incident Management and Control process area.
The operational resilience management system supports and enables achievement of the specific goals of the Incident Management and Control process area by transforming identifiable input work products to produce identifiable output work products.
Perform the specific practices of the Incident Management and Control process area to develop work products and provide services to achieve the specific goals of the process area.
Elaboration:
Specific practices IMC:SG1.SP1 through IMC:SG5.SP3 are performed to achieve the goals of the incident management and control process.
Incident management and control is institutionalized as a managed process.
Establish and maintain governance over the planning and performance of the incident management and control process.
Refer to the Enterprise Focus process area for more information about providing sponsorship and oversight to the incident management and control process.
Subpractices
Elaboration:
Elaboration:
Establish and maintain the plan for performing the incident management and control process.
Elaboration:
Specific practice IMC:SG1.SP1 requires the development of a plan for how the organization will carry out incident management and control. In generic practice IMC:GG2.GP2 as related to incident management, the planning elements required in IMC:SG1.SP1 are formalized and structured and are performed in a managed way. The plan for the incident management and control process should reflect the organization’s stated philosophy of incident management and the preferred means for handling incidents (for example, through a dedicated or permanent team or a virtual team).
Subpractices
Provide adequate resources for performing the incident management and control process, developing the work products, and providing the services of the process.
Elaboration:
Specific practice IMC:SG1.SP2 requires the formal assignment of resources to the incident management and control process plan.
Subpractices
Elaboration:
Refer to the Organizational Training and Awareness process area for information about training staff for resilience roles and responsibilities.
Refer to the Human Resource Management process area for information about acquiring staff to fulfill roles and responsibilities.
Elaboration:
In the case of incident management and control, funding must extend to supporting the incident life cycle, and consideration must be given to unknown funding requirements that depend on the type and extent of an incident and its impact on the organization. Extending consideration to these unpredictable needs provides the organization a level of control over unplanned and potentially unconstrained costs.
Refer to the Financial Resource Management process area for information about budgeting for, funding, and accounting for incident management and control.
Elaboration:
Assign responsibility and authority for performing the incident management and control process, developing the work products, and providing the services of the process.
Elaboration:
Specific practice IMC:SG1.SP1 indicates that the incident management plan should define the roles and responsibilities necessary to carry out the plan, as well as document commitments from those responsible. Specific practice IMC:SG1.SP2 requires the assignment of staff to the incident management plan, as well as the identification of skill and staff gaps for each area of responsibility. Generic practice IMC:GG2.GP4 requires the assignment of responsibility for the activities in the incident management life cycle, including the identification of events and incidents, analysis of incidents, and incident response.
Refer to the Human Resource Management process area for more information about establishing resilience as a job responsibility, developing resilience performance goals and objectives, and measuring and assessing performance against these goals and objectives.
Subpractices
Elaboration:
Incident management and control activities may be temporary (i.e., involved with response to a specific incident) or permanent (i.e., involved with support activities that are not related to any specific incident).
To assign responsibility and authority for performing the incident management and control process, organizations may establish dedicated incident management teams that address the majority of incident handling and management activities or assign staff to virtual teams that come together when required. Other structures may also be implemented, such as decentralized dedicated teams, which would require varying levels of responsibility and authority to be assigned.
Elaboration:
Refer to the External Dependencies Management process area for additional details about managing relationships with external entities.
Train the people performing or supporting the incident management and control process as needed.
Refer to the Organizational Training and Awareness process area for more information about training the people performing or supporting the process.
Refer to the Human Resource Management process area for more information about inventorying skill sets, establishing a skill set baseline, identifying required skill sets, and measuring and addressing skill deficiencies.
Subpractices
Elaboration:
Elaboration:
Training can be obtained for all aspects of incident handling and for forming and managing formal incident teams. In addition, certification programs are available for incident handlers and for those who develop and participate on incident management teams.
Place designated work products of the incident management and control process under appropriate levels of control.
Elaboration:
Generic practice IMC:GG2.GP6 generically covers the recommended updating of the incident knowledgebase as described in several IMC-specific practices, as well as other work products of the incident management and control process.
The tools, techniques, and methods used to populate and maintain the incident knowledgebase should be employed to perform consistent and structured version control over the knowledgebase to ensure that incident information is current, accurate, and “official.” The tools, techniques, and methods can also be used to securely store the knowledgebase, provide access control over inquiry, modification, and deletion, and to track version changes and updates.
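The combination of version control and access control described above can be illustrated with a small record type. This is a minimal sketch, not a description of any specific tool: the class VersionedRecord, its methods, and the idea of passing in a set of permitted editors are all assumptions made for illustration.

```python
class VersionedRecord:
    """Hypothetical knowledgebase entry that keeps every version of its content."""

    def __init__(self, record_id, content, author):
        self.record_id = record_id
        # Each entry is (version number, author, content); nothing is overwritten.
        self._versions = [(1, author, content)]

    @property
    def version(self):
        """Number of the current ("official") version."""
        return self._versions[-1][0]

    @property
    def current(self):
        """Content of the current ("official") version."""
        return self._versions[-1][2]

    def update(self, content, author, editors):
        """Append a new version if the author is permitted to modify the record."""
        if author not in editors:
            raise PermissionError(f"{author} may not modify record {self.record_id}")
        self._versions.append((self.version + 1, author, content))

    def history(self):
        """Return (version, author) pairs so that changes can be tracked."""
        return [(number, author) for number, author, _ in self._versions]
```

Appending versions rather than editing in place keeps earlier incident information available for audit while still exposing one current version to readers.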
Identify and involve the relevant stakeholders of the incident management and control process as planned.
Elaboration:
Stakeholders of the incident management and control process may extend across the organization and externally to business partners and vendors. The identification of these stakeholders in generic practice IMC:GG2.GP7 is in addition to the identification of stakeholders of the incident communications process described in IMC:SG4.SP3, although it is recognized that these may be the same or similar.
Subpractices
Elaboration:
Monitor and control the incident management and control process against the plan for performing the process and take appropriate corrective action.
Refer to the Monitoring process area for more information about the collection, organization, and distribution of data that may be useful for monitoring and controlling processes.
Refer to the Measurement and Analysis process area for more information about establishing process metrics and measurement.
Refer to the Enterprise Focus process area for more information about providing process information to managers, identifying issues, and determining appropriate corrective actions.
Subpractices
Elaboration:
Elaboration:
Incident learning processes as described in IMC:SG5 are intended to provide a standard and consistent post-incident review and examination. However, reviews of the incident management and control process may result from periodic audits or examinations, particularly if metrics indicate a rise in incidents of specific types or with increasing impact, or an extension of time required to resolve incidents.
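The metrics mentioned above (a rise in incidents of specific types, or an extension of the time required to resolve incidents) can be compared across reporting periods. The sketch below is one possible illustration: the record keys "type" and "hours_to_resolve", both function names, and the growth factor of 1.5 are assumptions and are not defined by this process area.

```python
from collections import Counter
from statistics import mean


def incident_metrics(incidents):
    """Summarize incident counts by type and mean hours to resolve for one period."""
    counts = Counter(record["type"] for record in incidents)
    durations = [
        record["hours_to_resolve"]
        for record in incidents
        if record.get("hours_to_resolve") is not None
    ]
    return {
        "count_by_type": dict(counts),
        "mean_hours_to_resolve": mean(durations) if durations else None,
    }


def review_triggers(previous, current, growth=1.5):
    """List reasons to review the process when a metric grows by more than `growth`."""
    reasons = []
    for kind, count in sorted(current["count_by_type"].items()):
        before = previous["count_by_type"].get(kind, 0)
        if count > before * growth:
            reasons.append(f"rise in {kind} incidents: {before} -> {count}")
    old = previous["mean_hours_to_resolve"]
    new = current["mean_hours_to_resolve"]
    if old is not None and new is not None and new > old * growth:
        reasons.append(f"mean hours to resolve rose: {old:.1f} -> {new:.1f}")
    return reasons
```

A non-empty result does not by itself indicate a process failure; it only marks a period that may merit a periodic audit or examination.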
If the incident management and control process is decentralized (i.e., spread across organizational units), post-incident reviews may provide management insight into variations between the organizational units that could impact the organization’s overall incident management capability.
Objectively evaluate adherence of the incident management and control process against its process description, standards, and procedures, and address non-compliance.
Elaboration:
Review the activities, status, and results of the incident management and control process with higher-level managers and resolve issues.
Refer to the Enterprise Focus process area for more information about providing sponsorship and oversight to the operational resilience management system.
Incident management and control is institutionalized as a defined process.
Establish and maintain the description of a defined incident management and control process.
Elaboration:
Incident management and control may be performed in either a centralized or decentralized manner. The way in which the organization institutionalizes the incident management and control process varies based on the size of the organization, the diversity of operational environments, and other factors. This may lead to a range of implementation methods, including the use of a dedicated centralized team, dedicated virtual teams, decentralized dedicated teams, or other combinations.
Establishing and tailoring process assets, including standard processes, are addressed in the Organizational Process Definition process area.
Establishing process needs and objectives and selecting, improving, and deploying process assets, including standard processes, are addressed in the Organizational Process Focus process area.
Subpractices
Collect incident management and control work products, measures, measurement results, and improvement information derived from planning and performing the process to support future use and improvement of the organization’s processes and process assets.
Specific goal IMC:SG5 and its specific practices describe capturing lessons learned in post-incident review and translating these into improvements to incident management and control process activities. Such improvement directly supports the improvement of service continuity and strategies to protect and sustain assets and services.
Establishing the measurement repository and process asset library is addressed in the Organizational Process Definition process area. Updating the measurement repository and process asset library as part of process improvement and deployment is addressed in the Organizational Process Focus process area.
Subpractices