Chapter 12
Cybersecurity Incident and Crisis Management

CLUSIF (Club de la Sécurité de l'Information Français)
Gérôme Billois, CLUSIF Administrator and Board Member of the Cybersecurity Practice at Wavestone Consultancy, France

The antivirus console administrator is phoning Maria, the chief information security officer (CISO) reporting to Tom the CEO: “… another virus has been detected. I know we struggle with many incidents like this every day, but this one seems very strange. I’ve never seen it before. It has infected the workstation of a researcher in the R&D lab and it is trying to send loads of data to the Internet … the help desk manager just wants the workstation to be reinstalled as soon as possible, saying it’s a common incident and nothing to worry about. …”

Maria interjects: “No. This is now an incident needing our incident management process to kick in. Start sending the virus to our forensics experts, then …”

Cybersecurity Incident Management

One hundred percent protection does not exist in cybersecurity. A cybersecurity incident may always occur, whatever the level of investment. However, the CEO must ensure that the organization has capabilities, tailored to its context, to differentiate low-impact routine cyber incidents from major crises that require prompt escalation to effective cyber crisis management in order to avoid high-impact interruption. This chapter shows the CEO how.

When a Cybersecurity Event Becomes an Incident

There are many definitions for a cybersecurity incident. Nearly every standard and framework (such as ISO 27001 and guidelines by the Institute of Risk Management [IRM] UK, the National Institute of Standards and Technology [NIST] and the European Union Agency for Network and Information Security [ENISA]) proposes a different approach. The main question is which specific criteria to apply to decide whether an event that has occurred becomes a cybersecurity incident. These criteria typically represent the impact of the incident on confidentiality, integrity, availability, and traceability for organization assets. However, relying on that definition alone may leave you overwhelmed by a large number of incidents, especially if your organization tries to manage all the incidents related to availability.

A common filter to apply is to ask if the cause of the incident is related to a security breach. For instance, a server whose power supply fails because it is too old will not be classified as a cybersecurity incident, but a malicious administrator who accesses information must be. There is much debate as to whether a suspicion or a vulnerability (such as those discovered during an audit) should count as an incident. These are typically not considered incidents but are registered as anomalies or events. An incident is something that has direct and proven impacts.
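The filter described above can be sketched as a small classification function. This is only an illustration: the field names `security_related_cause` and `proven_impact` and the three labels are assumptions made for the example, not terms taken from any standard.

```python
from dataclasses import dataclass


@dataclass
class Event:
    description: str
    security_related_cause: bool  # hypothetical flag: the cause is a security breach
    proven_impact: bool           # hypothetical flag: a direct, proven impact exists


def classify(event: Event) -> str:
    """Return 'it_event', 'anomaly', or 'incident' for a reported event."""
    if not event.security_related_cause:
        return "it_event"   # e.g., an old power supply failing
    if event.proven_impact:
        return "incident"   # security cause with direct and proven impact
    return "anomaly"        # suspicion or vulnerability, e.g., an audit finding


power_failure = Event("Server power supply failed", False, True)
audit_finding = Event("Vulnerability found during audit", True, False)
rogue_admin = Event("Administrator accessed R&D files", True, True)

for e in (power_failure, audit_finding, rogue_admin):
    print(f"{e.description}: {classify(e)}")
```

Keeping the two questions separate (is the cause security related, and is the impact proven) makes it easy to register anomalies without inflating the incident count.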

Qualifying the Two Categories of Incident Sources

Cybersecurity incidents can be classified into two source categories (also known as root causes, risk sources, or inherent causes): internal or external incident sources.

Internal Incident Identification

Internal incident sources are the primary incident declaration channel by volume. Incidents are usually identified by the information technology (IT) teams such as the network, desktop, or IT surveillance teams, the users through the help desk, or even IT partners. After being analyzed by the IT teams, certain events may be flagged as cybersecurity incidents if the cause of the incident is related to information security (e.g., a breach of confidentiality or system unavailability due to malicious actions or data theft). To make this process operational, communicate a list of the different types of incident you want to track, with examples. Start small and increase the list over the years. These technical incidents must be dealt with in an appropriate incident management tool of the IT department in order to be efficient and to manage the large “industrial” scale of occurrences.

External Incident Identification

External incident sources are the secondary incident declaration channel by volume. Such incidents usually originate from coworkers, external partners, or law enforcement, who may contact the information security team to declare an incident. This is where you will probably encounter the most critical incidents, and you will probably need to store them internally in a separate tool to ensure confidentiality, as the usual internal IT incident management tools are accessible by hundreds of people.

Qualifying Incidents

A structured and formal qualification process must be put in place to ensure that an identified incident will be managed with the appropriate level of attention. Several criteria need to be agreed and used to evaluate incidents. These should include:

  • Sensitivity of the data or processes concerned (e.g., research and development [R&D] and data dealt with by VIPs, the Very Important People in the company such as Senior Management).
  • The functional perimeter (e.g., number of users or entities impacted).
  • The technical perimeter (e.g., number of workstations/servers impacted, partner’s systems).
  • The probable cause of the cybersecurity incident (e.g., malevolence, human error).

Following this qualification, the incident may be managed normally with predefined processes or it may trigger escalation to the crisis management process.
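The four criteria above can be turned into a simple qualification matrix. The sketch below is a minimal illustration under stated assumptions: the 1 to 3 scale, the additive score, and the `CRISIS_THRESHOLD` value are all hypothetical and would need to be agreed within each organization.

```python
from dataclasses import dataclass, astuple

# Hypothetical threshold on a 4-to-12 additive score; illustrative only.
CRISIS_THRESHOLD = 10


@dataclass
class Qualification:
    data_sensitivity: int      # 1 = public data, 3 = R&D or VIP data
    functional_perimeter: int  # 1 = one user, 3 = several entities
    technical_perimeter: int   # 1 = one workstation, 3 = many servers or partner systems
    probable_cause: int        # 1 = human error, 3 = malevolence

    def score(self) -> int:
        return sum(astuple(self))


def route(q: Qualification) -> str:
    """Decide between the predefined process and crisis escalation."""
    if any(value not in (1, 2, 3) for value in astuple(q)):
        raise ValueError("each criterion must be rated 1, 2, or 3")
    return "crisis_management" if q.score() >= CRISIS_THRESHOLD else "standard_process"


# One infected R&D workstation, apparently malicious: score 3 + 1 + 1 + 3 = 8
single_workstation = Qualification(3, 1, 1, 3)
# The same malware found on a large share of the lab: score 3 + 2 + 3 + 3 = 11
lab_wide = Qualification(3, 2, 3, 3)

print(route(single_workstation))
print(route(lab_wide))
```

An additive score is the simplest possible choice. Some organizations may prefer that a single criterion (for example, maximum data sensitivity combined with malevolence) triggers escalation on its own.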

Follow the Incident Management Policy and Process Steps

The incident management process starts once an incident is discovered and qualified. It follows several steps: identification, containment, remediation, and recovery. All information must be recorded according to a cybersecurity incident management policy, approved at the required level (must be at least CISO and CIO; should be CEO and/or board) and communicated to all concerned parties. Other “must-have” requirements are listed in Table 12.1.

Table 12.1 Cybersecurity Incident Must-Have Checklist

Requirement | Suggested Content
----------- | -----------------
Cybersecurity incident management policy (includes event and incident definition) | Adapted to organization context and explaining the difference between an event, an alert, an anomaly and an incident
Event and incident impact qualification matrix | A matrix with the different criteria to assess the event, decide if it is an incident and evaluate its criticality
Detailed processes | Roles and responsibilities on identification, containment, remediation, recovery and reporting (e.g., using a responsible, accountable, consulted, and informed [RACI] matrix); covering sources whether internal or external (with partners/law enforcement)
Incident response methodologies | “How to” on the most common security incidents (such as viruses, phishing, denial of service)
Incident management reporting | At entity and global level, linked with the ERM tool/applications
Incident repository and follow-up tools | Either through a specific tool/file or within the IT and/or ERM tool/applications

Integrating Incident Reporting with Enterprise-wide Risk Management (ERM)

To report properly on cybersecurity incidents, you need to create a global repository of such information that will be fed by both IT internal and external sources. Data fed from IT internal sources is often automated due to the number of events and the number of people reporting the data. The information security correspondent network is often in charge of declaring the incidents in a centralized tool within large organizations.

Be warned that it is often difficult to automatically consolidate incidents between organization entities because a single incident may have impacted several entities or have been declared and recorded separately under different names and dates. Once consolidated, these incidents may be summarized and imported into the incident repository coordinated by IT in collaboration with the ERM function and its ERM umbrella processes. The reporting ultimately has to be presented to the top management of the organization to report on threats and on the effectiveness and efficiency of the cybersecurity measures in place.
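The consolidation problem described above can be illustrated with a short sketch. It assumes that each declaration carries a shared technical `indicator` (for example a malware hash), which is a simplification: in practice such a common key is often missing, which is exactly why automatic consolidation is difficult. All records, names, and dates below are invented for the example.

```python
from collections import defaultdict
from datetime import date

# Invented declarations from three entities; two describe the same incident.
declarations = [
    {"entity": "R&D France", "name": "Strange virus",
     "declared": date(2024, 3, 4), "indicator": "hash-abc"},
    {"entity": "R&D Germany", "name": "Data exfiltration alert",
     "declared": date(2024, 3, 6), "indicator": "hash-abc"},
    {"entity": "Finance", "name": "Phishing email",
     "declared": date(2024, 3, 5), "indicator": "mail-xyz"},
]


def consolidate(records):
    """Group declarations sharing an indicator into one consolidated incident."""
    groups = defaultdict(list)
    for record in records:
        groups[record["indicator"]].append(record)
    return [
        {
            "indicator": indicator,
            "entities": sorted({r["entity"] for r in group}),
            "first_declared": min(r["declared"] for r in group),
            "local_names": [r["name"] for r in group],
        }
        for indicator, group in groups.items()
    ]


for incident in consolidate(declarations):
    print(incident["indicator"], incident["entities"], incident["first_declared"])
```

Keeping the local names alongside the consolidated record lets each entity still recognize its own declaration in the global report.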

Cybersecurity Crisis Management

A few days later, CEO Tom briefed his board, having received a combined briefing from CISO Maria and chief risk officer (CRO), Nathan, saying, “I’m here to update you on a cyber incident that, unfortunately, escalated into a crisis we had to manage. A cyber attack on our R&D function was detected that infected 30 percent of the R&D lab computers. The attackers were trying to steal our new product intellectual property. We successfully triggered the crisis management process and were able to cut off the attackers before too much was stolen. Due to that swift and efficient response, no communication was required to our stakeholders and regulators, and the financial impacts are limited.”

Going from Incident to Crisis Management

We have described so far how to manage standard security incidents. However, the crisis management process needs to be triggered by specific circumstances where the usual processes are unable to cope (such as large or multiple incidents occurring simultaneously).

Crisis Management Operating Principles

Cyber crisis management (CCM) is aligned with, and a subpart of, enterprise business continuity management. (For more on business continuity, see Chapter 13.) CCM aims to implement a set of specific organizational and technical measures to allow specially mobilized staff to deploy quickly, effectively, and efficiently during the crisis and respond to potentially unknown situations. CCM ultimately aims to contain impacts and resolve the crisis as quickly as possible.

CCM typically depends on a crisis decision-making unit (CDU) made up of representatives of the organization’s top management (e.g., executive committee, board of directors, CRO). This steering role by top management is necessary in order to:

  • Mobilize adequate resources urgently and set priorities.
  • Allow operations outside of usual processes.
  • Quickly validate measures that could impact business processes.
  • Manage external communications and crisis disclosure (if required by regulators/laws, if the crisis is directly visible by the general public or if it has been leaked to the press).
  • Maintain business continuity to the fullest extent possible in the face of a cyber incident. (See Chapter 13 for a complete discussion of business continuity management.)

The CDU is supported by one or more operational crisis units that are trained before any incident to carry out the CDU’s orders and keep the CDU informed of developments. These units typically include:

  • A human resources unit covering internal communication and contact with staff.
  • A corporate communications/public relations unit that prepares the various communications and manages interaction with the media and external stakeholders.
  • A legal unit or representative to log and process filed complaints and notify various external parties.
  • A risk function member to coordinate all functions.

Crisis management mechanisms must be documented and tested regularly prior to any crisis. Several aspects need to be covered. These include:

  • Human resource aspects such as identification of key people, decision-making mechanisms, and team rotation.
  • Logistics such as dedicated workspaces, crisis directory, standby telephones, catering.
  • Technical aspects such as defense and investigation capabilities, tools, and so on.

Such mechanisms do not exist today in full in most organizations (except some of the larger ones and in some sectors). These mechanisms are, however, a prerequisite to correctly manage a cybersecurity crisis and are increasingly asked after by boards and external stakeholders such as regulators, credit rating agencies, and insurers.

Structuring and Mobilizing an Operational Cybersecurity Crisis Unit

In the event of a crisis stemming from a cyber attack on the information system, an operational unit needs to be deployed, either as part of a usual information system operational unit or separately. Practical experience over recent years has shown that three teams need to be trained within this unit.

The Investigation Team

The investigation team’s objective is to identify when the attack started, the vulnerabilities exploited, and consequences of the attack (such as stolen documents or corrupted systems). It analyzes all available internal and external technical elements. It tries to identify the attack’s source and the extent of the information system’s compromise. The team is made up of digital investigation and forensics specialists focused on reacting quickly to information system crises. Its specialists are often externally sourced and embedded from companies that have a computer security incident response team (CSIRT) or a computer emergency response team (CERT). The targeted organization’s technical experts are also integrated into the team to provide an understanding of the context.

The Defense Team

The defense team prepares all the technical actions for repelling the attacker and correcting the vulnerabilities exploited during the attack. Its work often goes beyond the acute phase of the crisis in order to consolidate and correct the attacked system in depth and over time. It includes internal specialists with knowledge of the organization’s tools and systems combined with external experts with knowledge of the attacker’s methods to protect against any rebound attacks or secondary infections.

The Steering Team

The steering team creates the link between the investigation and defense teams. It also liaises with internal parties (particularly the CDU for decisions and the CRO/ERM function for enterprise support) and with external operational parties (such as law enforcement or government services, depending on context). The steering team gives a business sense to the technical information and provides key elements to prepare a response to the attack across all its dimensions. It passes on relevant information to internal and external communication teams and can also validate communications to ensure that the information is technically accurate and safe to disclose.

These teams work hand in hand. Investigation provides elements to defense, which then puts forward plans for steering to approve. Steering follows the various action plans, communicates with all the other concerned parties, and drives the work forward. It must also try to anticipate as far as possible the crisis’s next steps by identifying the most likely scenarios that could develop in relation to known attack cases.

The size of these teams may vary widely. A simple attack, such as the defacing of Web pages with rhetoric, can mobilize two to three people sharing the different roles. A more complex attack, bringing about, for example, loss of control of several systems and in particular the information system management infrastructure (such as the active directory), can mobilize tens of people internally and externally for several weeks. The resolution of a complex attack can take over three months and the costs can reach tens of millions of euros.

Tools and Techniques for Managing a Cyber Crisis

The crisis management teams need to have a number of tools and techniques at their disposal to manage the crisis efficiently. A first priority is a secure crisis management system (including mail, file exchange, workstations) that is independent from the attacked information system and administered differently, so that it can carry on in the event of a major compromise or destruction of the usual system.

  • Investigation accounts within technical systems need to be created in advance and deactivated until needed. These avoid having to wait to identify system owners to start off the technical investigations.
  • Forensic software tools to analyze suspect software are required for launching the software in a risk-free and highly monitored environment (such as confinement through sandboxing).
  • Digital forensic hardware (such as certified “bit-for-bit” hard disk copying solutions) suitable for legal analysis collection requirements is required.
  • Aggregator tool(s) that collect and centralize data logs and allow interrogation of records from different systems are required.
  • Threat intelligence tool(s) and techniques are needed to undertake a far-reaching indicator of compromise (IOC) search with sharing and acquisition capabilities (for technical traces of an attack, such as the IP addresses used or malware signatures). These enable rapid assessment of the scale of an attack and rapid exchange of information with peers.

Specialized Tool Constraints

At the time of printing, most organizations do not possess these tools, particularly for IOC search. As there are only a few “turnkey” solutions on the commercial market, interested cybersecurity teams are forced to build ad hoc solutions to respond to such needs. Some of the more advanced incident response service providers have made part of their toolkit available as open source solutions (for instance CERTitude from Wavestone, FIR from Societe Generale or FastIR from Sekoia).
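To give a sense of what an ad hoc IOC search involves, the sketch below scans centralized log records for known indicators. It is a deliberately simplified illustration and not the method used by any of the tools named above: the record fields (`host`, `dst_ip`, `file_sha256`), the host names, and the indicators themselves are hypothetical. The IP address comes from a range reserved for documentation and the hash is a placeholder.

```python
import ipaddress

# Hypothetical indicators of compromise: a documentation-range IP
# address and a placeholder SHA-256 value.
IOC_IPS = {ipaddress.ip_address("203.0.113.42")}
IOC_HASHES = {"0" * 64}


def search_iocs(log_records):
    """Return (host, reasons) for every record matching a known indicator."""
    hits = []
    for record in log_records:
        reasons = []
        dst = record.get("dst_ip")
        if dst and ipaddress.ip_address(dst) in IOC_IPS:
            reasons.append("ip")
        if (record.get("file_sha256") or "").lower() in IOC_HASHES:
            reasons.append("hash")
        if reasons:
            hits.append((record["host"], reasons))
    return hits


# Invented records, as they might come out of a log aggregator.
logs = [
    {"host": "rd-ws-07", "dst_ip": "203.0.113.42"},
    {"host": "rd-ws-09", "dst_ip": "198.51.100.10"},
    {"host": "rd-ws-12", "file_sha256": "0" * 64},
]

for host, reasons in search_iocs(logs):
    print(host, reasons)
```

Counting the distinct hosts in the result gives a first, rough estimate of the scale of the attack, and the indicator sets are the part that can be exchanged with peers.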

There are several research projects underway at present to define incident response and investigation methods. Understanding of the attacker’s actions over time is an essential part of large-scale cyber crisis management where multiple people are working simultaneously on the investigations. The Diamond Model by the U.S. Department of Defense and the Kill Chain method developed by Lockheed Martin researchers are also of interest.
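When several investigators work in parallel, a shared timeline tagged with Kill Chain phases is one way to keep a common picture of the attacker's actions. The sketch below uses the seven phase names of the Lockheed Martin Cyber Kill Chain; the observations, hosts, and timestamps are invented, and the record layout is an assumption made for the example.

```python
from datetime import datetime

# Phases of the Lockheed Martin Cyber Kill Chain, in order.
KILL_CHAIN = [
    "reconnaissance",
    "weaponization",
    "delivery",
    "exploitation",
    "installation",
    "command_and_control",
    "actions_on_objectives",
]


def build_timeline(observations):
    """Validate phases and return the observations in chronological order."""
    for obs in observations:
        if obs["phase"] not in KILL_CHAIN:
            raise ValueError(f"unknown phase: {obs['phase']}")
    return sorted(observations, key=lambda obs: obs["when"])


def furthest_phase(observations):
    """Return the most advanced phase the attacker is known to have reached."""
    return max((obs["phase"] for obs in observations), key=KILL_CHAIN.index)


# Invented findings reported by different investigators, in arrival order.
findings = [
    {"when": datetime(2024, 3, 4, 9, 30), "phase": "command_and_control",
     "note": "Workstation contacts unknown server"},
    {"when": datetime(2024, 3, 1, 14, 5), "phase": "delivery",
     "note": "Phishing email received by researcher"},
    {"when": datetime(2024, 3, 1, 14, 20), "phase": "exploitation",
     "note": "Malicious attachment opened"},
]

for obs in build_timeline(findings):
    print(obs["when"].isoformat(), obs["phase"], obs["note"])
print("Furthest phase reached:", furthest_phase(findings))
```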

Cyber Crisis Management Steps

Similar to general crisis management, the management of a full-scale cyber attack follows four steps:

  1. Alert and qualification
  2. Crisis handling (by carrying out an investigation and a defense plan)
  3. Execution and surveillance
  4. Crisis closure

The key difference for cyber over general crisis management lies in the cyber specificities, especially regarding how to stop the attack. This section details these specificities within the context of cyber crisis management steps and timings as visualized in Figure 12.1.

[Figure: chart of the four steps (alert and qualification, handling, execution and surveillance, closure), covering the crisis teams, investigations, the defense plan and its execution, and crisis closure.]

Figure 12.1 Cyber crisis management steps
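The four steps can be read as a small state machine. The sketch below is one possible interpretation, not a reproduction of Figure 12.1: the two backward transitions (from execution back to handling when the threat returns, and from closure back to handling when late findings remobilize the crisis unit) are assumptions drawn from the descriptions later in this chapter.

```python
from enum import Enum


class Step(Enum):
    ALERT_AND_QUALIFICATION = 1
    CRISIS_HANDLING = 2
    EXECUTION_AND_SURVEILLANCE = 3
    CRISIS_CLOSURE = 4


# Allowed transitions; the backward moves are an interpretation of the text.
ALLOWED = {
    Step.ALERT_AND_QUALIFICATION: {Step.CRISIS_HANDLING},
    Step.CRISIS_HANDLING: {Step.EXECUTION_AND_SURVEILLANCE},
    Step.EXECUTION_AND_SURVEILLANCE: {Step.CRISIS_HANDLING, Step.CRISIS_CLOSURE},
    Step.CRISIS_CLOSURE: {Step.CRISIS_HANDLING},
}


def advance(current: Step, target: Step) -> Step:
    """Move to the target step, refusing transitions the process does not allow."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target


step = Step.ALERT_AND_QUALIFICATION
for target in (Step.CRISIS_HANDLING, Step.EXECUTION_AND_SURVEILLANCE, Step.CRISIS_CLOSURE):
    step = advance(step, target)
    print(step.name)
```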

Alert and Qualification

A first incident, whether internal or reported from outside, is enough to trigger the alert. It has to be qualified by the security teams in order to identify its severity and to dispel any doubts. Qualification is based on system or data sensitivity, on the threat’s technical level (e.g., using standard or homemade malware), and on the risk of the incident spreading beyond the initial scope of discovery. If the first analyses show early signs of a well-prepared attack and the target’s sensitivity level is high, it is mandatory to trigger the crisis management mechanism using the predefined process.

Crisis Handling: Carrying Out the Investigation and Building a Defense Plan

Once the incident has been qualified, the teams in the cybersecurity crisis unit (i.e., investigation, defense, steering) will begin to investigate and prepare a defense plan.

Starting Investigations

The first team to mobilize is the investigation team. This team deploys the necessary technical means for the investigations. It must respect the principle of absolute discretion in its investigative actions to avoid revealing to the attacker that it has been discovered. The action generally takes several days to bear fruit, sometimes even several weeks in the case of large systems. Gray areas can last for a long time depending on the attacker’s ability to cover its tracks. In fact, it is often necessary to let the attacker operate freely for a few days in order to understand its modus operandi and correctly comprehend its objectives, its level of technical skill, and its tools. The services of bailiffs are often required to certify the collection of technical traces and to track actions, so that any legal proceedings can still be pursued. The investigation team progressively prepares an investigation report that sets out its understanding of the attack and its purpose. This report summarizes information about the attacker, the attack’s compromise and spread vectors, and the impacted perimeter. It can be used as a basis for legal action such as filing a complaint or notifying the authorities.

Building the Defense Plan

The defense team is mobilized next. Its first actions are to identify the scope of the emergency zone by listing the critical assets that must not be compromised under any circumstances, and launching immediate and unconditional actions to repel the attacker—even if their effect is partial and imperfect. This represents an emergency-button type of procedure.

The team’s next responsibility is to prepare the defense plan. This contains all the countermeasures needed to eradicate the attack on the impacted perimeters. An appropriate set of countermeasures is deployed all at once in order to prevent the attacker from returning quickly through a nonsecured route. Another set of organizational or technical measures may be positioned over time. These measures may include severing network links or Internet access, isolation of certain business entities, deployment of security patches or new software, changing passwords, and installation of new protective equipment.

The defense plan is dynamic and evolves depending on information from the investigation. At a minimum, it needs to specify the actions to be carried out in the short term and medium term, and ideally in the long term. It also has to identify the actors responsible for these actions, the impacts of their implementation, and finally to follow the execution timeline and progress of these actions once the escalation to crisis management is triggered.
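The minimum content of a defense plan described above maps naturally onto a small data structure. The sketch below is an illustration only: the class and field names are hypothetical, and the example actions and owners are invented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Horizon(Enum):
    SHORT = "short term"
    MEDIUM = "medium term"
    LONG = "long term"


@dataclass
class Action:
    description: str
    owner: str            # actor responsible for the action
    horizon: Horizon
    business_impact: str  # expected impact of implementing the action
    done: bool = False


@dataclass
class DefensePlan:
    actions: List[Action] = field(default_factory=list)

    def is_minimally_complete(self) -> bool:
        """Short-term and medium-term actions are the stated minimum."""
        horizons = {action.horizon for action in self.actions}
        return {Horizon.SHORT, Horizon.MEDIUM} <= horizons

    def progress(self) -> float:
        """Share of actions already executed, between 0.0 and 1.0."""
        if not self.actions:
            return 0.0
        return sum(action.done for action in self.actions) / len(self.actions)


# Invented example plan.
plan = DefensePlan([
    Action("Sever Internet access for the R&D lab", "Network team",
           Horizon.SHORT, "Lab loses external access", done=True),
    Action("Reset all administrator passwords", "Defense team",
           Horizon.MEDIUM, "Brief administration outage"),
])

print(plan.is_minimally_complete(), plan.progress())
```

Because the plan evolves with the investigation, a structure like this is easier to keep current than a static document, and the progress figure gives the steering team a simple indicator to report to the CDU.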

Preparing a defense plan can take from a few hours to several days, although draft defense plans can be developed during rehearsals and war games. The time needed depends on the complexity of affected systems, the number of business areas concerned, and the reliability of the information coming out of the investigations. The defense team communicates the most critical elements to the steering team, which arbitrates over the impacts and costs with the CDU.

Executing the Plan and Surveillance

The management decision by the CDU to execute the defense plan is certainly the most complex and critical one to make during an information system attack crisis. Executing it may signify a slowdown or even a halt to some organization services, complicating investigation and also revealing to the attacker that its attack has been discovered.

Except in case of emergencies (i.e., the emergency-button procedure), the plan is launched when the investigation team considers it has full or near-full visibility of the attack, and the defense plan is rated as feasible, effective, and efficient. During deployment, the investigation team uses heightened monitoring to check that the plan is running smoothly and working as intended. Launching the plan can also lead to the deployment of internal or external communication plans based on the visibility or reach of the actions.

Three scenarios are foreseeable from experience with past crises. These depend on the feedback from heightened monitoring. They are:

  1. The threat has been eradicated. The attacker no longer has access to the information system. The situation is back under control.
  2. The threat has returned. The attacker accesses the information system via a different modus operandi that was not previously observed or discovered during the investigations. It is therefore necessary to restart the investigation and defense processes, being aware that the attacker knows it has been discovered.
  3. The threat evolves. The attacker launches new actions, which could go as far as attempted mass destruction of the information system (e.g., the wiping of servers and all data) in vengeance or to hide the tracks of its actions.

These scenarios need to be anticipated in the defense plan, regardless of their likelihood ratings. If “mass destruction” begins, management must consider the drastic response of shutting down the organization’s entire information system.
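The three scenarios can be expressed as a decision over what heightened monitoring reports. The sketch below is a simplification under stated assumptions: the two boolean inputs and the wording of the responses are hypothetical, and a real decision would rest with the CDU rather than with a rule.

```python
from enum import Enum


class Scenario(Enum):
    ERADICATED = "threat eradicated"
    RETURNED = "threat returned"
    EVOLVED = "threat evolves"


# Responses paraphrased from the scenarios described in the text.
RESPONSES = {
    Scenario.ERADICATED: "start the return to normal with the business lines",
    Scenario.RETURNED: "restart investigation and defense; the attacker knows it is discovered",
    Scenario.EVOLVED: "escalate to management; consider a full information system shutdown",
}


def assess(new_access_path_seen: bool, destructive_actions_seen: bool) -> Scenario:
    """Map monitoring feedback to one of the three scenarios."""
    if destructive_actions_seen:
        return Scenario.EVOLVED   # the most severe case takes priority
    if new_access_path_seen:
        return Scenario.RETURNED
    return Scenario.ERADICATED


scenario = assess(new_access_path_seen=True, destructive_actions_seen=False)
print(scenario.value, "->", RESPONSES[scenario])
```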

If the defense plan has been carried out successfully, it is necessary to start a return to normal. The reopening of services interrupted or impaired during crisis is organized in coordination with the business lines. This reopening can begin only if the services have been restored to a secure state to prevent the attack recurring.

Crisis Closure

The crisis unit may be stood down on three conditions: the defense plan has been executed, the systems are back up and running, and there is no indication of an upsurge or recurrence of the attack. This decision must balance speed of normalization with alertness to the return of the attack and threat. Monitoring actions need to carry on in the long term to be capable of identifying any comeback.

One lesson from past attacks is that certain investigative actions bear results only after several days or even weeks. What is discovered then can lead to remobilizing the recently dismantled crisis units. In addition, once the attacker has been discovered or driven away, it may deliberately lie low in order to come back stronger later on.

A special remediation project integrating security from the outset is required to drive remediation. This depends on the degree of reconstruction required on the affected information systems. An enterprise debriefing phase, often led by the ERM function or CRO, is also necessary in order to identify all the lessons learned from the crisis.

Conclusion

The following cyber risk management statement represents the organizational capabilities that the CEO and board expect to be demonstrated in terms of incident and crisis management.

About CLUSIF

CLUSIF is the largest association of professionals in France dedicated to information security. It brings together users and providers from all industry branches. Its main goal is to facilitate the exchange of know-how and competences towards an efficient information security management.

About Gérôme Billois, CISA, CISSP and ISO27001 Certified

Gérôme is a board member of the cybersecurity practice of the consultancy Wavestone and an administrator of CLUSIF. Since 2001, he has led projects within multinational companies to tackle cybersecurity challenges, including cybercrime fighting, strategy, and governance definition. He created CERT-Wavestone in 2013 and took part in several large cybersecurity crises driving the investigation and defense teams and dealing with issues at board level. He is currently focused on defining a new security model to embrace digital transformation while protecting valuable assets. He graduated from the engineering school at INSA de Lyon, France. He is a regular international conference presenter and media spokesperson and a co-author of “Cyber Security of Industrial Control Systems: How to get started?” © CEPADUES 2014 and “Security and Personal Data Breach” © LARCIER 2016.

About Wavestone

Wavestone is a consulting firm created from the merger of Solucom and Kurt Salmon’s European business (excluding retail and consumer goods outside of France). Wavestone’s mission is to enlighten and guide its clients in their most critical decisions, drawing on functional, sectoral, and technological expertise.

With 2,500 employees across four continents, the firm is counted among the leading players in European independent consulting, and number one in France. Wavestone holds one of the largest cybersecurity and digital trust practices in EMEA (Europe, Middle East, and Africa) with more than 400 consultants.
