The risk treatment requirements obtained during an information risk management programme should never be actioned without having the full approval and support of senior management (often at board level), so that the correct levels of funding and people resources can be allocated without causing problems for the organisation.
One of the main avenues for establishing the support of senior management is the preparation and presentation of business cases, which will set out the risks in terms the business will understand and make clear the costs of remediation.
Following the approvals to proceed, the process of detailed decision-making, planning and implementation may begin. There may also be a spin-off in the form of a business continuity exercise, which might include disaster recovery (DR) planning.
The process of communicating within the information risk management programme is extremely important, and serves a number of purposes. It allows the information risk management programme manager to:
It is often said that senior executives will never understand information risk, but this is not entirely correct. They may not understand the technicalities of information security, but business risk is something they will definitely understand, so the streetwise information risk management programme manager will ensure that all reporting is couched in terms of risk to the organisation and the business benefits to be gained by avoiding, transferring, reducing or accepting it.
Business cases are a standard vehicle for demonstrating a genuine need to carry out some form of activity that will require senior management approval. They are generally used in those circumstances in which a significant financial spend is proposed (beyond that of day-to-day budgets), and in the case of an information risk management programme will most frequently be brought into play in gaining approval to carry out risk reduction or modification, although some aspects of risk transfer or sharing will also require board-level agreement.
In some situations, the business case might present senior management with a clear and simple ‘yes or no’ decision, while others might involve a number of options with a recommendation for a specific approach that, in the view of the information risk manager, represents the most appropriate solution combined with good value for money. In the latter case, the senior management team will be required to choose their preferred option, and the contents of the business case will heavily influence this choice.
It follows, therefore, that the business case should be as comprehensive and compelling as possible, so that senior management’s decision is both straightforward and fully informed.
There is no generic set format for a business case. Some organisations have their own template, whereas others allow a free format of presentation. In this section, we suggest some of the essential components of the business case and describe how best to present it.
Many people will be familiar with the name of Robert Maxwell, who ran a vast publishing business empire that included newspapers such as the New York Daily News, The Daily Mirror, The Scottish Daily Record and The European. Whatever his faults, he adopted a very simple approach to business cases: he relied on his senior management team to pull together the best advice and to present this to him in as short a time as possible.
When it came to receiving his formal approval, a single sheet of A4 was all he needed to read, written in a 14-point Courier typeface, with 1.5 line spacing, and a signature line near the bottom of the page followed by the words ‘Approved. R Maxwell, Chairman’. Supporting information was always stapled behind this, but he rarely studied it.
Most senior executives do not have the time to read large amounts of detail and, since information security is not usually their strongest point, they might find it difficult to follow. What they do need are clear, concise facts: the issue, the proposed solution, the costs, the benefits to be gained and, if necessary, the downsides of not choosing the recommended option.
It is suggested that a business case document should contain the following sections:
Many organisations prefer a personal briefing as well as a business case document, in which case a slide presentation – probably no more than 10 slides – should be prepared and delivered by a programme representative who feels comfortable presenting to very senior managers and who can also answer penetrating questions without the need to refer to detailed notes.
Whichever approach is taken, the person presenting the business case would be well advised to socialise the business case beforehand with as many members of the approving committee as possible, so that it is approved ‘on the nod’. This approach has another advantage, in that many of the questions that might be asked during a presentation will either be known or answered beforehand, and any last-minute changes to the business case that will assist in gaining approval can be included.
RISK TREATMENT DECISION-MAKING
The decision-making process for risk treatment follows a logical path. It begins by identifying the strategic option or options that the organisation should take – risk avoidance or termination; risk transfer or sharing; risk reduction or modification; and risk acceptance or tolerance – and this part of the process will have been taken care of during the final stage of risk assessment, risk evaluation.
The next step for each of the chosen strategic approaches is to identify the tactical options. These will depend completely on the strategic approaches, but will be as follows:
Having identified the tactical risk treatment options, the final stage is to identify the operational options:
RISK TREATMENT PLANNING AND IMPLEMENTATION
It is quite conceivable that many of the risks requiring treatment as part of the information risk management programme can undergo treatment as an integral part of the programme. However, some risks might require extensive (or expensive) treatment, and as such may need to be treated as a project or programme of work in their own right.
Although the implementation may be carried out under a separate project or programme, progress reporting of the implementation should remain part of the original information risk management programme so that the audit trail is complete.
Such a project requires the setting of goals, objectives, scope and milestones, which, given the controls recommended and agreed earlier in the information risk management programme, should be relatively straightforward to define.
The risk treatment plan should commence with the production of a prioritised list of risks for treatment, which includes realistic estimates of the length of time these might take to achieve, the approximate cost of the treatment and the resources required (including the name of the responsible person) for doing so. By totalling the number of completed risk treatments and the running costs, additional information can be reported to senior management.
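The plan described above can be sketched as a small data structure. This is a minimal illustration only: the field names, the priority scale (1 = treat first) and the sample risks, owners and figures are all hypothetical and are not taken from any standard or from the text.

```python
from dataclasses import dataclass


@dataclass
class RiskTreatment:
    risk: str               # short description of the risk being treated
    priority: int           # hypothetical scale: 1 = treat first
    est_days: int           # realistic estimate of the time required
    est_cost: float         # approximate cost of the treatment
    owner: str              # name of the responsible person
    completed: bool = False
    spent: float = 0.0      # running cost to date


def prioritised(plan):
    """Return the treatments ordered by priority, most urgent first."""
    return sorted(plan, key=lambda t: t.priority)


def progress_summary(plan):
    """Totals that could be reported to senior management."""
    done = sum(1 for t in plan if t.completed)
    return {
        "completed": done,
        "outstanding": len(plan) - done,
        "running_cost": sum(t.spent for t in plan),
        "estimated_cost": sum(t.est_cost for t in plan),
    }


# Illustrative data only.
plan = [
    RiskTreatment("Unpatched web server", 2, 10, 5000.0, "A. Owner",
                  completed=True, spent=4200.0),
    RiskTreatment("No offsite backups", 1, 30, 20000.0, "B. Owner",
                  spent=7500.0),
    RiskTreatment("Weak password policy", 3, 5, 1000.0, "C. Owner"),
]

summary = progress_summary(plan)
for t in prioritised(plan):
    print(t.priority, t.risk, t.owner, "done" if t.completed else "open")
print(summary)
```

Keeping the running cost alongside the estimate for each treatment makes it simple to report both progress and spend from the same list.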
Regardless of whether the project is to be managed from within or outside the main information risk management programme, resources, especially people and funding, must have been agreed and committed by the organisation. This will include a suitably qualified project manager, who may be a different person from the information risk management programme manager, particularly if the project is significant in its scope. For example, if the agreed control is the provision of an entire backup data centre with high-availability standby systems, this would be a major project in its own right, and would certainly require at least one dedicated project manager, if not several.
However, even if the remedial work to implement the agreed controls is relatively minor, each individual control should be considered as a task within an overall project, so that it can have resources assigned to it and be tracked to completion and sign-off.
BUSINESS CONTINUITY AND DISASTER RECOVERY
Occasionally, the controls recommended may be very wide-ranging, such as the need for business continuity management (BCM) and DR arrangements, which are specialist subject areas in their own right. However, it is worth providing a brief description of both approaches.
Business continuity
The concept of BC became better known in 2006 with the introduction of the first full standard, BS 25999-1, the Code of Practice, and then its Specification, BS 25999-2, in 2007. Prior to that, there had only ever been a publicly available specification, PAS 56, published in 2003 and developed from an early Business Continuity Institute (BCI) Good Practice Guidelines document.
The two BS 25999 standards were superseded in 2012, and the international standard ISO 22301:2019 – Security and resilience – Business continuity management systems – Requirements now applies instead.
BC is defined as ‘The capability of the organisation to continue delivery of products and services at acceptable predefined levels following a disruptive incident’ (ISO 22301:2019).
BC applies to a number of key areas within an organisation, and so is considered to be a holistic approach to risk management. It includes:
At first sight, it would appear that information is just one of these areas, but it actually cuts across all of the remainder, and hence the principles explored in information risk management are fundamental to the discipline of BCM.
The Business Continuity Institute Good Practice Guidelines 2018
Founded in 1994, the BCI has always been at the forefront of business continuity standards development, and was instrumental in developing the first UK specification, PAS 56, published in 2003. Its members have subsequently taken a leading role in the later development of BS 25999 in 2006/7 and ISO 22301 in 2012 and beyond.
Over the years, the Institute has developed a set of good practice guidelines (GPGs) that define the generic approach to BCM in six distinct stages, or so-called Professional Practices (PPs):
PP1 Policy and Programme Management – this is the beginning of the overall BCM life cycle, and defines the organisation’s policy for BC: how it will be implemented, managed and tested.
PP2 Embedding – it is important that the culture of BCM is embedded into day-to-day operations within an organisation.
PP3 Analysis – known in earlier versions of the Good Practice Guidelines as Understanding the Organisation, this area assesses the organisation’s overall objectives, how it functions and the internal and external context within which it operates. It includes the risk assessment process of risk management.
PP4 Design – formerly known as Determining Business Continuity Strategy, this area recommends suitable approaches (both strategic and tactical) to recover from disruptive events and to provide continuity of operations.
PP5 Implementation – this area was previously known as Determining and Implementing a BCM Response, and carries out the recommended and agreed approaches through the development of business continuity plans (BCPs). Together with Design, this area aligns with the risk treatment portion of risk management.
PP6 Validation – validation was originally referred to as Exercising, Maintaining and Reviewing, and deals with the validation of BC plans through tests and exercises to ensure that they are fit for purpose and would be effective in disruptive situations.
Professional Practices 1 and 2 are described as management practices, whereas Professional Practices 3 to 6 are described as technical practices. The BCI’s life cycle diagram illustrates this graphically in Figure 8.1.¹
These are by no means mandatory requirements, but most BC practitioners – and not only in the UK – will follow them, since they provide considerable assistance when an organisation wishes to become compliant with the standard and to achieve accreditation against it.
Business continuity plans
BCPs produced will normally include:
Although BC itself is generally thought of as being a form of risk reduction or modification, a BC programme of work may well make use of all forms of strategic, tactical and operational controls in order to achieve its objectives. Figure 8.2 illustrates the generic BC incident timeline.
Once IM, BCM, DR and BR plans have been developed, they must be tested in order to prove their fitness for purpose.
BC introduces some terminology that is not generally used in information risk management. However, when implementing a BC strategy as part of the treatment process for an information risk management programme, it is worthwhile being aware of these terms:
Recovery point objective (RPO). The point to which information used by an activity must be restored to enable the activity to operate on resumption.
Recovery time objective (RTO). The period of time following an incident within which products, services or activities must be resumed or resources must be recovered.
Maximum acceptable outage (MAO). The time it would take for adverse impacts, which might arise as a result of not providing a product/service or performing an activity, to become unacceptable.
Maximum tolerable data loss (MTDL). The maximum loss of information (electronic and other data) that an organisation can tolerate. The age of the data could make operational recovery impossible, or the value of the lost data is so substantial as to put business viability at risk.
Maximum tolerable period of disruption (MTPD). The time it would take for adverse impacts, which might arise as a result of not providing a product/service or performing an activity, to become unacceptable.
Minimum business continuity objective (MBCO). The minimum level of services and/or products that is acceptable to the organisation to achieve its business objectives during a disruption.
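The way these objectives constrain practical arrangements can be sketched as a simple consistency check. This is an illustration under stated assumptions, not a method prescribed by ISO 22301: it assumes that the worst-case data loss with periodic backups is roughly one backup interval, and that the RTO should not exceed the MTPD. The function and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class ContinuityTargets:
    rpo: timedelta    # recovery point objective
    rto: timedelta    # recovery time objective
    mtpd: timedelta   # maximum tolerable period of disruption


def check_arrangements(targets, backup_interval, estimated_recovery):
    """Return a list of mismatches between targets and arrangements."""
    issues = []
    # Assumption: the RTO is only meaningful if it falls within the MTPD.
    if targets.rto > targets.mtpd:
        issues.append("RTO exceeds MTPD")
    # Assumption: worst-case data loss is about one backup interval.
    if backup_interval > targets.rpo:
        issues.append("backup interval exceeds RPO")
    if estimated_recovery > targets.rto:
        issues.append("estimated recovery time exceeds RTO")
    return issues


# Illustrative figures only.
targets = ContinuityTargets(
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
    mtpd=timedelta(hours=24),
)
nightly = check_arrangements(targets, timedelta(hours=24), timedelta(hours=6))
hourly = check_arrangements(targets, timedelta(hours=1), timedelta(hours=6))
print(nightly)
print(hourly)
```

In this example, nightly backups cannot satisfy a four-hour RPO, whereas hourly backups can; the six-hour recovery estimate sits within the eight-hour RTO in both cases.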
Various types of test may be undertaken:
BC is invariably conducted as a separate programme of work from that of information risk management, since it may have much wider implications for the organisation, especially in terms of the resources required to operate the programme and to exercise the plans.
Disaster recovery
DR is a specialised subset of BC, and is generally used to refer to the arrangements put in place to provide backup or recovery computing facilities, although it can refer to other forms of technical processing. In our Glossary of Terms, we describe disaster recovery as ‘A coordinated activity to enable the recovery of ICT [information and communications technology] systems and networks due to a disruption’.
Some organisations make use of system hardware normally used for software application testing purposes to provide DR: sometimes on a one-for-one basis, so that the standby hardware is identical to the system being replicated, sometimes on a one-to-many basis, where one backup system can be used to provide DR for a number of live systems. Figure 8.3 illustrates the overall structure of DR operations:
Figure 8.3 Overall structure for disaster recovery
Platform DR generally involves the use of one or more of the three following types of facility.
Cold standby platforms These consist of bare computer systems and associated communications equipment. They may have an operating system loaded, but little else. The organisation or its outsourced DR partner will be responsible for loading any additional operating systems and applications software required in order to operate the system in the same way as the one it is replicating. In addition, all data must be restored from backup media, and the organisation will need to take into account any patches or software updates that have been issued.
Because these systems are very basic, they represent the lowest cost to an organisation, and take the longest amount of time to bring up to full operation.
Warm standby platforms Warm standby systems invariably have their full operating system and key applications loaded, and may have some backed-up data loaded as well. However, unless the system has been maintained in a fully ‘ready’ state, the organisation will need to take into account any patches or software updates that have been issued since the system was originally configured. Data will have to be brought fully up to date by restoring from the most recent backups.
Warm standby systems are more expensive to provide than cold standby systems, and can normally be brought into service much more quickly.
Hot standby/high-availability platforms At the top end of the DR range, there are hot standby or high-availability platforms, which are always maintained in a fully ready state from the point of view of operating systems and application software. Data will also be fully up to date, since the system being replicated will copy across all data onto the standby system.
These vary in type and cost, as can be seen from Figure 8.4. Availability is measured in ‘nines’, with ‘five nines’ (that is, 99.999% availability) generally regarded as the highest level, which allows for only about five minutes’ downtime in any 12-month period. Unsurprisingly, higher availability comes with a greatly increased cost, and each ‘nine’ added would probably increase the cost tenfold.
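The arithmetic behind the ‘nines’ can be checked with a short calculation. The sketch below assumes a 365-day year and treats the ‘tenfold per extra nine’ figure as the rough rule of thumb given in the text rather than as a pricing model; the function names and the two-nines cost baseline are illustrative choices.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(nines: int) -> float:
    """Downtime per year permitted by an availability of `nines` nines."""
    unavailability = 10.0 ** -nines   # e.g. five nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


def relative_cost(nines: int, baseline_nines: int = 2) -> int:
    """Cost multiplier under the rough 'tenfold per extra nine' rule."""
    return 10 ** (nines - baseline_nines)


for n in range(2, 6):
    print(f"{n} nines: {allowed_downtime_minutes(n):8.2f} min/year, "
          f"about {relative_cost(n)}x the cost of 2 nines")
```

Five nines works out at roughly 5.26 minutes of downtime a year, which is where the ‘five minutes’ figure comes from; under the tenfold rule it would cost in the region of a thousand times as much as two nines.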
In cases where two systems operate jointly to deliver the service, data are copied between the live and the standby system in one of two ways:
Synchronous replication is slightly slower than asynchronous replication, but has greater reliability, since no data can be lost at the point of switchover. The distance between the live and standby locations cannot currently be greater than around 200 km (125 miles), and the connection is typically a direct fibre-optic link, which guarantees capacity as well as reliability.
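One commonly cited reason for a distance limit of this kind is propagation delay, although the text itself does not give the reason. The sketch below assumes that each synchronous write waits for an acknowledgement from the standby site, and that light travels through optical fibre at roughly 200,000 km/s (about 5 microseconds per kilometre). Both are assumptions made for illustration, and the calculation ignores equipment and protocol overheads.

```python
# Assumption: roughly 5 microseconds of one-way delay per km of fibre
# (light at about 200,000 km/s); this figure is not taken from the text.
MICROSECONDS_PER_KM = 5


def round_trip_delay_us(distance_km: int) -> int:
    """Propagation delay for a write to reach the standby and be acknowledged."""
    return 2 * distance_km * MICROSECONDS_PER_KM


def max_serial_writes_per_second(distance_km: int) -> float:
    """Upper bound when each write must be acknowledged before the next starts."""
    return 1_000_000 / round_trip_delay_us(distance_km)


for km in (10, 50, 200):
    print(f"{km:>4} km: {round_trip_delay_us(km):>5} microseconds round trip, "
          f"at most {max_serial_writes_per_second(km):,.0f} serial writes/s")
```

Under these assumptions, 200 km adds about 2 milliseconds to every acknowledged write, which caps strictly serial writes at around 500 per second before any other delay is counted.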
Figure 8.4 Cost versus availability
High-availability systems are by far the most costly to operate, but for organisations such as banks, large online retail organisations, airlines and the like, failure of service and possible loss of information is simply not an option.
In conjunction with platform DR, organisations should take into account four key areas:
Systems and service monitoring is always required, so that remedial action can be taken as soon as there is a failure in any part of the service being provided. In larger organisations, the internal and external data networks are usually monitored in addition to platforms and services.
Data resilience
Although data storage has moved on considerably in recent years, magnetic media of one kind or another remain the most cost-effective technology, although they are rarely the fastest. There are several commonly used methods of providing resilient data storage:
Cloud-based storage has become so low in cost for a given volume of data stored that it is now very popular at the individual consumer level as well as that of larger organisations.
Application resilience
We are all used to applications on a home or office computer failing occasionally, but such failures usually affect only that computer’s user. At a corporate level, application failures will affect many users, and those that provide a service for online use (for example online banking applications) can affect very large numbers of people if they fail. For this reason, application resilience is key to such services, and can be delivered in one of three ways:
Site recovery
Organisations may choose to replicate a complete equipment site in a physically separate location, usually around 30 miles (48 km) away. This option is invariably expensive, both to provision and to maintain, but offers organisations the opportunity to provide a fully resilient service, not only to the organisation itself but also to its customers and suppliers.
In practice, organisations that make use of site recovery either do so through the agencies of a third party or will make use of space in other offices, warehouses, factories and the like. This is frequently the case for those organisations that operate a large data centre or telecommunications hub, in which site recovery for one location can be relatively straightforward to provide in another site having similar infrastructure and environmental facilities.
DISASTER RECOVERY FAILOVER TESTING
The testing of DR plans generally follows one of two paths:
Again, depending on the type of standby system implemented, a full switchover test might be disruptive, but it is the only way in which the organisation can be completely certain that its DR arrangements are fully working. However, if the standby system has been correctly implemented, everything should fail over without interruption.
SUMMARY
In this chapter, we have examined the need for and the importance of business cases in gaining support from senior management for the information risk management programme, together with the processes of risk treatment decision-making, planning and implementation. Finally, we have looked at the requirements for business continuity planning, based on the BCI’s approach, and, where applicable, various types of solution to disaster recovery planning and testing.
We will look next at communicating, monitoring and reviewing activities.
1 The BCI life cycle diagram is included courtesy of the BCI.