Domain 5

Technology Related Business Continuity Planning (BCP) & Disaster Recovery Planning (DRP)

“RECENT WORLD EVENTS have challenged us to prepare to manage previously unthinkable situations that may threaten the organization’s future.

The new challenge goes beyond the mere emergency response plan or disaster management activities that we previously employed. Organizations must now engage in a comprehensive process best described generically as Business Continuity.

Today’s threats require the creation of an on-going, interactive process that serves to assure the continuation of an organization’s core activities before, during, and most importantly, after a major crisis event …” (ASIS 2005) [1]

Business continuity planning provides a quick and smooth restoration of operations after a disruptive event. Business continuity planning is a major component of risk management. Business continuity planning includes business impact analysis, business continuity plan (BCP) development, testing, awareness, training, and maintenance.

A business continuity plan addresses actions to be taken before, during, and after a disaster. A BCP spells out in detail the what, who, how, and when of the plans required to be followed in the event of a disaster striking the business in order to ensure that all required resources and efforts are directed towards one common goal. It requires a continuing investment of time and resources. Interruptions to business functions can result from major natural disasters such as tornadoes, floods, and fires, or from man-made disasters such as terrorist attacks. The most frequent disruptions are less sensational — equipment failures, theft, or employee sabotage. The definition of a disaster, then, is any incident that causes an extended disruption of business functions.

Business continuity planning is the process whereby institutions ensure the maintenance or recovery of operations, including services to customers, when confronted with adverse events such as natural disasters, technological failures, human error, or terrorism. The objectives of a business continuity plan (BCP) are to minimize financial loss to the institution; continue to serve customers; and mitigate the negative effects disruptions can have on an institution’s strategic plans, reputation, operations, liquidity, credit quality, market position, and ability to remain in compliance with applicable laws and regulations. Changing business processes (internally to the institution and externally among interdependent services companies) and new threat scenarios require institutions to maintain updated and viable BCPs.

Business Continuity Planning is a comprehensive and often complex discipline that delves deeply into the business as a whole. Many IT departments provide a support function to the business overall, while many businesses also have separate departmental systems for specific business functions. Both the overarching IT system functions, such as e-mail and telephony, and department-specific systems, such as employee badge creation or benefits administration, must be assessed equally when creating a Business Continuity Plan / Disaster Recovery Plan solution for the business.

Business continuity planning (BCP) and disaster recovery planning (DRP) involve the identification of adverse events that could threaten the ability of the organization to continue normal operations. Once these events are identified, the security architect will implement countermeasures to reduce the risk of such incidents occurring. Furthermore, the security architect will play a key role in designing and developing business continuity plans that will meet the operational business requirements of the organization through planning for the provisioning of appropriate solutions. Key areas of knowledge include:

  1. Evaluating recovery requirements and strategy

  2. Designing and developing business continuity plans

  3. Assessing the business continuity plan and disaster recovery plan

TOPICS

-   Incorporate Business Impact Analysis (BIA)

    -   Legal

    -   Financial

    -   Stakeholders

-   Determine Security Strategies for Availability and Recovery

    -   Identify solutions

        -   Cold

        -   Warm

        -   Hot

        -   Insource

        -   Outsource

    -   Define processing agreement requirements

        -   Reciprocal

        -   Mutual

        -   Cloud

        -   Outsourcing

        -   Virtualization

    -   Establish recovery time objectives and recovery point objectives

-   Design Continuity and Recovery Solution

    -   High availability, failover and resiliency

        -   Communication path diversity

        -   Paired deployment

        -   Pass-through network interfaces

        -   Application

    -   Availability of service provider/supplier support

        -   Cloud

        -   SLAs

-   BCP/DRP Architecture Validation

    -   Test Scenarios

    -   Requirements Traceability Matrix

    -   Trade-Off Matrices

OBJECTIVES

The security architect needs to understand the business continuity and disaster recovery domain to ensure development of plans for the protection and recovery of the critical IT infrastructure. This Domain focuses on the technology recovery strategies in support of the overall business continuity program.

-   BCP is defined as:

    -   Preparation that facilitates the rapid recovery of mission-critical business operations

    -   The reduction of the impact of a disaster

    -   The continuation of critical business functions

-   DRP is defined as:

    -   A subset of BCP that emphasizes the procedures for emergency response relating to the information infrastructure of the organization

-   DRP includes:

    -   Extended backup operations

    -   Post-disaster recovery for data center, network, and computer resources

A business continuity plan is the tool that results from the planning and is the basis for continued life-cycle development [Waxvik 2007]. Continuity planning is a significant management issue and should include all parts or functions of the organization. Together, BCP and DRP ensure adequate preparations and procedures for the continuation of all business functions.

Planning Phases and Deliverables

Whether developing a new plan or updating an older one, the following planning phases are recommended:

  1. Identify the planning team and critical staff.

  2. Validate vital records.

  3. Conduct risk and business impact analyses.

  4. Develop recovery strategy.

  5. Select alternate sites.

  6. Document the plan.

  7. Test, maintain, and update the plan.

Planning Team and Critical Staff - The security architect will need to identify and build contact lists for the planning team, leadership, and critical staff. The deliverable from this phase is the Emergency Notification List (ENL).

Vital Records - This phase validates that all the records needed to rebuild the business are stored off-site in a secure location that will be accessible following a disaster. This includes technology backups as well as paper records. The deliverable from this phase is a list of the vital records, where they are stored off-site, how to retrieve them, and who is authorized to retrieve them.

Risk Analysis and Business Impact Analysis - This is where decisions are made about what risks will be mitigated and which processes will be recovered and when. The deliverable from this phase of the planning is a list of risks by site and recommendations to be implemented to reduce the impact of the risk. The security architect has a responsibility to help guide the organization to focus on those risks that will be addressed through planning at some level, and to clearly identify which risks will not be mitigated because of the costs involved in doing so.

Strategy Development - In this phase, the security architect will review the different types of strategies for the recovery of business areas and technology based on the recovery time frame(s) that have been identified for each, do a cost–benefit analysis on the viable strategies to be selected, and make a proposal to leadership to implement the selected strategies. The deliverable from this phase is the recommended strategies for recovery.

Alternate Site Selection and Implementation - During this phase, the security architect selects and builds out the alternate sites to be used in the event of a recovery. The deliverable from this phase is a functional alternate site [2].

Documenting the Plan - This phase is where all the information collected up to this point is combined into a formal plan document. The deliverable from this phase is the documented recovery plan for each site.

Testing, Maintenance, and Update - During the final phase, the security architect will validate the implemented recovery strategies through testing, establish a maintenance schedule for the plan, and set an update schedule for the plan documentation. The deliverable from this phase is the ongoing results from validating the plan.

Risk Analysis

As part of the planning process, the security architect will need to perform a risk analysis or assessment to determine the threats to which the organization is vulnerable and where mitigating investments should be applied to attempt to reduce the impact of a threat. To do this, security architects need to determine the risks faced by the organization. This includes risk from natural hazards, industry risks, and environmental risks.

The security architect needs to look at natural hazard risks based on the location of the organization; industry risks based on the organization’s type of business or mission; crime risks based on the geographic location(s) of the organization; human-made hazards such as transportation accidents based on proximity to highways, train lines, airports, etc.; proximity risks based on other industries near where the organization is located such as chemical plants, and natural gas storage facilities; and then recommend mitigating strategies to protect the business where appropriate. Once the risk analysis is completed, the security architect will need to document the findings. A sample risk model from the US National Institute of Standards and Technology (NIST) is below [3]:

[Figure: sample NIST risk model]

Documentation of the risk analysis is a critical success factor for risk mitigation. If the risk analysis is not clearly presented to the business, so that key decision makers understand which risks are likely to occur and which are not, then risk mitigation cannot take place. The output of the risk analysis will require the security architect to create a matrix representing the risks found and the likelihood of their occurrence. A sample risk analysis matrix can be seen in Figure 5.1.

Figure 5.1 - Risk Analysis matrix

The security architect will also need to create a description of the rankings used in the risk analysis matrix to help explain what each rating means. An example of a risk level explanation can be seen in Table 5.1.

In addition, the security architect will want to document any specific terms that are used to describe state or activity within the risk analysis being presented to the business. For instance, in the risk level explanation for the risk analysis matrix presented in Table 5.1, the terms Confidentiality, Integrity, and Availability are used with regard to “loss”, but are not clearly defined for the reader. To ensure that there is no confusion on the part of anyone examining the risk analysis documentation, the security architect would also want to provide a definition table of terms for the reader, as shown in Table 5.2.

| Risk Level | Risk Description |
|---|---|
| Extreme | The loss of confidentiality, integrity, or availability could be expected to have a catastrophic adverse effect on organizational operations, organizational assets or individuals. |
| High | The loss of confidentiality, integrity, or availability could be expected to have a severe adverse effect on organizational operations, organizational assets or individuals. |
| Medium | The loss of confidentiality, integrity, or availability could be expected to have a serious adverse effect on organizational operations, organizational assets or individuals. |
| Low | The loss of confidentiality, integrity, or availability could be expected to have a limited adverse effect on organizational operations, organizational assets or individuals. |

Table 5.1 - Risk Level explanation for a risk analysis matrix

| Security Objective | Low | Medium | High | Extreme |
|---|---|---|---|---|
| Confidentiality: Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information. [44 USC, SEC. 3542] | The unauthorized disclosure of information could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals. | The unauthorized disclosure of information could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals. | The unauthorized disclosure of information could be expected to have a severe adverse effect on organizational operations, organizational assets, or individuals. | The unauthorized disclosure of information could be expected to have a catastrophic adverse effect on organizational operations, organizational assets, or individuals. |
| Integrity: Guarding against improper information modification or destruction, and includes ensuring information non-repudiation and authenticity. [44 USC, SEC. 3542] | The modification or destruction of information could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals. | The modification or destruction of information could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals. | The modification or destruction of information could be expected to have a severe adverse effect on organizational operations, organizational assets, or individuals. | The modification or destruction of information could be expected to have a catastrophic adverse effect on organizational operations, organizational assets, or individuals. |
| Availability: Ensuring timely and reliable access to and use of information. [44 USC, SEC. 3542] | The disruption of access to or use of information or an information system could be expected to have a limited adverse effect on organizational operations, organizational assets, or individuals. | The disruption of access to or use of information or an information system could be expected to have a serious adverse effect on organizational operations, organizational assets, or individuals. | The disruption of access to or use of information or an information system could be expected to have a severe adverse effect on organizational operations, organizational assets, or individuals. | The disruption of access to or use of information or an information system could be expected to have a catastrophic adverse effect on organizational operations, organizational assets, or individuals. |

Table 5.2 - Confidentiality, Integrity, and Availability Defined [4]

Finally, the security architect will also need to provide a risk level definition along with all other items already discussed in order to ensure that the business is able to clearly understand the magnitude of the risks being discussed, and the necessary level of response required to mitigate the risk, if it is deemed appropriate to do so. Table 5.3 illustrates what the risk level definition would look like.

| Magnitude of Impact | Risk Level Definition |
|---|---|
| Extreme | There is an immediate need for corrective measures. An existing system may not continue to operate unless a corrective action plan is put in place immediately, or already exists. |
| High | There is a strong need for corrective measures. An existing system may continue to operate, but a corrective action plan must be put in place as soon as possible. |
| Medium | Corrective actions are needed and a plan must be developed to incorporate these actions within a reasonable period of time. |
| Low | The organization’s management must determine whether corrective actions are still required or decide to accept the risk. |

Table 5.3 - Risk Level Definition
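A risk analysis matrix of the kind referred to as Figure 5.1 can be expressed as a small lookup that combines likelihood and impact into a single risk level. The Python sketch below is illustrative only: the four likelihood labels, the multiplication-based score, and the thresholds are assumptions rather than values taken from Figure 5.1. Only the four risk level names and the response wording (paraphrased from Table 5.3) come from the text.

```python
# Illustrative sketch: the likelihood scale, scores, and thresholds below are
# assumptions, not values taken from Figure 5.1. Only the risk level names and
# the responses (paraphrased from Table 5.3) come from the text.
LEVELS = ["Low", "Medium", "High", "Extreme"]
LIKELIHOOD = {"Rare": 1, "Possible": 2, "Likely": 3, "Almost certain": 4}  # assumed

RESPONSE = {
    "Extreme": "Immediate need for corrective measures.",
    "High": "Strong need for corrective measures; plan as soon as possible.",
    "Medium": "Corrective actions needed within a reasonable period of time.",
    "Low": "Management decides whether to act or accept the risk.",
}


def risk_level(likelihood: str, impact: str) -> str:
    """Combine a likelihood label and an impact level into one risk level."""
    score = LIKELIHOOD[likelihood] * (LEVELS.index(impact) + 1)  # range 1..16
    if score >= 12:
        return "Extreme"
    if score >= 8:
        return "High"
    if score >= 4:
        return "Medium"
    return "Low"


def build_matrix() -> dict:
    """Return {(likelihood, impact): risk level} for every cell of the matrix."""
    return {(l, i): risk_level(l, i) for l in LIKELIHOOD for i in LEVELS}


if __name__ == "__main__":
    for (likelihood, impact), level in build_matrix().items():
        print(f"{likelihood:15} {impact:8} -> {level}: {RESPONSE[level]}")
```

An organization would replace the assumed scale and thresholds with its own documented rankings, so that every cell in the matrix traces back to a definition the business has agreed on.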

There are many samples of risk analysis templates available for download through the World Wide Web. A quick search using Google will turn up hundreds of options for the security architect to access and examine as a starting point for creating their own. Microsoft, NIST, the Virginia Information Technologies Agency (VITA), the United Nations, the US Army Corps of Engineers, and the governments of Australia, Japan, Germany, France, and the United Kingdom all have sample templates available to be downloaded [5].

In addition to sample templates, there are a number of Frameworks and Relevant Standards that the security architect should become familiar with. While most of these are focused on Corporate Governance and financial controls, the security architect will still play a part in supporting these frameworks, and will have to build and maintain systems in accordance with their guidance if the industrial verticals that the architect works in are required to use them. These include the following:

  1. The Criteria of Control (CoCo), a control framework issued by the Canadian Institute of Chartered Accountants (CICA) in 1995 [6].

  2. KonTraG (Gesetz zur Kontrolle und Transparenz im Unternehmensbereich, the German Act on Control and Transparency in Business), which is a framework that promotes corporate governance in both the public and private sectors [7].

  3. Committee of Sponsoring Organizations (COSO) Enterprise Risk Management (ERM) Framework:2004, which consists of eight components designed to help organizations formally organize ERM responsibilities and activities, providing a comprehensive roadmap for establishing the critical processes needed for effective risk management [8].

  4. ISO 31000:2009, Risk Management – Principles and Guidelines, which provides principles, framework and a process for managing risk [9].

  5. ISO/IEC 31010:2009, Risk Management – Risk Assessment Techniques, which provides guidance on selection and application of systematic techniques for risk assessment [10].

  6. ISO Guide 73, Risk management – vocabulary, which provides basic vocabulary to develop common understanding on risk management concepts and terms among organizations and functions, and across different applications and types [11].

Natural Hazard Risks [12]

There are many different types of natural hazard risks organizations and individuals may face today. The security architect needs to be able to provide guidance on known risks within the theater of operations that an organization covers. If that is localized to a small area, then the security architect’s job becomes focused specifically on the risks associated with that well-defined area. Information and resources from local or municipal government agencies would be used by the security architect during the planning cycle to ensure identification and exposure of known risks. If, on the other hand, the organization is regional, national, or multinational in its coverage, then the security architect needs to draw information from a variety of sources to ensure that as many identified risks as possible are exposed in the planning cycle and are addressed through mitigation efforts based on a cost-benefit analysis [13]. Within the United States, security architects can check with the U.S. Geological Survey (USGS) for a natural hazards map of specific areas covering risks such as earthquakes, volcanic activity, and landslides [14].

For security architects outside of the United States, or those that have a multi-national focus due to the nature of the organizations they work for, there are many good starting points for risk identification and planning activities, from the United Nations Office for Disaster Risk Reduction (UNISDR) [15], to frameworks such as the Hyogo Framework for Action [16], to the many national and international research programs such as those that support the Charter On Cooperation To Achieve The Coordinated Use Of Space Facilities In The Event Of Natural Or Technological Disasters Rev.3 (25/4/2000) [17], and the PREVIEW program of the European Commission [18]. Some common natural hazards include:

-   Earthquake

-   Tornado

-   Floods

-   Hurricane

-   Ice storms

-   Blizzards

-   Tsunami

-   Cyclone

-   Drought

-   Dust storm

-   Flash Flood

-   Fog

-   Heat Wave

-   Lightning

-   Rain

-   Snow

-   Thunder

-   Tropical storm

-   Water Spout

-   Wind

-   Wind Storm

-   Fire Storm

-   Fire - Wild, Rural or Urban

Human-Made Risks and Threats

Human-made risks may also be called “man-made” risks or anthropogenic hazards. There are many different areas where the security architect will find risks and threats as potential liabilities that will need to be addressed. The category of human-made risks and threats is a very broad one, as so many things can cause a risk event to occur. There are many methodologies available to the security architect to help with the assessment and measurement of human-made risks and threats. Some examples include the approaches presented in the overview briefing paper “Development of a methodology to assess man-made risks in Germany” [19]; the briefing document “Natural and man-made disaster risks of Kabul city” [20]; the research work being done by universities around the world, such as the University of Cambridge Judge Business School’s Centre for Risk Studies [21] and the Université de Strasbourg through the REseau Alsace de Laboratoires en Ingénierie et Sciences pour l’Environnement (REALISE) network [22]; and the “Information Technology Sector Baseline Risk Assessment (ITSRA)” [23] and “Risk Management Strategy – Internet Routing, Access and Connection Services” [24] reports, both produced by the Information Technology Sector Coordinating Council (ITSCC), the Information Technology Government Coordinating Council (ITGCC) and the United States Department of Homeland Security. Some common human-made risks and threats include:

-   Terrorism

-   Bio-Hazard

-   Biological

-   Chemical

-   Nuclear

-   Epidemic/Pandemic

-   Theft/Vandalism

-   Work Stoppage

-   Riot

-   Power/HVAC Failure

-   Systems Configuration Error(s)

-   Security Updates for systems not kept up to date

-   Communications Failure

-   Hardware Failure

-   Software Failure

-   Security Incident

-   Individual Behavior

-   Mass Behavior

-   Hijacking of an individual, a VIP, or a Group

-   Assassination

-   Torture

-   Poisoning

-   Wounding

-   Bomb

-   Bomb Threat

-   Improvised Explosive Device (IED)

-   Car Bomb

-   Suicide Bomb

-   Cyber attacks

-   Espionage

Industry Risks [25]

Some risks are associated with the business or mission of an organization. Convenience stores face a threat of robbery. Banks may not only face robbery but also need to be concerned about money laundering. Department stores are frequent victims of shoplifting and also need to worry about identity theft. Insurance companies sometimes face threats of workplace violence from claimants who are dissatisfied with the handling of a claim. Every industry has its own specific risks, based on a variety of factors that are unique to that industry. As a result, the security architect will want to engage industry expertise to better assess the specific risks associated with the industry or mission in question during risk analysis and risk assessment activities. Some common industry risks are as follows:

-   Robbery and theft

-   Workplace violence

-   Money laundering

-   Identity theft

-   Theft of trade secrets and Intellectual Property

-   Fraud

-   Supply Chains

-   Loan defaults

-   Market risk

-   Credit risk

-   Labor disputes

Do Not Forget the Neighbors!

There are many different entities that an organization may become neighbors with, depending on where the organization chooses to locate. Some of these neighbors will be fairly benign in nature and, as a result, will not pose significant risks and threats to the business. As a matter of fact, they may even provide unforeseen benefits to the organizations locating near them, such as enhanced security and surveillance capabilities, upgraded infrastructure such as roads and bridges, rail and port facilities, and air terminals capable of handling large cargo capacities and jumbo jets. Other neighbors may prove to be more of a potential source of risks and threats due to the nature of the activities carried out on site. It is the security architect’s responsibility to ensure that the business is aware of all of the potential risks and threats, as well as any potential benefits, arising from choosing to locate in any particular area. Some potential neighbors that the security architect may have to evaluate include:

-   Nuclear power plants

-   Civil Defense / Military installations

-   Intelligence gathering installations

-   Oil storage facilities

-   Hazardous waste producers

-   Chemical factories

-   Biomedical research facilities

Some of the events are fairly localized, such as a facility fire, and others, such as a hurricane, have a more regional impact. These are important factors in the risk consideration. A regional risk can affect not just the business but the homes and families of its employees, and can cause competition for the availability of contracted alternate sites.

Once a risk has been identified and analyzed, the security architect will need to make choices about how to respond to that risk: accept it, transfer it, reduce/mitigate it, or avoid it.

Risk Acceptance

If the likelihood of occurrence is so small, the impact so minimal, or the cost to mitigate so substantial that mitigation is not justified, the security architect can recommend to the organization that the risk be accepted.

Risk Transfer

When a risk is too costly to mitigate, but too big to just accept, the security architect can choose to recommend that the organization transfer the monetary risk by purchasing an insurance policy. Similar to car insurance, business interruption insurance is often used to transfer the monetary risk of an event that cannot be mitigated because of either cost or some other factor. The security architect should understand that while the monetary exposure of the organization may be covered as a result of the transference event, other areas of risk such as reputation and goodwill must still be considered.

Risk Reduction / Mitigation

Putting controls in place to prevent the most likely of risks from having an impact on the ability to do business leads to having fewer actual events from which to recover. The risks that the security architect needs to address are the ones most likely to occur. A business continuity plan is one type of mitigation strategy. In fact, a business continuity plan is what is implemented when all other mitigating factors have failed.

Risk Avoidance

Certain risks are best dealt with by not taking them on in the first place. The security architect can guide the business away from activities and behaviors that introduce such risks. These risks may be too big to contain if realized, and too costly to mitigate, accept, or transfer based on their potential impacts on the business.
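The four responses can be compared with a simple cost-benefit calculation. The sketch below is one hedged way to frame that comparison, not a prescribed method: it uses the common annualized loss expectancy estimate (loss per event multiplied by expected events per year), and the `Risk` fields, the two thresholds, and the order of the checks are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """All fields are estimates supplied by the analyst (hypothetical inputs)."""
    name: str
    single_loss: float        # estimated loss per occurrence
    annual_rate: float        # expected occurrences per year
    mitigation_cost: float    # yearly cost of the controls
    insurance_cost: float     # yearly premium to transfer the monetary risk
    avoidable: bool = False   # can the business decline the activity entirely?


def annual_loss(risk: Risk) -> float:
    """Annualized loss expectancy: loss per event times events per year."""
    return risk.single_loss * risk.annual_rate


def recommend(risk: Risk, accept_below: float, catastrophic_above: float) -> str:
    """Suggest a response; both thresholds are assumed policy values."""
    ale = annual_loss(risk)
    if ale < accept_below:
        return "accept"      # likelihood or impact too small to act on
    if risk.single_loss >= catastrophic_above and risk.avoidable:
        return "avoid"       # too big to contain if realized
    if risk.mitigation_cost <= ale:
        return "mitigate"    # controls cost less than the expected loss
    if risk.insurance_cost <= ale:
        return "transfer"    # too costly to mitigate, too big to accept
    return "accept"          # every alternative costs more than the risk
```

A monetary comparison like this covers only part of the decision. Reputation and goodwill, which insurance does not restore, still have to be weighed separately.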

Business Impact Analysis

The Business Impact Analysis (BIA) attempts to determine the consequences of disruptions that could result from a disaster and guides the organization’s decision regarding what needs to be recovered and how quickly it needs to be recovered. The BIA is the foundation of the plans that will be built for the business.

While performing the BIA, security architects should avoid using the terms critical or essential to describe processes or people. Instead, use the term time sensitive. Generally speaking, organizations do not hire staff to perform nonessential tasks. Every function has a purpose, but some are more time sensitive than others when there is limited time or resources available to perform them.

A bank that has suffered a building fire could easily stop its marketing campaign but would not be able to stop processing deposits and checks written by its customers. The bank’s marketing campaign is essential to growing its business in the long term, but in the middle of a disaster marketing will take a backseat, not because it is not critical but because it is not time sensitive.

All business functions and the technology that supports them need to be classified based on their recovery priority [26]. Recovery time frames for business operations are driven by the consequences of not performing the function. The consequences may be the result of business lost during the down period; contractual commitments not met, resulting in fines or lawsuits; lost goodwill with customers, etc. Impacts generally fall into one or more of these categories: financial, regulatory, or customer.

All applications, and the business functions that they support, need to be classified as to their time sensitivity for recovery even if they do not support business functions that are time sensitive. For applications, this is commonly referred to as Recovery Time Objective (RTO). This is the amount of time the business can function without that application before significant business impact occurs.

Once the business has determined the time frame for recovery of the different business operations and identified the applications that are essential to perform those functions, the security architect can establish RTOs for each of the applications to be recovered by the technology plan. The RTO will define for the technology recovery team how much time can elapse between the time the disaster occurs and the time the application is recovered and available to the business.
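The step from business recovery time frames to application RTOs can be sketched in a few lines. The example below assumes that an application inherits the shortest recovery time frame of any business function that depends on it; that rule, the function and application names, and the hour values are all hypothetical, loosely modeled on the bank example above.

```python
# Hypothetical data: recovery time frames (in hours) set by the business for
# each function, and the applications each function depends on.
FUNCTION_RTO_HOURS = {
    "deposit processing": 4,
    "check processing": 8,
    "marketing campaign": 720,
}
FUNCTION_APPS = {
    "deposit processing": ["core banking", "e-mail"],
    "check processing": ["core banking", "check imaging"],
    "marketing campaign": ["campaign manager", "e-mail"],
}


def application_rtos(function_rto, function_apps):
    """Give each application the shortest RTO of any function that needs it."""
    rtos = {}
    for function, apps in function_apps.items():
        hours = function_rto[function]
        for app in apps:
            rtos[app] = min(hours, rtos.get(app, hours))
    return rtos


def recovery_order(rtos):
    """List applications from most to least time sensitive (ties by name)."""
    return sorted(rtos, key=lambda app: (rtos[app], app))


if __name__ == "__main__":
    rtos = application_rtos(FUNCTION_RTO_HOURS, FUNCTION_APPS)
    for app in recovery_order(rtos):
        print(f"{app}: recover within {rtos[app]} hours")
```

Note how e-mail ends up with a 4-hour RTO even though marketing could wait a month: a shared application is as time sensitive as the most urgent function that relies on it.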

The business also needs to determine the amount of work in process that can be at risk during an event. The data that is on an employee’s desk when a fire occurs would be lost forever if that information was not backed up somewhere else. The information stored in file cabinets, incoming mail in the mailroom, and the backup tapes that have not yet left the building are all at risk.

Decisions need to be made about all types of data because data is what is needed to run the business. How much data is it acceptable to lose? A minute’s worth? An hour’s worth? A whole business day’s worth? The answers to these questions are used to determine the Recovery Point Objective (RPO). This is the point in time that the security architect will recover to. The vital records program, backup policies, and procedures for electronic data and hard copy data need to comply with the RPO established by the business. An example of the RTO and RPO together can be seen in Figure 5.2.

The RTO concept as a formalized recovery objective originates in the BS 25999-2 standard, which defines RTO as “...target time set for resumption of product, service or activity delivery after an incident”. RTO is determined during the Business Impact Analysis (BIA), and the preparations needed to meet it are defined in the business continuity strategy. The Recovery Point Objective is a different measure: RPO is defined as the maximum tolerable period in which data might be lost. The RPO is crucial for determining one specific element of the business continuity strategy, the frequency of data backups. If the RPO is 4 hours, then backups must be performed at least every 4 hours.

Images

Figure 5.2 - RTO and RPO Definitions

The difference lies in purpose: RTO is broader because it sets the boundaries for the whole business continuity management strategy, while RPO is focused solely on the issue of backup frequency. The two are not directly related, although they support the same goals; both are crucial for creating a successful Business Impact Analysis and for successfully carrying out a business continuity management strategy. For example, an organization could have an RTO of 12 hours and an RPO of 1 hour, or an RTO of 2 hours and an RPO of 12 hours.
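The relationship between the two objectives and an actual incident can be sketched in a few lines of Python. This is a minimal illustration, not a planning tool: the function names and the sample timestamps are assumptions made for the example, and the 4-hour RPO and 12-hour RTO are simply the figures used in the text above.

```python
from datetime import datetime, timedelta

def backup_interval_ok(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, an incident hits just before the next backup runs,
    so the interval itself is the most data that can be lost."""
    return backup_interval <= rpo

def assess_incident(last_backup, incident, service_restored, rpo, rto):
    """Return (data_loss, downtime, rpo_met, rto_met) for one incident."""
    data_loss = incident - last_backup        # work done since the last good copy
    downtime = service_restored - incident    # time until the service was back
    return data_loss, downtime, data_loss <= rpo, downtime <= rto

# Hypothetical incident: last backup 11:00, incident 14:00, service back 22:00.
data_loss, downtime, rpo_met, rto_met = assess_incident(
    last_backup=datetime(2024, 1, 10, 11, 0),
    incident=datetime(2024, 1, 10, 14, 0),
    service_restored=datetime(2024, 1, 10, 22, 0),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=12),
)
print(data_loss, downtime, rpo_met, rto_met)
```

Note that the two checks are independent: shortening the backup interval improves the data-loss figure but does nothing for the downtime figure, and vice versa.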

BS 25999-2 was a British standard issued in 2007, which quickly became the main standard for business continuity management – although it is a British national standard, it was used in many other countries; on May 15, 2012 BS 25999-2 was replaced by international standard ISO 22301.

BS 25999-2 defined a business continuity management system which contains four management phases: planning, implementing, reviewing and monitoring, and finally improving.

The following are some of the key procedures and documents required by BS 25999-2:

Images   Scope of the BCMS – precise identification of that part of the organization to which business continuity management is applied

Images   BCM policy – defining objectives, responsibilities, etc.

Images   Human resource management

Images   Business impact analysis and risk assessment

Images   Defining business continuity strategy

Images   Business continuity plans

Images   Maintenance of plans and systems; improvement

In addition to BS 25999-2, BS 25999-1 is an “auxiliary” standard which provides more details on how to implement specific parts of BS 25999-2.

Other useful standards are ISO 27001, which places business continuity in a broader context of information security, and ISO 27005 which gives a detailed description of the risk assessment process.

ISO 22301 is the new de-facto standard for Business Continuity Management. The full name of this standard is ISO 22301:2012 Societal security – Business continuity management systems – Requirements. This standard is written by leading business continuity experts and provides the best framework for managing business continuity within an organization.

The standard includes these sections:

Introduction

0.1 General

0.2 The Plan-Do-Check-Act (PDCA) model

0.3 Components of PDCA in this International Standard

1 Scope

2 Normative references

3 Terms and definitions

4 Context of the organization

4.1 Understanding of the organization and its context

4.2 Understanding the needs and expectations of interested parties

4.3 Determining the scope of the management system

4.4 Business continuity management system

5 Leadership

5.1 General

5.2 Management commitment

5.3 Policy

5.4 Organizational roles, responsibilities and authorities

6 Planning

6.1 Actions to address risks and opportunities

6.2 Business continuity objectives and plans to achieve them

7 Support

7.1 Resources

7.2 Competence

7.3 Awareness

7.4 Communication

7.5 Documented information

8 Operation

8.1 Operational planning and control

8.2 Business impact analysis and risk assessment

8.3 Business continuity strategy

8.4 Establish and implement business continuity procedures

8.5 Exercising and testing

9 Performance evaluation

9.1 Monitoring, measurement, analysis and evaluation

9.2 Internal audit

9.3 Management review

10 Improvement

10.1 Nonconformity and corrective action

10.2 Continual improvement

Bibliography

Other standards that are helpful for the implementation of business continuity are:

Images   ISO/IEC 27031 - Guidelines for information and communication technology readiness for business continuity27

Images   PAS 200 - Crisis management – Guidance and good practice28

Images   PD 25666 - Guidance on exercising and testing for continuity and contingency programmes29

Images   PD 25111 - Guidance on human aspects of business continuity30

Images   ISO/IEC 24762 - Guidelines for information and communications technology disaster recovery services31

Images   ISO/PAS 22399 - Guideline for incident preparedness and operational continuity management32

Images   ISO/IEC 27001 - Information security management systems – Requirements33

Images   NIST Special Publication 800-34 Rev 1 - Contingency Planning Guide for Federal Information Systems34

Data Stored in Electronic Form

Backup strategies for data used to restore technology are varied and are driven by the RTO and the RPO needed to support the business requirements. Some organizations tier data based on its importance to the business and frequency of use. The more time-sensitive data is replicated off-site either synchronously or asynchronously to ensure its availability and its currency. Other data is backed up to tape and sent off-site once a day, or more frequently if required.

If the data needed to rebuild the technology environment is stored somewhere else besides the alternate site, then the time it takes to pack and transport that data must be included in the RTO. Factors such as how the data is stored, how far away it is, and how it will be delivered to the recovery facility will determine how much the recovery time could be increased. Delivery of off-site data to the recovery facility could delay the recovery by hours or even days. To reduce the recovery time, the data that will be used to recover any mission critical systems and applications should be stored in the recovery site whenever possible.
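The effect of data location on the recovery time can be illustrated with simple arithmetic. The sketch below is hypothetical: the step names and durations are invented for the example, and only the idea that packing and transport count against the RTO comes from the text.

```python
from datetime import timedelta

def total_recovery_time(steps: dict) -> timedelta:
    """Sum every step that must finish before the application is available."""
    return sum(steps.values(), timedelta())

rto = timedelta(hours=12)

# Hypothetical durations for data held at a separate off-site vault.
vault_recovery = {
    "pack media at vault": timedelta(hours=1),
    "transport to recovery site": timedelta(hours=6),
    "restore systems and data": timedelta(hours=8),
}

# The same restore when the data is already stored at the recovery site.
on_site_recovery = {
    "restore systems and data": timedelta(hours=8),
}

for label, steps in (("vault", vault_recovery), ("on site", on_site_recovery)):
    total = total_recovery_time(steps)
    print(label, total, "meets RTO" if total <= rto else "misses RTO")
```

With these assumed figures the restore itself fits inside the RTO, but the same restore misses it once packing and transport are added.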

It is vital that the data that is stored off-site include not only the application data but also the application source code, hardware and software images for the servers and end user desktops, utility software, license keys, etc. Application data alone will not be sufficient to rebuild an application.

Remote Replication and Off-Site Journaling

Remote replication involves moving data over a network to secondary storage devices in another location. It can be an expensive solution to implement, but it will meet the needs of an application RPO that is immediate or near immediate. It can be done either synchronously or asynchronously.

Synchronous replication is when the data is written to the production environment disk and to the remote disk at the same time. Until both “writes” occur, the next process cannot begin. Synchronous remote replication is subject to distance limitations, and its network bandwidth requirements can be extremely demanding. It has the potential to impact production, but it is the best solution when both time to recovery and data loss matter. This type of replication is commonly deployed in dual data center environments where applications are load-balanced between two or more sites, but it can be used for other strategies when the currency of the data, not the actual time of the recovery, is key.

Asynchronous replication occurs when the data is written to the production environment and then is queued to write to the backup environment at scheduled intervals depending on the RPO for the data. This can occur several times a day, several times an hour, or several times a minute, depending again on the need for data currency35.

Asynchronous data replication’s advantage is that it does not impact production performance as it takes place offline from the production environment and provides long-distance, remote data replication while still providing disaster recovery data protection.

Remote replication does not eliminate the need for point-in-time copies of the data. If data in the production environment becomes corrupt, the replicated data will also be corrupt. A point-in-time copy of the data is still required for restoration from this type of event36.
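The behavior described in this section can be modeled with a toy in-memory store. The sketch below is an illustration only: the `ReplicatedStore` class and its method names are hypothetical, the "sites" are plain dictionaries, and no real network or storage product is represented.

```python
from collections import deque

class ReplicatedStore:
    """Toy model of a primary site with one remote replica."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}
        self.remote = {}
        self.pending = deque()  # writes queued for the remote site (async only)

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # The write is not complete until the remote copy is updated too.
            self.remote[key] = value
        else:
            # The write returns immediately; the remote copy lags behind.
            self.pending.append((key, value))

    def flush(self):
        """Ship queued writes to the remote site (the scheduled interval)."""
        while self.pending:
            key, value = self.pending.popleft()
            self.remote[key] = value

    def unreplicated_writes(self) -> int:
        """Writes that would be lost if the primary site failed right now."""
        return len(self.pending)

# Asynchronous: writes accumulate between intervals and are at risk until flushed.
async_store = ReplicatedStore(synchronous=False)
for i in range(3):
    async_store.write(f"order-{i}", "paid")
at_risk = async_store.unreplicated_writes()
async_store.flush()

# Synchronous: the remote copy is always current, including when the data is bad.
sync_store = ReplicatedStore(synchronous=True)
sync_store.write("order-1", "paid")
point_in_time_copy = dict(sync_store.remote)  # separate copy taken earlier
sync_store.write("order-1", "CORRUPTED")      # corruption is replicated as well
print(at_risk, sync_store.remote["order-1"], point_in_time_copy["order-1"])
```

The last three lines show why replication does not replace point-in-time copies: the replica faithfully mirrors the corrupted value, and only the earlier copy still holds the good one.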

Backup Strategies

Most companies, no matter what strategy they employ for storing data off-site, start by performing full backups of all their data followed by periodic incremental backups. Incremental backups take copies of only the files that are new or have changed since the last full or incremental backup was taken, and then set the archive bit to “0.” The other common option is to take a differential backup. Differential backups copy only the files that are new or have changed since the last full backup and do not change the archive bit value.

If an organization wants the backup and recovery strategy to be as simple as possible, then it should use only full backups. They take more time and hard drive space to perform, but they are the most efficient in recovery. If that option is not viable, a differential backup can be restored in just two steps. The full backup of the data is restored first, then the differential backup on top of it. Remember, the differential backs up every file that has changed since the last full backup was taken.

An incremental backup takes the most time to restore because the system must lay down the full backup first and then every incremental backup taken since the last full backup. If daily incremental backups are performed, along with only monthly full backups, and recovery is attempted on the 26th day of the month, the organization will have to perform the full backup restore first and then 26 incremental backups must be laid on top in the same order that they were taken in. This example illustrates how the backup method chosen could have a significant impact on the recovery timeline.
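The difference between the three methods, including the handling of the archive bit, can be sketched with a toy model. Everything here is an assumption made for illustration: a "volume" is a dictionary, the archive bit is a boolean, file deletions are ignored, and the helper names are hypothetical.

```python
def new_volume(contents):
    """Toy volume: each file has data and an archive bit (True = changed)."""
    return {name: {"data": data, "archive": True} for name, data in contents.items()}

def modify(volume, name, data):
    """Writing a file sets its archive bit."""
    volume[name] = {"data": data, "archive": True}

def full_backup(volume):
    """Copy everything and clear every archive bit."""
    backup = {name: f["data"] for name, f in volume.items()}
    for f in volume.values():
        f["archive"] = False
    return backup

def incremental_backup(volume):
    """Copy files whose archive bit is set, then clear the bit."""
    backup = {name: f["data"] for name, f in volume.items() if f["archive"]}
    for name in backup:
        volume[name]["archive"] = False
    return backup

def differential_backup(volume):
    """Copy files whose archive bit is set, leaving the bit unchanged."""
    return {name: f["data"] for name, f in volume.items() if f["archive"]}

def restore(full, later_sets):
    """Lay down the full backup, then each later set in the order taken."""
    restored = dict(full)
    for backup_set in later_sets:
        restored.update(backup_set)
    return restored

# Scheme A: full backup followed by daily incrementals.
vol_a = new_volume({"a.txt": "v1", "b.txt": "v1"})
full_a = full_backup(vol_a)
modify(vol_a, "a.txt", "v2"); inc1 = incremental_backup(vol_a)
modify(vol_a, "b.txt", "v2"); inc2 = incremental_backup(vol_a)

# Scheme B: full backup followed by daily differentials.
vol_b = new_volume({"a.txt": "v1", "b.txt": "v1"})
full_b = full_backup(vol_b)
modify(vol_b, "a.txt", "v2"); diff1 = differential_backup(vol_b)
modify(vol_b, "b.txt", "v2"); diff2 = differential_backup(vol_b)

restored_a = restore(full_a, [inc1, inc2])  # full plus every incremental
restored_b = restore(full_b, [diff2])       # full plus the latest differential only
print(inc2, diff2, restored_a == restored_b)
```

Both schemes recover the same data, but the incremental restore needs every set taken since the full backup, while the differential restore needs only two sets at the price of larger daily backups.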

There are several other backup methods that the security architect should consider when planning for data integrity, confidentiality, and availability within the context of a recovery strategy.

Synthetic Full Backup

A synthetic full backup is a variation of an incremental backup. Like any other incremental backup, the actual backup process involves taking a full backup, followed by a series of incremental backups. But synthetic backups take things one step further.

What makes a synthetic backup different from an incremental backup is that the backup server actually produces full backups. It does this by combining the existing full backup with the data from the incremental backups. The end result is a full backup that is indistinguishable from a full backup that has been created in the traditional way.

The primary advantage to synthetic full backups is greatly reduced restore times. Restoring a synthetic full backup does not require the backup operator to restore multiple tape or disk sets as an incremental backup does. Synthetic full backups provide all of the advantages of a true full backup, but offer the decreased backup times and decreased bandwidth usage of an incremental backup.
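A synthetic full backup is assembled on the backup server rather than read from the production system. The sketch below shows the merge step only, under simplifying assumptions: backup sets are dictionaries, and a value of `None` in an incremental set is a hypothetical convention for "this file was deleted". Real products track changes in their own catalog formats.

```python
def synthesize_full(full, incrementals):
    """Merge a full backup with later incrementals (oldest first)
    into a new full backup, without touching the production system."""
    synthetic = dict(full)
    for incremental in incrementals:
        for name, data in incremental.items():
            if data is None:
                synthetic.pop(name, None)  # file was deleted after the full backup
            else:
                synthetic[name] = data     # new or changed file
    return synthetic

full = {"a.txt": "v1", "b.txt": "v1"}
incrementals = [
    {"a.txt": "v2"},                 # day 1: a.txt changed
    {"c.txt": "v1", "b.txt": None},  # day 2: c.txt added, b.txt deleted
]
synthetic_full = synthesize_full(full, incrementals)
print(synthetic_full)
```

A restore from `synthetic_full` is then a single-step operation, which is where the reduced restore time comes from.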

Incremental-Forever Backup

Incremental-forever backups are often used by disk-to-disk-to-tape backup systems. The basic idea is that like an incremental backup, the incremental-forever backup begins by taking a full backup of the data set. After that point, only incremental backups are taken.

What makes an incremental-forever backup different from a normal incremental backup is the availability of data. Restoring an incremental backup requires the tape or disk set containing the full backup, and every subsequent backup set in order up to the point in time that you want to restore to. While this is also true for an incremental-forever backup, the backup server typically stores all of the backup sets on either a large disk array or in a tape library. It automates the restoration process so that you do not have to figure out which tape sets need to be restored. In essence, the process of restoring the incremental data becomes completely transparent and mimics the process of restoring a full backup.

Mirror Backup

A mirror backup is a straight copy of the selected folders and files at a given instant in time. That is, the destination becomes a “mirror” of the source. Any mirror operation after the first will only copy new and modified files, making the operation faster, and files deleted from the source will be removed from the mirror as well. Some key disadvantages of mirror backups are that they consume more space than other backup types, they cannot be password protected to ensure the confidentiality of the data they contain, and they cannot track different versions of the files that they contain.
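A mirror operation can be sketched with the Python standard library. This is a minimal illustration, assuming a simple directory tree: the `mirror` and `demo` functions are hypothetical names, files are compared by content for clarity (real tools usually compare size and timestamps instead), and empty directories left in the mirror are not cleaned up.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path

def mirror(source: Path, dest: Path) -> dict:
    """Make dest a mirror of source: copy new or changed files, remove deleted ones."""
    stats = {"copied": 0, "deleted": 0}
    dest.mkdir(parents=True, exist_ok=True)
    source_files = {p.relative_to(source) for p in source.rglob("*") if p.is_file()}
    for rel in source_files:
        src_file, dst_file = source / rel, dest / rel
        if not dst_file.exists() or not filecmp.cmp(src_file, dst_file, shallow=False):
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)
            stats["copied"] += 1
    for dst_file in [p for p in dest.rglob("*") if p.is_file()]:
        if dst_file.relative_to(dest) not in source_files:
            dst_file.unlink()  # no longer in the source, so remove it from the mirror
            stats["deleted"] += 1
    return stats

def demo():
    """Run two mirror passes in a temporary directory and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "source"), Path(tmp, "mirror")
        src.mkdir()
        (src / "a.txt").write_text("first version")
        (src / "b.txt").write_text("unchanged")
        results = [mirror(src, dst)]   # first pass copies everything
        (src / "a.txt").write_text("second, longer version")
        (src / "b.txt").unlink()
        results.append(mirror(src, dst))  # second pass copies one, deletes one
        remaining = sorted(p.name for p in dst.iterdir())
    return results, remaining

print(demo())
```

The second pass also shows the versioning limitation noted above: once `a.txt` is overwritten and `b.txt` is removed, the mirror holds no earlier copy of either.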

Disk Imaging

This type of backup is often described as a “bare metal backup” because it backs up physical disks at the volume level. In other words, a true disk image is an exact copy of an entire physical disk or disk partition.

In its simplest form a disk imaging program creates a bit identical copy of a drive which is made by dumping raw data byte by byte, sector by sector from the source disk into an image file. Disk imaging programs have the ability to interpret the data being copied and remove or compress the empty blocks on a disk which leads to much smaller image files. The majority of programs also create compressed image formats that can be mounted and explored making it possible to retrieve individual files. In addition, the creation of successive incremental or differential backups is often supported, which further reduces the demands on storage. Other techniques have been developed that allow file-level operations such as file type filtering, the ability to exclude non-essential files from the image, such as pagefile.sys, and the ability to image a drive or partition while it is currently in use.

Advantages of image backup systems include rapid full system restores, including the operating system, across similar or different hardware platforms. Speed and simplicity are unsurpassed when working with large numbers of files. Many modern image formats can be mounted and used like any other drive, making access to backed-up files very user friendly.
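The idea of skipping empty blocks while imaging can be illustrated on a "disk" held in memory. This is a simplified sketch: the block size is unrealistically small, the function names are hypothetical, and a block is treated as empty when it contains only zero bytes, whereas real imaging programs typically consult the file system's allocation data.

```python
BLOCK_SIZE = 4  # tiny block size, chosen only to keep the example readable

def create_image(disk: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Read the disk block by block, storing only blocks that hold data."""
    blocks = {}
    for offset in range(0, len(disk), block_size):
        block = disk[offset:offset + block_size]
        if any(block):  # skip blocks that are entirely zero
            blocks[offset] = block
    return {"size": len(disk), "blocks": blocks}

def restore_image(image: dict) -> bytes:
    """Rebuild the disk: start zero-filled, then write each stored block back."""
    disk = bytearray(image["size"])
    for offset, block in image["blocks"].items():
        disk[offset:offset + len(block)] = block
    return bytes(disk)

# A 20-byte "disk" with data in the first and fourth blocks only.
disk = b"ABCD" + bytes(8) + b"EF\x00\x00" + bytes(4)
image = create_image(disk)
stored_bytes = sum(len(block) for block in image["blocks"].values())
print(sorted(image["blocks"]), stored_bytes, restore_image(image) == disk)
```

Only 8 of the 20 bytes are stored, yet the restored disk is identical to the original.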

File Synchronization

File synchronization is a specialized adaptation of file-based backup technologies. The primary purpose of synchronization programs is to replicate or mirror working files in two or more locations, allowing both sets of files to be available and used in real time. The main difference between file synchronization and backup solutions is that a backup will copy files in one direction, while synchronization copies files (or changes) in two directions. With backups there is a “source” and a “destination.” With synchronization there are two sources. For example, when a group of files is set to be synchronized between two systems, files which are changed on either system will be reflected on the other.

Synchronization that replicates changes in both locations is called two-way synchronization. Synchronization that only replicates the changes from one location to the second is called one-way synchronization. One-way synchronization differs from traditional backups when the propagation of deletions or renames is performed, because backups do not generally delete files, and a renamed file is usually copied again.
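The two modes can be contrasted with a small sketch. It is an illustration under stated assumptions: each location is a dictionary mapping a file name to a `(modified_time, content)` tuple, the newest version wins in two-way mode, and deletions are not tracked in two-way mode, so a file deleted on one side would be copied back from the other. The function names are hypothetical.

```python
def one_way_sync(source: dict, target: dict) -> None:
    """Make target match source, propagating deletions and overwriting changes."""
    for name in list(target):
        if name not in source:
            del target[name]  # removed at the source, so remove it here too
    for name, entry in source.items():
        target[name] = entry

def two_way_sync(a: dict, b: dict) -> None:
    """Propagate changes in both directions; the newest version of a file wins."""
    for name in set(a) | set(b):
        versions = [side[name] for side in (a, b) if name in side]
        newest = max(versions, key=lambda entry: entry[0])
        a[name] = newest
        b[name] = newest

# Two-way: each side contributes what the other is missing or has an older copy of.
laptop = {"report.doc": (1, "old draft"), "notes.txt": (5, "laptop only")}
desktop = {"report.doc": (3, "new draft")}
two_way_sync(laptop, desktop)

# One-way: the target ends up as an exact copy of the source.
master = {"report.doc": (3, "new draft")}
copy = {"report.doc": (1, "old draft"), "stale.tmp": (2, "leftover")}
one_way_sync(master, copy)
print(laptop == desktop, copy == master)
```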

One-way synchronization can be used as a way to both back up and synchronize computers. A one-way synchronization may be made to a portable hard drive. The hard drive may then be used as a source to sync with another computer, and new files will be transferred from the portable drive to the PC. New files on the second PC may also be backed up to the portable drive and then transferred to the first PC. Thus both computers will remain in sync, and the data on the drive will remain as a backup of both systems. Similar setups are often used by online storage services, which may be seen as being both an online backup and an online synchronization solution.

Some key advantages of file synchronization systems are that files synchronized to online sources can often be easily accessed from any computer or mobile device such as a smart phone or tablet, and some programs combine real-time synchronization with a versioning system that allows for easy collaboration between individuals working on the same file, as well as usually providing for difference comparisons and merging capabilities.

Managed Backup Services

A remote, online, or managed backup service, commonly referred to as a “cloud backup” solution, is a service that provides users with a system for the backup and storage of computer files. Online backup providers are companies that provide this type of service to end users (or clients). Such backup services are considered a form of cloud computing. Online backup systems are built around a client software program that runs on a schedule, typically once a day, and usually at night while computers are not in use. This program collects, compresses, encrypts, and transfers the data to the remote backup service provider’s servers or off-site hardware.

These solutions are service-based, meaning that in order to provide for the assurance, guarantee, or validation that what was backed up is recoverable whenever it is required, data stored in the service provider’s cloud must undergo regular integrity validation to ensure its recoverability. Cloud services need to provide for granularity when it comes to Recovery Time Objectives (RTOs), as one size does not fit all, either for the customers or for the applications within a customer’s environment. The customer should never have to manage the back-end storage repositories in order to back up and recover data; this should be the responsibility of the service provider. Cloud backup needs to be an active process where data is collected from systems that store the original copy. This means that cloud backup will not require data to be copied into a specific appliance from where data is collected before being transmitted to and stored in the service provider’s data center(s).
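The regular integrity validation mentioned above can be sketched with the Python standard library. This is a minimal illustration, not any provider's actual mechanism: the function names and the package layout are assumptions, and the encryption step that a real client would perform before transfer is deliberately omitted because the standard library offers no suitable cipher.

```python
import gzip
import hashlib
import zlib

def package_for_upload(data: bytes) -> dict:
    """Compress the data and record a digest of the original for later checks."""
    return {
        "payload": gzip.compress(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

def validate_recoverable(package: dict) -> bool:
    """Prove the stored copy can be restored and still matches the original digest."""
    try:
        restored = gzip.decompress(package["payload"])
    except (OSError, EOFError, zlib.error):
        return False  # payload is damaged or incomplete
    return hashlib.sha256(restored).hexdigest() == package["sha256"]

package = package_for_upload(b"payroll records " * 100)
truncated = dict(package, payload=package["payload"][:-4])  # simulated damage
mismatched = dict(package, sha256="0" * 64)                 # digest no longer matches
print(validate_recoverable(package),
      validate_recoverable(truncated),
      validate_recoverable(mismatched))
```

The check restores the data rather than merely confirming that the stored object exists, which is the point of validating recoverability and not just presence.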

Cloud backup solutions offer ubiquitous access, meaning that they utilize standard networking protocols, primarily, but not exclusively IP based, to transfer data between the customer and the service provider over secured connections. Vaults or repositories need to be always available to restore data to any location connected to the service provider’s cloud via private or public networks, as needed during a Disaster Recovery event or test.

Scalability and elasticity enable flexible allocation of storage capacity to customers without limit. Storage is allocated on demand and also de-allocated when customers delete backup sets as they age. The service provider can then release and reallocate that same capacity to a different customer in an automated fashion.

Metering by use allows the security architect to align the value of data with the cost of protecting it. It is procured on a per-gigabyte per month basis. Prices tend to vary based on the age of data, type of data (email, databases, files etc.), volume, number of backup copies and RTOs.

Data mobility/portability prevents service provider lock-in and allows customers to move their data from one service provider to another, or entirely back into a dedicated Private Cloud or a Hybrid Cloud, as needed. Security in the cloud is a critical factor for the security architect to consider. One customer must never have access to another’s data. Additionally, even service providers must not be able to access their customers’ data without the customer’s permission. The security architect needs to ensure that security issues are addressed up front with the vendor in order to make sure that all risks and threats are identified and documented to the fullest extent possible.

Managed backup services have several advantages over traditional backup methods. The backups of system data are stored in a different location from the original data, whereas traditional backup requires manually taking the backup media off-site. Remote backup does not require user intervention: the user does not have to change tapes, label CDs, or perform other manual steps to ensure the success of the backup. Remote backup services can also be set up to work continuously, backing up files as they are changed.

Managed backup services also have some disadvantages compared with traditional backup methods. Depending on the available network bandwidth, the restoration of data can be slow. It is possible that a remote backup service provider could go out of business or be purchased, which may affect the accessibility of one’s data or the cost to continue using the service. While the data is encrypted during transit through the cloud, it is impossible for the customer to know exactly where their data is transiting, and what country, or countries, it may pass through between the customer’s network and the service provider’s end point for data storage. As a result, the confidentiality and integrity of the customer’s data could be compromised without their knowledge. Further, privacy laws and data regulatory requirements differ by country, and depending on those laws, the customer’s data could potentially be exposed to regulatory compliance requirements that the customer is not prepared to comply with for a variety of reasons. The security architect must take these issues into account when examining the various managed backup services available as part of the BCP/DRP architecture.

The RTO for a business process or for an application will determine the recovery strategy for the process or application. The more time that can elapse before the recovery needs to occur, the more recovery options are available. The more time sensitive an application or function is, the fewer options the security architect will have in selecting a recovery strategy. Also, the plan will be more detailed and require more testing and training to successfully implement.

Selecting a Recovery Strategy for Technology

Selecting a recovery strategy depends on how much downtime is acceptable before the technology recovery must be complete. The security architect should examine recovery strategies available to them, and based on weighing the recovery requirements, select the best strategy for the technology environment. The most common strategies are as follows:

Images   Dual Data Center - This strategy is employed for applications that cannot accept any downtime without unacceptably impacting the business. The applications are split between two geographically dispersed data centers and either load-balanced between the two centers or hot-swapped between them. The surviving data center must have enough capacity to carry the full production load in either case.

Images   Internal Hot Site - An internal hot site is standby ready with all the technology and equipment necessary to run the applications to be recovered there. The security architect will be able to effectively restart any application in hot site recovery without having to perform a bare metal recovery of servers. Because this is an internal solution, often the business will run non-time-sensitive processes there, such as development or test environments that can be pushed aside for recovery of production when needed. When employing this strategy, it is important that the two environments be kept as close to identical as possible to avoid problems with OS levels, hardware differences, capacity differences, etc., that could prevent or delay recovery.

Images   External Hot Site - This design has equipment on the floor waiting for recovery, but the environment must be rebuilt for the recovery. These are services contracted through a recovery service provider. Again, it is important that the two environments be kept as close to identical as possible to avoid problems with OS levels, hardware differences, capacity differences, etc., that could prevent or delay recovery. Hot site vendors tend to have the most commonly used hardware and software products to attract the largest number of customers to utilize their sites. Unique equipment or software would generally need to be provided by the organization either at the time of disaster or stored there prior to the disaster.

Images   Warm Site - A warm site is a leased or rented facility that is usually partially configured with some equipment, but not the actual computers. It will generally have all the cooling, cabling, and networks in place to accommodate the recovery, but the actual servers, mainframe, and other equipment are delivered to the site at the time of the disaster recovery event.

Images   Cold Site - A cold site is a shell or empty data center space with no technology on the floor. All technology must be purchased or acquired at the time of the disaster recovery event.

Images   Reciprocal Agreement - In this strategy, the organization signs an agreement with a similar business operation to provide backup capabilities to each other in the event either experiences a disaster.

Images   Mobile Unit - A mobile unit is typically a contract with a vendor to provide a mobile trailer at the time of disaster, which contains the specified equipment necessary to support recovery.

Images   Outsourcing / Cloud - The technology environment is outsourced to a vendor who provides the disaster recovery plan for the applications that are deemed critical to the business.

Each of these recovery strategies has advantages and disadvantages.

Creating a picture of what the recovery strategy looks like for the business is important for the security architect, because it allows the business to have an understanding of what will happen, and what to expect when a disaster event occurs. The documentation of the disaster recovery strategy will have many elements, such as call trees, password and user accounts lists, system baselines, and procedure documents. In addition, there are a few things that are often overlooked, or just not thought about prior to the disaster event that the security architect should plan for.

The first item is to ensure that user awareness of the plan is up to date, especially after any changes have been adopted due to testing or compliance activities. A great way for the security architect to ensure that awareness among all users is adequate is to publish quarterly high-level summaries of the plan by user role. These high-level summaries can take a variety of forms, from short overview documents that list the main activities and responsibilities for the user group during a disaster, to a basic work flow or picture of what the user will need to do if a disaster event occurs. Figure 5.3 shows what a sample picture might look like for standard information workers in most businesses today with regard to network access to data when a disaster event renders their primary office space unavailable for a period of time.

The second item is to ensure that the IP networking stack at an alternate site is ready to accept incoming traffic on demand. It needs to be sized appropriately so there are no bottlenecks when real traffic starts flowing through it. The security architect needs to work with the business’s service providers to ensure that all of the public-facing websites, virtual private network gateways, Web load balancers, traffic distribution engines, firewalls, intranet access points and other critical access points are available via or at the alternate location on demand.

The third item is to avoid a single point of failure within the disaster recovery plan. Specifically, the security architect needs to ensure that the plan does not rest solely with one person for execution and does not rely on knowledge that is not documented centrally and distributed to multiple members of the DR team. If the plan relies on a single person, piece of equipment, or knowledge item, and there is no redundancy built into the plan for that element, then the plan is very likely to fail under stress.
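A plan can be screened for this kind of dependency with a very simple check. The sketch below is hypothetical throughout: the task names, the people, and the rule that every task needs at least two distinct people able to execute it are assumptions made for illustration.

```python
def single_points_of_failure(plan: dict) -> list:
    """Return the tasks that fewer than two distinct people can execute."""
    return sorted(task for task, people in plan.items() if len(set(people)) < 2)

# Hypothetical recovery tasks mapped to the staff trained to perform them.
recovery_plan = {
    "Restore database": ["alice", "bob"],
    "Fail over DNS": ["carol"],
    "Activate call tree": [],
}
at_risk_tasks = single_points_of_failure(recovery_plan)
print(at_risk_tasks)
```

The same check could be run against equipment or documentation instead of people, since the concern in each case is an element of the plan that has no backup.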

Dual Data Center

Advantages: Little or no downtime; ease of maintenance; no recovery required

Disadvantages: Most expensive option; requires redundant hardware, networks, staffing; distance limitations

Internal or External Hot Site

Advantages: Allows recovery to be tested; highly available; site can be operational within hours

Disadvantages: Expensive; terms of contract could limit usage (external site only); hardware and software compatibility issues in external sites; communication costs to duplicate data can be high

Warm and Cold Site

Advantages: Less expensive; available for longer recoveries

Disadvantages: Not immediately available; once up and running, delays could occur due to equipment, software, or staffing issues or mismatches; not testable

Reciprocal Agreement

Advantages: No cost; viable for small business operations with limited technology

Disadvantages: Technology upgrades, obsolescence, or business growth; security and access by partner users; typically located in same geography as disaster, so could be of little to no practical use in the event of a large scale disaster

Mobile Unit

Advantages: Self-contained unit with technology and network; transportable to any site

Disadvantages: Difficult and expensive to test unless you own it; travel time to bring unit where it is needed; access to site may be hampered or unavailable due to disaster

Outsourcing / Cloud

Advantages: Transfer the ownership of recovery to vendor; may provide same recovery as dual data center but at less cost; no recovery plan to maintain or test

Disadvantages: Cost; no ownership or control over recovery program except contractually; security

Images

Figure 5.3 - High level disaster recovery view of information worker activity during a disaster event

The fourth item is to keep the plan as simple as possible. This can be achieved primarily through the use of “off the shelf” or “out of the box” solutions, as opposed to complex and highly customized solutions. The learning curve to become proficient with an out of the box solution is very small when compared to learning to master a highly customized solution. Under stress, simple systems behave better, are less prone to failures, and are easier to troubleshoot.

Cost–Benefit Analysis

Each of the foregoing strategies can be considered for the business and technology recovery. Those that are recommended need to have a Cost–Benefit Analysis (CBA) performed to determine whether the cost of the recommended strategy fits within the amount of risk or business loss the business is trying to avoid. A company would not spend $1,000,000 a year on a recovery strategy to protect $100,000 of profit. Not every business needs a dual data center recovery strategy. The strategy selected must fit the business need. The following image from NIST helps visualize this tradeoff:37

Images

The cost of implementing the recovery strategy recommended needs to include the initial costs associated with building out the strategy as well as ongoing costs to maintain the recovery solution, and where applicable, the cost of periodic testing of the solution to ensure it remains viable.
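The comparison can be reduced to a short calculation. The sketch below is a simplification under stated assumptions: the initial build-out is spread evenly over an assumed useful life, the function names are hypothetical, and all figures other than the $1,000,000 versus $100,000 example from the text are invented for illustration.

```python
def annual_strategy_cost(initial_cost, useful_life_years,
                         ongoing_per_year, testing_per_year):
    """Yearly cost: build-out spread over its useful life, plus upkeep and testing."""
    return initial_cost / useful_life_years + ongoing_per_year + testing_per_year

def is_justified(annual_cost, annual_loss_avoided):
    """A strategy is worth funding only if it costs no more than the loss it avoids."""
    return annual_cost <= annual_loss_avoided

# The example from the text: $1,000,000 a year to protect $100,000 of profit.
text_example = is_justified(annual_cost=1_000_000, annual_loss_avoided=100_000)

# A hypothetical cheaper strategy protecting the same $100,000.
modest_cost = annual_strategy_cost(
    initial_cost=120_000, useful_life_years=4,
    ongoing_per_year=25_000, testing_per_year=5_000,
)
modest_example = is_justified(modest_cost, annual_loss_avoided=100_000)
print(text_example, modest_cost, modest_example)
```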

Implementing Recovery Strategies

Once the strategy has been agreed to and funded, the next step is to implement the various strategies approved. This may involve negotiating with vendors to provide recovery services for business or technology, doing site surveys of existing sites to determine excess capacity, wiring conference rooms or cafeterias to support business functions, buying recovery technology, installing remote replication software, installing networks for voice and data recovery, assigning alternate site seats to the various business areas, and the like.

The implementation phase is a project unto itself, perhaps multiple projects, depending on the complexity of the environment and the recovery strategies selected.

Documenting the Plan

Once recovery strategies have been developed and implemented for each area, the next step is to document the plan itself. The plan includes plan activation procedures, the recovery strategies to be used, how recovery efforts will be managed, how human resource issues will be handled, how recovery costs will be documented and paid for, how recovery communications to internal and external stakeholders will be handled, and detailed action plans for each team and each team member. The plan then needs to be distributed to everyone who has a role.

The documentation for recovery of the technology environment needs to be detailed enough that a person with a similar skill set who has never executed the procedures before could use them to perform the recovery. Documentation tends to be a task that no one really likes to do; however, there is no guarantee that the people who perform this function in the production environment, or the person who restored the infrastructure and application at the last test, are going to be available at the time of the disaster. In addition, disasters tend to be chaotic times where many things are happening at once. Without the proper documentation, a practiced recovery strategy can fall apart and add to the chaos. Restoring an application can be challenging, and restoring an entire data center just destroyed by a tornado can be overwhelming, if not impossible, without good documentation.

The documentation needs to be stored at the recovery facility, and every time the recovery is tested, the documentation should be used by the recovery participants and updated as needed. Once the level of confidence in the documentation is high, the security architect should have someone who has never performed the procedure attempt it with an expert looking over their shoulder. It may slightly delay the recovery time at that particular test, but once complete, confidence in the documentation will be strong.

The Human Factor

One common factor left out of many plans is human resource issues. Disasters are human events, and it is important that the plan document the responsibility of the firm to the employees participating in the recovery. A company needs to recognize that, if its response teams are to meet its needs in a disaster situation, it must also recognize the hardships placed on their families. To be able to give their best to the company at the time when it is needed most, employees need to have a level of comfort that their family members are safe and that their absence during the recovery effort will not place undue hardship on them.

Logistics

The plan needs to document the logistics of the recovery, not just the technical documentation for recovery of the hardware and applications. The plan needs to contain answers to the following questions:

  1. How will the disaster be declared, and who has the authority to declare it?

  2. How will recovery team members be contacted, and who will contact them?

  3. How are recovery team members to travel to the alternate site, and who will make any required reservations and pay for those costs?

  4. Where is the documentation stored, and how can it be obtained?

  5. How will off-site backups be retrieved, who will do it, and how long will it take?

  6. What are the address, phone number, and directions to the alternate site?

  7. How will necessary supplies be provided, and how can more be requested?

  8. What is the command center location and phone number?

  9. How will problems be reported and managed?

Plan Maintenance Strategies

As with any documentation, version control is important, particularly with detailed technical procedures. The use of version control numbers on the plan helps to ensure that everyone is using the current version of the plan documentation. The plan needs to be published to everyone who has a role and also needs to be stored in a secure off-site location that not only survives the disaster but is accessible immediately following it.

It is important that the plan be kept up to date as the business and technology environments of your company continue to change and adapt. Tying plan updates to your change management process is critical to keeping pace with significant changes in technology. The BCP must be reviewed and updated at least annually, and more often if significant business changes occur. Plan updates also frequently occur following tests of the plan, if issues or action items from the test require plan documentation changes.
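The review rule above (at least annually, and sooner after a significant business change) can be expressed as a simple check. The following is a minimal sketch, assuming a hypothetical `PlanDocument` record and a 365-day threshold to stand in for "annually"; neither comes from a standard or a specific tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# "At least annually" is from the text; 365 days is an illustrative threshold.
REVIEW_INTERVAL = timedelta(days=365)


@dataclass
class PlanDocument:
    title: str
    version: str         # version control number shown on the plan
    last_reviewed: date  # date of the most recent review or update


def review_due(plan: PlanDocument, today: date,
               significant_change: bool = False) -> bool:
    """Return True when a significant change occurred or the last review
    is at least one review interval old."""
    return significant_change or (today - plan.last_reviewed) >= REVIEW_INTERVAL


# Hypothetical plan record used only for illustration.
plan = PlanDocument("Data center recovery plan", "2.3", date(2023, 1, 1))

print(review_due(plan, date(2023, 6, 1)))                           # no change, under a year
print(review_due(plan, date(2023, 6, 1), significant_change=True))  # change forces a review
print(review_due(plan, date(2024, 1, 1)))                           # a full year has passed
```

In practice the `significant_change` flag would be driven by the change management process, so that an approved change to a critical system prompts a plan review.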

Once the plan has been completed and the recovery strategies are fully implemented, it is important to test all parts of the plan to validate that it would work in a real event. The purpose of testing is to validate the readiness to recover from a real event. If we knew that it all worked, we would not need to test it in the first place. Test to find out what does not work, so that it can be fixed before it happens for real. No test is a failure as long as it provides opportunities to improve the recovery process, so that the organization is more likely to recover if a real event occurs.

The first rule of conducting tests of the recovery plans is that no matter what type of test you are conducting, it is important to protect the production environment.

There are many different types of exercises that the security architect can conduct. Some will take minutes, and others hours or days. The amount of planning needed depends on the type, length, and scope of the exercise. The most common types of exercises for technology recovery are walkthrough exercises, simulated or actual exercises, and compact exercises.

In a walkthrough or tabletop exercise, the team that would need to execute the plan holds a meeting to review the plan. When the organization has a new plan, the best type of tabletop exercise to do is a walkthrough of the actual plan document with everyone who has a role in the plan. Even the planning team is unlikely to read the entire document, and walking through the plan helps to ensure that everyone knows the whole story and everyone’s role. Walking through the plan with the team will help identify gaps in the plan so that they can be addressed.

Once the security architect has conducted that type of walkthrough, the scenario-based tabletop exercises can begin. In these exercises, the security architect will gather the team in a meeting and pretend that something has happened, and the team members are supposed to respond as if it is a real event. The security architect could pretend that there is a power outage, and, based on what is backed up by alternate power sources such as generators and what is not, the team would discuss how the technology or business would be impacted and how they would exercise the portions of the plan to address that scenario.

Tabletop exercises are used to validate the plan within an actual scenario without having to actually execute the recovery procedures. The security architect will “talk through” with the team what they would do; they will not actually do it. These types of exercises are especially helpful in working through the decision processes that will have to be tackled by the leadership team when faced with an event and by other teams to talk through recovery options based on the scenario being presented for the exercise.

A simulated or actual exercise tests the actual recovery in the alternate site. The difference between a simulated and an actual exercise is that a simulated exercise operates completely independently of the production environment, whereas in an actual exercise the production environment is “moved” to the alternate site as is done with a dual data center strategy.

The purpose of this type of exercise is to validate alternate site readiness. The security architect should run this exercise as closely as possible to how it would happen if it were happening for real. Clearly, because exercises are planned events, the security architect will have an opportunity to reduce the actual timeline by pre-staging certain things that they could not do if this were an unplanned event, such as pulling backup tapes from off-site storage and having them delivered to the alternate site for use on the day of the exercise. What the security architect should not do as part of the planning is plan for success. Remember, the reason to test is to find out what does not work so that it can be fixed before it happens for real.

The final exercise is a compact exercise. This is where the security architect will begin with a call to the recovery team, assuming a scenario, and have them respond as if this were a real event and continue right through an actual exercise in the alternate site. These are sometimes done as surprise exercises, with very few people knowing in advance when they are going to happen.

After every exercise the security architect conducts, the exercise results need to be published and action items developed to address the issues uncovered by the exercise. Action items should be tracked until they have been resolved and, where appropriate, the plan updated. It is very unfortunate when an organization faces the same issue in subsequent tests simply because someone did not update the plan.

Bringing It All Together – A Sample “Walk Through” of a DR Plan

It can be very daunting to create a Disaster Recovery Plan for a business, no matter how many times you may have had to do it already, and regardless of the size of the business in question. Most security architects will only have to create some elements of a DR plan during their careers, and perhaps update an existing plan that was created by a predecessor or colleague. It is rare that the security architect will have to take on the entire task of creating a plan from scratch by themselves, and even rarer that they will have to do it more than once or twice during their careers. As a result, most security architects do not have much experience creating DR plans, although they will have a lot of experience managing the business through them. The following outline offers a detailed guide to the steps that the security architect will need to ensure are carried out during each phase of the DR planning process, so that the business has a complete and accurate DR plan to operate with should a disaster occur.

Step by Step Guide for Disaster Recovery Planning for Security Architects

I. Information Gathering

Step One - Organize the Project:

* Identify and convene the planning team and sub-teams as appropriate.

* At the business and/or business unit level, set:

  * Scope: the area covered by the disaster recovery plan.

  * Objectives: what is being worked toward and the course of action that the business intends to follow.

  * Assumptions: what is being taken for granted or accepted as true without proof.

* Set the project timetable.

* Draft the project plan, including assignment of task responsibilities.

* Obtain senior management approval of the scope, assumptions, and project plan.

Forms that may be useful in organizing Step One:

  1. A Project Plan template.

Step Two - Conduct Business Impact Analysis (BIA):

This step would normally be performed by the security architect in conjunction with business unit managers. In order to complete the business impact analysis, perform the following steps:

* Identify functions, processes, and systems.

* Interview information systems support personnel.

* Interview business unit personnel.

* Analyze results to determine critical systems, applications, and business processes.

* Prepare impact analysis of interruption on critical systems.

Forms that may be useful in organizing Step Two:

  1. A Business Impact Analysis template.

  2. A Critical System Ranking form.

Step Three - Conduct Risk Assessment:

The planning team will want to consult with technical and security personnel as appropriate to complete this step. The risk assessment will assist in determining the probability of a critical system becoming severely disrupted and documenting the acceptability of these risks to the business.

For each critical system, application and process as identified in Step 2 using the Critical System Ranking form:

* Review physical security.

* Review backup systems.

* Review data security.

* Review policies on personnel termination and transfer.

* Identify systems supporting mission critical functions.

* Identify vulnerabilities.

* Assess probability of system failure or disruption.

* Prepare risk and security analysis.

Forms that may be useful in organizing Step Three:

  1. Security Documentation template.

  2. Vulnerability Assessment template.
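One way to combine the probability assessed in this step with the impact figures from the Business Impact Analysis is an expected-loss ranking. The sketch below is a common simplification and not a method prescribed by this guide: the function `expected_annual_loss`, the system names, and every probability and dollar figure are made up for illustration.

```python
def expected_annual_loss(annual_probability: float, impact: float) -> float:
    """Expected yearly loss: the chance of a severe disruption in a year
    multiplied by the estimated cost of that disruption."""
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("annual_probability must be between 0 and 1")
    return annual_probability * impact


# Hypothetical critical systems: (annual probability of disruption, impact in dollars).
systems = {
    "payroll": (0.10, 200_000),
    "email": (0.30, 20_000),
    "order entry": (0.05, 1_000_000),
}

# Rank systems from highest to lowest expected annual loss.
ranked = sorted(systems,
                key=lambda name: expected_annual_loss(*systems[name]),
                reverse=True)

for name in ranked:
    print(name, round(expected_annual_loss(*systems[name])))
```

A ranking like this only supports the discussion of which risks are acceptable to the business; the probabilities are estimates, and a low-probability event with a very large impact may still deserve attention beyond what its position in the list suggests.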

Step Four - Develop Strategic Outline for Recovery:

Assemble groups as appropriate for:

* Hardware and operating systems.

* Communications.

* Applications.

* Facilities.

* Any and all other critical functions and business processes identified in the Business Impact Analysis.

For each system/process above quantify the following processing requirements:

* Light, normal, and heavy processing days.

* Transaction volumes.

* Dollar volume (if any).

* Estimated processing time.

* Allowable delay (days, hours, minutes, etc.).

Detail all the steps in the workflow for each critical business function (e.g., for payroll processing, each step that must be completed and the order in which to complete them):

* Identify systems and applications.

  * Component name and technical ID (if any).

  * Type (online, batch process, script).

  * Frequency.

  * Run time.

  * Allowable delay (days, hours, minutes, etc.).

* Identify vital records (e.g., procedures, intellectual property, etc.).

  * Name and description.

  * Type (e.g., backup, original, master, history, etc.).

  * Where they are stored.

  * Source of item or record.

  * Whether the record can be easily replaced from another source (e.g., reference materials).

  * Backup type(s).

  * Backup generation frequency.

  * Number of backup generations available onsite.

  * Number of backup generations available off-site.

  * Location of backups.

  * Media type.

  * Retention period.

  * Rotation cycle.

  * Who is authorized to retrieve the backups.

Identify the minimum requirements and replacement needs to perform the critical function if a severe disruption occurred:

* Type (e.g., server hardware, software, etc.).

* Item name and description.

* Quantity required.

* Location of inventory, alternative, or offsite storage.

* Vendor/supplier.

  1. Identify if alternate methods of processing either exist or could be developed, quantifying where possible, impact on processing (Include manual processes.)

  2. Identify person(s) who supports the system or application.

  3. Identify primary person to contact if system or application cannot function as normal.

  4. Identify secondary person to contact if system or application cannot function as normal.

  5. Identify all vendors associated with the system or application.

  6. Document business unit strategy during recovery (conceptually how will the business unit function?)

  7. Quantify resources required for recovery, by time frame (e.g., 1 pc per day, 3 people per hour, etc.)

Develop and document recovery strategy, including:

* Priorities for recovering system/function components.

* Recovery schedule.

Forms that may be useful in organizing Step Four:

  1. A Group Assignments spreadsheet.

  2. A Critical System Processing Requirements for Recovery spreadsheet.
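The priorities and recovery schedule called for in this step can be drafted from the allowable delay recorded for each system. The following is a minimal sketch under stated assumptions: the `CriticalSystem` record, the `recovery_schedule` function, the sample systems, and their durations are hypothetical, and the schedule assumes a single team recovering systems one after another.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class CriticalSystem:
    name: str
    allowable_delay: timedelta          # longest outage the business can tolerate
    estimated_recovery_time: timedelta  # assumed time needed to restore the system


def recovery_schedule(systems):
    """Order systems by allowable delay (shortest first) and lay them out
    back to back. Returns (name, start, finish, meets_allowable_delay) tuples."""
    schedule = []
    elapsed = timedelta(0)
    for system in sorted(systems, key=lambda s: s.allowable_delay):
        start = elapsed
        elapsed += system.estimated_recovery_time
        schedule.append((system.name, start, elapsed,
                         elapsed <= system.allowable_delay))
    return schedule


# Illustrative systems and durations (assumptions, not data from the text).
systems = [
    CriticalSystem("payroll", timedelta(days=3), timedelta(hours=8)),
    CriticalSystem("order entry", timedelta(hours=4), timedelta(hours=2)),
    CriticalSystem("email", timedelta(days=1), timedelta(hours=6)),
]

for name, start, finish, ok in recovery_schedule(systems):
    print(f"{name}: start +{start}, finish +{finish}, within allowable delay: {ok}")
```

Any system flagged as missing its allowable delay signals that the strategy needs more parallel resources, a faster recovery method, or a renegotiated allowable delay with the business unit.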

Step Five - Review Onsite and Offsite Backup and Recovery Procedures:

* Review current records (system instructions, documented processes, etc.) requiring protection.

* Review the current offsite storage facility or arrange for one.

* Review the backup and offsite storage policy or create one.

Step Six - Select Alternate Facility:

Alternate Site: A location, other than the normal facility, used to process data and/or conduct critical business functions in the event of a disaster.

Determine Resource Requirements:

* Assess platform uniqueness of business systems (e.g., Apple, Oracle database, AIX, etc.).

* Identify alternative facilities.

* Review cost/benefit.

* Evaluate and make recommendation.

* Seek approval.

* Make selection.

II. Plan Development and Testing

Step Seven - Develop Recovery Plan:

The steps for developing the Recovery Plan are listed below in outline form to demonstrate how a security architect may choose to organize their Disaster Recovery Plan.

Objective:

The objective may already have been documented in Step One (Organize the Project) of Information Gathering.

Plan Assumptions:

* List all assumptions that impact the plan.

Criteria for Invoking the Plan:

* List all criteria that must be met or satisfied to allow for the invocation of the plan.

Document emergency response procedures to occur during and after an emergency (e.g., ensure evacuation of all individuals, call the fire department, and, after the emergency, check the building before allowing individuals to return).

* Document procedures for assessment and declaring a state of emergency.

* Document notification procedures for alerting managers.

* Document notification procedures for alerting vendors.

* Document notification procedures for alerting staff and notifying them of alternate work procedures or locations.

Roles, Responsibilities, and Authority:

* Identify personnel.

* Recovery team description and charge.

* Recovery team staffing.

Transportation schedules for media and teams

Procedures for operating in contingency mode:

* Process descriptions.

* Minimum processing requirements.

* Determine categories for vital records.

* Identify location of vital records.

* Identify forms requirements.

* Document critical forms.

* Establish equipment descriptions.

* Document equipment in the recovery site.

* Document equipment in the business.

Software Descriptions:

* Software used in recovery.

* Software used in production.

Produce logical drawings of communication and data networks in the business.

Produce logical drawings of communication and data networks during recovery.

Vendor List:

* Review vendor restrictions.

* Miscellaneous inventory.

* Communication needs in production.

* Communication needs in the recovery site.

* Resource plan for operating in contingency mode.

* Criteria for returning to normal operating mode.

* Procedures for returning to normal operating mode.

* Procedures for recovering lost or damaged data.

Testing and Training:

* Document testing dates.

* Complete disaster/disruption scenarios.

* Develop action plans for each scenario.

* Sample testing diagram.

Plan Maintenance:

* Document maintenance review schedule (yearly, quarterly, etc.).

* Maintenance review action plans.

* Maintenance review recovery teams.

* Maintenance review team activities.

* Maintenance review/revise tasks.

* Maintenance review/revise documentation.

Appendices for Inclusion:

* Inventory and report forms.

* Maintenance forms.

* Hardware lists and serial numbers.

* Software lists and license numbers.

* Contact list for vendors.

* Contact list for staff with home and work numbers.

* Contact list for other interfacing departments or business units.

* Network schematic diagrams.

* Equipment room floor grid diagrams.

* Contract and maintenance agreements.

* Special operating instructions for sensitive equipment.

* Cellular telephone inventory and agreements.

Step Eight - Test the Plan:

* Develop test strategy.

* Develop test plans.

* Conduct tests.

* Modify the plan as necessary.

III. Ongoing Maintenance

Step Nine - Maintain the Plan:

The security architect will be responsible for overseeing this.

* Review changes in the environment, technology, and procedures.

* Develop maintenance triggers and procedures.

* Submit changes for systems development procedures.

* Modify change management procedures.

* Produce plan updates and distribute.

Step Ten - Perform Periodic Audit:

Establish periodic review and update procedures

Summary

In summary, the security architect should have an understanding of the business continuity and disaster recovery domain to assist the organization in being prepared to recover in the event of a disaster.

The way we recover will continue to evolve. Technology recovery is moving from traditional approaches centered on a recovery strategy to a continuous operation strategy, in which the business never really has to “recover” but simply continues to operate as if nothing, or almost nothing, had happened. Technology is in place that picks up from where the primary failed, making the alternate recovery site, system, or data elements available in real time or near real time, depending on the needs and strategy of the business.

Cloud computing continues to change the face of the technology environment and as a result, the recovery environment.

The goal - to continue the business or mission of the organization - remains the same.

References

Eric Waxvik, Risk, response, and recovery, in Official (ISC)2 Guide to the SSCP CBK, New York: Auerbach Publications, 2007, p. 212.

Review Questions

1. Which phrase BEST defines a business continuity/disaster recovery plan?

  1. A set of plans for preventing a disaster.

  2. An approved set of preparations and sufficient procedures for responding to a disaster.

  3. A set of preparations and procedures for responding to a disaster without management approval.

  4. The adequate preparations and procedures for the continuation of all business functions.

2. Which of the following statements BEST describes the extent to which an organization should address business continuity or disaster recovery planning?

  1. Continuity planning is a significant corporate issue and should include all parts or functions of the company.

  2. Continuity planning is a significant technology issue, and the recovery of technology should be its primary focus.

  3. Continuity planning is required only where there is complexity in voice and data communications.

  4. Continuity planning is a significant management issue and should include the primary functions specified by management.

3. Risk analysis is performed to identify

  1. the impacts of a threat to the business operations.

  2. the exposures to loss of the organization.

  3. the impacts of a risk on the company.

  4. the way to eliminate threats.

4. During the risk analysis phase of the planning, which of the following actions could manage threats or mitigate the effects of an event?

  1. Modifying the exercise scenario.

  2. Developing recovery procedures.

  3. Increasing reliance on key individuals.

  4. Implementing procedural controls.

5. The reason to implement additional controls or safeguards is to

  1. deter or remove the risk.

  2. remove the risk and eliminate the threat.

  3. reduce the impact of the threat.

  4. identify the risk and the threat.

6. Which of the following statements BEST describes business impact analysis?

  1. Risk analysis and business impact analysis are two different terms describing the same project effort.

  2. A business impact analysis calculates the probability of disruptions to the organization.

  3. A business impact analysis is critical to development of a business continuity plan.

  4. A business impact analysis establishes the effect of disruptions on the organization.

7. The term disaster recovery commonly refers to:

  1. The recovery of the business operations.

  2. The recovery of the technology environment.

  3. The recovery of the manufacturing environment.

  4. The recovery of the business and technology environments.

8. Which of the following terms BEST describes the effort to determine the consequences of disruptions that could result from a disaster?

  1. Business impact analysis

  2. Risk analysis

  3. Risk assessment

  4. Project problem definition

9. The BEST advantage of using a cold site as a recovery option is that it

  1. is a less expensive recovery option.

  2. can be configured and operationalized for any business function.

  3. is preconfigured for communications and can be customized for business functions.

  4. is the most available option for testing server recovery and communications restorations.

10. The term RTO means:

  1. Recovery time for operations

  2. Return to order

  3. Resumption time order

  4. Recovery time objective

11. If a company wants the fastest time to restore from tape backup, it should perform its backup using the following method:

  1. Full backup

  2. Incremental backup

  3. Partial backup

  4. Differential backup

12. One of the advantages of a hot site recovery solution is

  1. Lowered expense

  2. High availability

  3. No downtime

  4. No maintenance required

13. Which of the following methods is not acceptable for exercising the business continuity plan?

  1. Tabletop exercise

  2. Call exercise

  3. Simulated exercise

  4. Halting a production application or function

14. Which of the following is the primary desired result of any well-planned business continuity exercise?

  1. Identification of plan strengths and weaknesses.

  2. Satisfaction of management requirements.

  3. Compliance with auditor’s requirements.

  4. Maintenance of shareholder confidence.

15. A business continuity plan should be updated and maintained

  1. immediately following an exercise.

  2. following a major change in personnel.

  3. after installing new software.

  4. on an ongoing basis.

16. The primary reason to build a business continuity and disaster recovery plan is

  1. to continue the business.

  2. to restore the data center.

  3. to meet regulatory environments.

  4. because the customers expect it.

17. A company would choose to use synchronous remote replication for its data recovery strategy if

  1. it wanted to replace point-in-time backups.

  2. it wanted to minimize the amount of time taken to recover.

  3. time to recovery and data loss are important to the business.

  4. distance limitations existed.

18. One of the reasons asynchronous replication differs from synchronous replication is

  1. because it can impact production.

  2. because it can be done over greater distances.

  3. because it involves less loss of data.

  4. because it improves recovery time.

19. The purpose of doing a cost–benefit analysis on the different recovery strategies is

  1. to make certain the cost of protection does not exceed the cost of the risk it is protecting.

  2. to determine the cost of implementing the recovery strategy.

  3. to determine that the strategy will be effective.

  4. to analyze the cost of the different strategies.

 

1   The main website for the American Society for Industrial Security (ASIS International) can be found here: http://www.asisonline.org/

2   It is important for the security architect to be aware of emerging trends in their areas of responsibility as well as existing standards. The integration of cloud computing solutions within the security architecture of many organizations is an ongoing process that requires attention and consideration. There are many factors that the security architect will need to consider as they ensure that cloud-based solutions and platforms are fully integrated and properly secured as part of the organization’s security architecture. The need to address Service Level Agreements (SLAs) and hosting arrangements with regard to BCP/DRP is just one small area for the security architect to be concerned with. Several good starting points can be found here:

  1. The Cloud Security Alliance main web site: https://cloudsecurityalliance.org/

  2. European Network and Information Security Agency (ENISA) Cloud Computing Information Assurance Framework, 11/20/2009: http://www.enisa.europa.eu/activities/risk-management/files/deliverables/cloud-computing-information-assurance-framework

  3. European Network and Information Security Agency (ENISA) Cloud Computing Risk Assessment, 11/20/2009: http://www.enisa.europa.eu/activities/risk-management/files/deliverables/cloud-computing-risk-assessment

3   See the following for the NIST Special Publication 800-30 Revision 1 Guide for Conducting Risk Assessments: http://csrc.nist.gov/publications/nistpubs/800-30-rev1/sp800_30_r1.pdf (page 21)

4    Table 2 is found in the Centers for Disease Control and Prevention (CDC) Draft Risk Assessment Report, submitted 2007, on page 19. The full draft report can be found here: http://csrc.nist.gov/groups/SMA/fasp/areas.html

The report is listed under the Incident Response Capability section, and is titled “Business Continuity Plan Functional Test After-Action Report - (CDC)”

5    See the following for a simple template that is available for free from Microsoft’s Office templates download library: http://office.microsoft.com/en-us/templates/risk-assessment-and-financial-impact-model-TC001184173.aspx

This template is designed to assist executives, risk management, and line management in analyzing the potential risks facing an organization, as well as mitigating controls that can be used to manage these risks. It allows you to weight the financial impact and probability of specified risks versus the cost of controls, in order to facilitate a cost/benefit decision analysis.

6    See the following for the Canadian Institute of Chartered Accountants (CICA) website: http://www.cica.ca/index.aspx

7    See the following for the initial announcement of the draft publication of KonTraG:

Remarks by Dr. Gerhard Cromme

Chairman of the Government Commission German Corporate Governance Code, on the publication of the draft German Corporate Governance Code, December 18, 2001, in Düsseldorf

http://www.corporate-governance-code.de/eng/news/rede-crommes.html

8    See the following for the Committee of Sponsoring Organizations (COSO) website: http://www.coso.org/

9    See the following for the ISO 31000 Standard: http://www.iso.org/iso/home/standards/iso31000.htm

10    See the following for the ISO/IEC 31010 Standard: http://www.iso.org/iso/catalogue_detail?csnumber=51073

11    See the following for ISO Guide 73: http://www.pqm-online.com/assets/files/standards/iso_iec_guide_73-2009.pdf

12    See the following for the Global Risks 2012 Insight report prepared by the World Economic Forum: http://www3.weforum.org/docs/WEF_GlobalRisks_Report_2012.pdf

The Global Risks Insight report is a yearly series that is published by the World Economic Forum. The WEF is an independent international organization committed to improving the state of the world by engaging business, political, academic and other leaders of society to shape global, regional and industry agendas. See their main web site here: http://www.weforum.org/

13    There are many sources for regional based analysis and information on risks and threats, both by country and internationally. A small sampling of these would include the United Nations and all of its various working bodies and committees, the World Economic Forum, the Organization for Economic Co-Operation and Development (OECD), The World Bank, and the International Monetary Fund (IMF), all of which engage in risk and threat analysis by region globally and publish their findings in a variety of formats.

14    See the following for specific hazards information by category from the USGS: http://www.usgs.gov/natural_hazards/

In 2010, the USGS realigned its organizational structure around the missions identified in the USGS Science Strategy. The Natural Hazards Mission Area includes six science programs: Coastal & Marine Geology, Earthquake Hazards, Geomagnetism, Global Seismographic Network, Landslide Hazards, and Volcano Hazards. Through these programs, the USGS provides alerts and warnings of geologic hazards and supports the warning responsibilities of the National Oceanic and Atmospheric Administration (NOAA) for geomagnetic storms and tsunamis. The Coastal and Marine Geology Program supports all the missions of the USGS, characterizing and assessing coastal and marine processes, conditions, change and vulnerability.

The Natural Hazards Mission Area is responsible for coordinating USGS response following disasters and overseeing the bureau’s emergency management activities. The mission area coordinates long-term planning across the full USGS hazards science portfolio, including activities funded through many other programs across the bureau, including floods, hurricanes and severe storms, and wildfires.

15    See the following for information on the UNISDR: http://www.eird.org/index-eng.htm

16    See the following for the Hyogo Framework for Action document: http://www.eird.org/mah/hyogo-framework-for-action-english.pdf

17    The International Charter aims at providing a unified system of space data acquisition and delivery to those affected by natural or man-made disasters through Authorized Users. Each member agency has committed resources to support the provisions of the Charter and thus is helping to mitigate the effects of disasters on human life and property. The International Charter was declared formally operational on November 1, 2000. See the following for the text of the Charter On Cooperation To Achieve The Coordinated Use Of Space Facilities In The Event Of Natural Or Technological Disasters Rev.3 (25/4/2000): http://www.disasterscharter.org/web/charter/charter

For a list of organizations that currently support the Charter, see the following: http://www.disasterscharter.org/web/charter/members

For an interactive mashup of categorized disasters from 2000 through the present, using the Charter Geographic Tool online, see the following: http://engine.mapshup.info/charterng/

18    The PREVIEW program of the European Commission is chartered with providing Geo-information services for risk management on a European level. PREVIEW is an EC co-funded research project looking for new techniques to better protect European citizens against environmental risks and to reduce their consequences.

See the following for more information on PREVIEW: http://www.preview-risk.com/site/FO/scripts/myFO_accueil.php?lang=EN&flash=1

19    See the following for the full abstract of this paper: http://www.nat-hazards-earth-syst-sci.net/6/779/2006/nhess-6-779-2006.pdf

The citation for the paper is as follows: Nat. Hazards Earth Syst. Sci., 6, 779–802, 2006, www.nat-hazards-earth-syst-sci.net/6/779/2006/

20    See the following for the full briefing document: http://www.preventionweb.net/files/section/230_KabulDRRPresentation.pdf

The briefing was prepared and presented by Wahid A Ahad, architect and urban designer and Technical Deputy Mayor of Kabul, in June 2010.

21    See the following for more information on the Centre for Risk Studies Catastrophe Risk Management for Natural and Man-Made Perils research programs: http://www.risk.jbs.cam.ac.uk/research/programmes/catastropherisk.html

22    See the following for information on the REALISE network: http://realise.unistra.fr/en/the-network/

See the following for information on the research being conducted on natural and man-made risks: http://realise.unistra.fr/en/risques-naturels-et-anthropiques/

23    See the following for the full report: http://www.dhs.gov/xlibrary/assets/nipp_it_baseline_risk_assessment.pdf

24    See the following for the full report: http://www.dhs.gov/xlibrary/assets/itsrm-for-internet-routing-report.pdf

25    Some general overarching risk and threat themes exist regardless of industry or mission, such as the ones cited below:

  1. See the following report for an overview of risks associated with the supply chain and transportation areas common to all businesses today in some form:

    -   http://www3.weforum.org/docs/WEF_SCT_RRN_NewModelsAddressingSupplyChainTransportRisk_IndustryAgenda_2012.pdf

  2. See the following report for an overview of risks associated with talent acquisition, retention and management that are common to all businesses today:

    -   http://www3.weforum.org/docs/PS_WEF_GlobalTalentRisk_Report_2011.pdf

26    The Insurance Institute for Business & Home Safety produces a yearly commercial series of articles on a variety of topics important to security architects and businesses. See the following for their article “Recovery Priorities for Cost-Effective Business Continuity Planning”: http://disastersafety.org/wp-content/uploads/03_comms-priorities.pdf

27    See the following for an overview abstract of the ISO/IEC 27031 standard: http://www.iso.org/iso/catalogue_detail?csnumber=44374

28    See the following for an overview abstract of the PAS 200 Specification: http://shop.bsigroup.com/en/ProductDetail/?pid=000000000030252035

See the following for an overview summary of the Publicly Available Specification, PAS 200, which the UK Cabinet Office and the British Standards Institute (BSI) jointly published after a peer review process: http://www.regesterlarkin.com/uploads/PAS_200_An_assessment_by_Regester_Larkin_2011.pdf

29    See the following for an overview abstract of the Published Document, PD 25666: http://shop.bsigroup.com/ProductDetail/?pid=000000000030203702

See the following for additional summary information on PD 25666: http://www.continuityforum.org/content/news/press_release/126935/bsi-guidance-exercising-and-testing-pd256662010-now-available

30    See the following for an overview abstract of the Published Document, PD 25111: http://shop.bsigroup.com/en/ProductDetail/?pid=000000000030229830

31    See the following for an overview abstract of the ISO/IEC 24762 standard: http://www.iso.org/iso/catalogue_detail?csnumber=41532

See the following for a Draft Standards copy of the ISO/IEC 24762 standard as suggested for implementation by the Uganda National Bureau of Standards in 2012: http://www.unbs.go.ug/resources/DUS%20ISO%20IEC%2024762.pdf

32    See the following for an overview abstract of the ISO/PAS 22399 standard: http://www.iso.org/iso/catalogue_detail?csnumber=50295

See the following for an overview summary of the ISO/PAS 22399 standard from the Ghana Standards Board: http://www.gsa.gov.gh/site/pdf/ISO%20PAS%2022399.pdf?phpMyAdmin=2dc4ecf5c1bt1ce2db13

33    See the following for an overview abstract of the ISO/IEC 27001 standard: http://www.iso.org/iso/catalogue_detail?csnumber=42103

See the following for the YouTube ISO/IEC27001 channel: http://www.youtube.com/channel/HCaqgODxx3tGs

See the following for the British Standards Institute’s (BSI) product guide for ISO/IEC27001: http://www.bsigroup.com/Documents/iso-27001/resources/BSI-ISOIEC27001-Product-Guide-UK-EN.pdf

34    See the following for the NIST Special Publication 800-34 Rev.1 Contingency Planning Guide for Federal Information Systems: http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf

35    See the following for an academic paper on the data currency problem, that is, how to determine the currency of data within the constraints of a system-defined architecture: http://homepages.inf.ed.ac.uk/fgeerts/pdf/currency.pdf

See the following paper for research on a data currency model: http://informatique.umons.ac.be/ssi/jef/time-wijsen.pdf

36    SHARE is a volunteer-run user group for IBM mainframe computers that was founded in 1955 by Los Angeles-area IBM 701 users. It has evolved into a forum for exchanging technical information about programming languages, operating systems, database systems, and user experiences for enterprise users of small, medium, and large-scale IBM computers. The SHARE web site can be found here: http://www.share.org/p/cm/ld/fid=1

In 1992, the SHARE user group in the United States, in combination with IBM, defined a set of Disaster Recovery tier levels. The Seven Tiers of Disaster Recovery solutions offer a simple methodology for defining the current service level, the current risk, and the target service level and target environments for a business. The 7 tiers are as follows:

Tier 0: No off-site data – Possibly no recovery

Tier 1: Data backup with no hot site

Tier 2: Data backup with a hot site

Tier 3: Electronic vaulting

Tier 4: Point-in-time copies

Tier 5: Transaction integrity

Tier 6: Zero or near-Zero data loss

Tier 7: Highly automated, business integrated solution

The original publication document that defines and discusses the 7 tiers can be found here: http://www.redbooks.ibm.com/redbooks/pdfs/sg246844.pdf
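As an illustration only, the seven tiers listed above could be modeled as an ordered enumeration so that a current service level can be compared with a target service level. The names DRTier and tier_gap are hypothetical and are not part of the SHARE or IBM material; this is a minimal sketch that assumes the tiers are strictly ordered from 0 to 7.

```python
from enum import IntEnum


class DRTier(IntEnum):
    """Hypothetical encoding of the SHARE/IBM disaster recovery tiers."""
    NO_OFFSITE_DATA = 0                # Tier 0: possibly no recovery
    BACKUP_NO_HOT_SITE = 1             # Tier 1
    BACKUP_WITH_HOT_SITE = 2           # Tier 2
    ELECTRONIC_VAULTING = 3            # Tier 3
    POINT_IN_TIME_COPIES = 4           # Tier 4
    TRANSACTION_INTEGRITY = 5          # Tier 5
    NEAR_ZERO_DATA_LOSS = 6            # Tier 6
    AUTOMATED_BUSINESS_INTEGRATED = 7  # Tier 7


def tier_gap(current: DRTier, target: DRTier) -> list:
    """Return the tiers above `current`, up to and including `target`.

    An empty list means the current tier already meets or exceeds the target.
    """
    return [tier for tier in DRTier if current < tier <= target]


if __name__ == "__main__":
    # Example: a business at Tier 1 that wants to reach Tier 4.
    for tier in tier_gap(DRTier.BACKUP_NO_HOT_SITE, DRTier.POINT_IN_TIME_COPIES):
        print(f"Tier {int(tier)}: {tier.name}")
```

Treating the tiers as integers keeps the comparison between the current and target levels simple; it says nothing about the cost or effort of moving between tiers, which the source publication discusses.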

37    See the following for the NIST Special Publication 800-34 Rev.1 Contingency Planning Guide for Federal Information Systems: http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (page 32).
