Chapter 21. Testing a Data Warehouse

A data warehouse is a central repository of data made available to users. The centralized storage of data provides significant processing advantages but at the same time raises concerns of the data’s security, accessibility, and integrity. This chapter focuses on where testing would be most effective in determining the risks associated with those concerns.

Overview

This testing process lists the more common concerns associated with the data warehouse concept. It also explains the more common activities performed as part of a data warehouse. Testing begins by determining the appropriateness of those concerns to the data warehouse process under test. If appropriate, the severity of the concerns must be determined. This is accomplished by relating those high-severity concerns to the data warehouse activity controls. If in place and working, the controls should minimize the concerns.

Concerns

The following are the concerns most commonly associated with a data warehouse:

  • Inadequate assignment of responsibilities. There is inappropriate segregation of duties or failure to recognize placement of responsibility.

  • Inaccurate or incomplete data in a data warehouse. The integrity of data entered in the data warehouse is lost because of inadvertent or intentional acts.

  • Losing an update to a single data item. One or more updates to a single data item can be lost because of inadequate concurrent update procedures.

  • Inadequate audit trail to reconstruct transactions. The use of data by multiple applications may split the audit trail among those applications and the data warehouse software audit trail.

  • Unauthorized access to data in a data warehouse. The concentration of data may make sensitive data available to anyone who gains access to the data warehouse.

  • Inadequate service level. Multiple users vying for the same resources may degrade the service to all because of excessive demand or inadequate resources.

  • Placing data in the wrong calendar period. Identifying transactions with the proper calendar period is more difficult in some online data warehouse environments than in others.

  • Failure of data warehouse software to function as specified. Vendors provide most data warehouse software, making the data warehouse administrator dependent on the vendor to ensure the proper functioning of the software.

  • Improper use of data. Systems that control resources are always subject to misuse and abuse.

  • Lack of skilled independent data warehouse reviewers. Most reviewers are not skilled in data warehouse technology and, thus, have not evaluated data warehouse installations.

  • Inadequate documentation. Documentation of data warehouse technology is needed to ensure consistency of understanding and use by multiple users.

  • Loss of continuity of processing. Many organizations rely heavily on data warehouse technology for the performance of their day-to-day processing.

  • Lack of criteria to evaluate. Without established performance criteria, an organization cannot be assured that it is achieving its data warehouse goals.

  • Lack of management support. Without adequate resources and “clout,” the advantages of data warehouse technology may not be achieved.

Workbench

Figure 21-1 illustrates the workbench for testing the adequacy of the data warehouse activity. The workbench is a three-task process that measures the magnitude of the concerns, identifies the data warehouse activity processes, and then determines the tests necessary to determine whether the high-magnitude concerns have been adequately addressed. Those performing the test must be familiar with the data warehouse activity processes. The end result of the test is an assessment of the adequacy of those processes to minimize the high-magnitude concerns.

Workbench for testing a data warehouse.

Figure 21-1. Workbench for testing a data warehouse.

Input

Organizations implementing the data warehouse activity need to establish processes to manage, operate, and control that activity. The input to this test process is knowledge of those data warehouse activity processes. If the test team does not have that knowledge, it should be supplemented with one or more individuals who possess a detailed knowledge of the data warehouse activity processes.

Enterprise-wide requirements are data requirements that are applicable to all software systems and their users. Whenever anyone accesses or updates a data warehouse, that process is subject to the enterprise-wide requirements. They are called enterprise-wide requirements because they are defined once for all software systems and users.

Each organization must define its own enterprise-wide controls. However, testers should be aware that many IT organizations do not define enterprise-wide requirements. Therefore, testers need to be aware that there may be inconsistencies between software systems and/or users. For example, if there are no security requirements applicable enterprise-wide, each software system may have different security procedures.

Enterprise-wide requirements applicable to the data warehouse include but are not limited to the following:

  • Data accessibility. Who has access to the data warehouse, and any constraints or limitations placed on that access.

  • Update controls. Who can change data within the data warehouse as well as the sequence in which data may be changed in the data warehouse.

  • Date controls. The date that the data is applicable for different types of processes. For example, with accounting data it is the date that the data is officially recorded on the books of the organization.

  • Usage controls. How data can be used by the users of the data warehouse, including any restrictions on users forwarding data to other potential users.

  • Documentation controls. How the data within the data warehouse is to be described to users.

Do Procedures

To test a data warehouse, testers should perform the following three tasks:

  1. Measure the magnitude of data warehouse concerns.

  2. Identify data warehouse activities to test

  3. Test the adequacy of data warehouse activity processes

Task 1: Measure the Magnitude of Data Warehouse Concerns

This task involves two activities. The first activity is to confirm that the 14 data warehouse concerns described earlier are appropriate for the organization. The list of concerns can be expanded or reduced. In addition, it may be advisable to change the wording of the concerns to wording more appropriate for the culture of the organization under test. For example, Concern 1 is inadequate assignment of responsibilities. If it is more appropriate in your organization to talk of job description responsibilities, make the appropriate change.

Once the list of potential data warehouse concerns has been finalized, the magnitude of those concerns must be determined. Work Paper 21-1 should be used to rate the magnitude of the data warehouse concerns. If the list of concerns has been modified, Work Paper 21-1 will also have to be modified.

Table 21-1. Rating the Magnitude of Data Warehouse Concerns

Worksheet Concern #1: Inadequate Assignment of Responsibilities

Description of Concern:

There is inappropriate segregation of duties or failure to recognize placement of responsibility.

  

YES

NO

COMMENTS

1.

Has a charter been established for the database administration function outlining the role and responsibilities for the function?

   

2.

Have the user responsibilities regarding the integrity of the data warehouse been defined?

   

3.

Have job descriptions been modified for all individuals interfacing with the data warehouse to define their data warehouse responsibilities?

   

4.

Have job descriptions been developed for full-time data warehouse administration personnel?

   

5.

Has a formal method of resolving data warehouse disputes been established?

   

6.

Does the organization have a data policy which outlines organizational data responsibility?

   

7.

Are the functions being performed by data warehouse administration within that administration’s formal role and responsibility?

   
 

Percent of No responses

 

%

 

Worksheet Concern #2: Inaccurate or Incomplete Data in a Data Warehouse

Description of Concern:

The integrity of data entered in the data warehouse is lost due to inadvertent or intentional acts.

  

YES

NO

COMMENTS

1.

Has each element of data in the data warehouse been identified?

   

2.

Have the data validation rules for each data element been documented?

   

3.

Have the data validation rules for each data element been implemented?

   

4.

Are the data validation rules adequate to ensure the accuracy of data?

   

5.

Have procedures been established to ensure the consistence of redundant data elements?

   

6.

Have procedures been established for the timely correction of data entry errors?

   

7.

Are procedures established to promptly notify all users of the data warehouse when an inaccuracy or incomplete data condition has been identified?

   

8.

Are the data warehouse administration tools and techniques adequate to ensure the consistency of redundant data elements?

   
 

Percent of No responses

 

%

 

Worksheet Concern #3: Losing an Update to a Single Data Item

Description of Concern:

One or more updates to a single data item can be lost due to inadequate concurrent update procedures.

  

YES

NO

COMMENTS

1.

Does the data warehouse software in use have a lockout feature to prevent concurrent updates to a single data item?

   

2.

Does the data warehouse software have a feature to resolve deadlock in accessing data (for example, user A has item 1 and wants item 2, while user B has item 2 and wants item 1)?

   

3

Has the sequencing of updates to the data warehouse been defined?

   

4.

Are there controls in the data warehouse software to ensure that events can only be recorded in the predetermined sequence?

   

5.

Have the parties that can create, update, or delete a data element been identified?

   
 

Percent of No responses

 

%

 

Worksheet Concern #4: Inadequate Audit Trail

Description of Concern:

The use of data by multiple applications may split the audit trail among those applications and the data warehouse software audit trail.

  

YES

NO

COMMENTS

1.

Has the audit trail for data warehouse applications been identified and documented?

   

2.

Has the retention period for each part of the data warehouse audit trail been determined?

   

3.

Is a data warehouse software log maintained?

   

4.

Does management determine what information will be maintained in the data warehouse software log?

   

5.

Can the audit trail trace source transactions to control totals and trace control totals back to the initiating transactions?

   

6.

Can the audit trail provide the evidence needed to reconstruct transaction processing?

   

7.

Is the audit trail in operation whenever the data warehouse is in operation?

   

8.

Are all overrides of normal data warehouse software procedures recorded on the data warehouse software log?

   

9.

Can the application audit trail records be cross-referenced to the data warehouse software log audit trail records?

   
 

Percent of No responses

 

%

 

Worksheet Concern #5: Unauthorized Access to Data in a Data Warehouse

Description of Concern:

The concentration of sensitive data may make it available to anyone gaining access to a data warehouse.

  

YES

NO

COMMENTS

1.

Have all of the data elements requiring security procedures been identified?

   

2.

Have all of the data warehouse users been identified?

   

3.

Has a user profile been established indicating which resources can be accessed by which users?

   

4.

Has the enforcement of the user profile been automated?

   

5.

Is the access mechanism, such as passwords, protected from unauthorized manipulation?

   

6.

Has the organization established a data warehouse security officer function (note that this need not be a full-time function)?

   

7.

Are security violators promptly punished?

   

8.

Are formal records maintained on security violations?

   

9.

Are security violation summaries presented to management in regular reports?

   
 

Percent of No responses

 

%

 

Worksheet Concern #6: Inadequate Service Level

Description of Concern:

Multiple users contesting for the same resources may degrade the service to all due to excessive demand or inadequate resources.

  

YES

NO

COMMENTS

1.

Has the level of service that is desired been documented?

   

2.

Are procedures established to monitor the desired level of service to users?

   

3.

Are users encouraged, by the use of such techniques as varying chargeout rates, to spread out their nonurgent processing?

   

4.

Have the identified options to improve service when it degrades been identified?

   

5.

Does the data warehouse administrator continually monitor the service level and make adjustments where appropriate?

   

6.

Are steps to take established at points where service level degrades?

   

7.

Do procedures identify the cause of degradation in service, such as a single user consuming exorbitant amounts of resources, so that action can be taken to eliminate those causes where appropriate?

   
 

Percent of No responses

 

%

 

Worksheet Concern #7: Placing Data in the Wrong Calendar Period

Description of Concern:

Identifying transactions with the proper calendar period is more difficult in some on-line data warehouse environments than in others.

  

YES

NO

COMMENTS

1.

Do procedures identify the criteria for determining into which accounting period transactions are placed?

   

2.

Are all postdated transactions date-stamped to identify the accounting period in which they belong?

   

3.

Are procedures established to cut off processing at the end of significant accounting periods, such as at year-end?

   

4.

For applications where data must be segregated into accounting periods, are significant transactions entered both immediately before and immediately after the accounting cutoff period manually reviewed to ensure they are in the appropriate accounting period?

   

5.

Are formal procedures established to move data from one accounting period to another if appropriate?

   
 

Percent of No responses

 

%

 

Worksheet Concern #8: Failure of Data Warehouse Software to Function as Specified

Description of Concern:

Most data warehouse software is provided by vendors, making the data administrator dependent on the vendor to assure the proper functioning of the software.

  

YES

NO

COMMENTS

1.

Have the processing expectations been determined?

   

2.

Is the data warehouse software evaluated to determine that it performs in accordance with the predetermined requirements?

   

3.

Is each new release of data warehouse software thoroughly tested?

   

4.

Has a maintenance contract for the data warehouse software been established?

   

5.

Are procedures established to identify data warehouse software problems?

   

6.

Are operations personnel trained to identify and report data warehouse software problems?

   

7.

Have backup procedures been developed for use in the event of a data warehouse software failure?

   

8.

Are data warehouse software failures recorded and regularly reported to the data warehouse administrator?

   

9.

Are the vendors promptly notified in the event of a data warehouse software problem so that they can take appropriate action?

   
 

Percent of No responses

 

%

 

Worksheet Concern #9: Fraud/Embezzlement

Description of Concern:

Systems that control resources are always subject to fraud and embezzlement.

  

YES

NO

COMMENTS

1.

Do data warehouse administration personnel have access to the data in the data warehouse?

   

2.

Has methodology been established for designing data warehouse controls?

   

3.

Has the data warehouse been reviewed within the last year by an independent reviewer?

   

4.

Have procedures been established to identify and report errors, omissions, and frauds to senior management?

   

5.

Are all data warehouse resources access controlled?

   

6.

Are passwords or other access control procedures changed at least every six months?

   

7.

Are all error messages acted upon in a timely fashion?

   

8.

Are deviations from normal processing investigated?

   

9.

Do data validation routines anticipate and report on unusual processing?

   
 

Percent of No responses

 

%

 

Worksheet Concern #10: Lack of Independent Data Warehouse Reviews

Description of Concern:

Most reviewers are not skilled in data warehouse technology and thus have not evaluated data warehouse installations; in addition, many auditor software packages cannot access data warehouse software.

  

YES

NO

COMMENTS

1.

Is there an internal audit function having jurisdiction over reviewing data warehouse technology?

   

2.

Is there an EDP quality assurance group having jurisdiction over reviewing data warehouse technology?

   

3.

Does either of these groups have adequate skills to perform such a review?

   

4.

Has an independent review of data warehouse technology been performed within the last 12 months?

   

5.

Was a report issued describing the findings and recommendations from that review?

   

6.

Were the findings and recommendations reasonable based upon the current use of data warehouse technology?

   

7.

Is an independent review of data warehouse technology planned during the next 12 months?

   
 

Percent of No responses

 

%

 

Worksheet Concern #11: Inadequate Documentation

Description of Concern:

Documentation of data warehouse technology is needed to ensure consistency of understanding and use by multiple users.

  

YES

NO

COMMENTS

1.

Do data documentation standards exist?

   

2.

Are data documentation standards enforced?

   

3.

Is a data dictionary used to document the attributes of data elements?

   

4.

Is a data dictionary integrated into the data warehouse software operation, so that the only entry into data warehouse software-controlled data is through the data dictionary?

   

5.

Does the data warehouse administration group provide counsel in documenting and using data?

   

6.

Does the data documentation contain the data validation rules?

   
 

Percent of No responses

 

%

 

Worksheet Concern #12: Continuity of Processing

Description of Concern:

Many organizations rely heavily on data warehouse technology for the performance of their day-to-day processing.

  

YES

NO

COMMENTS

1.

Have the potential causes of data warehouse failure been identified?

   

2.

Has the impact of each of those failures on the organization been assessed?

   

3.

Have procedures been developed to continue processing during a data warehouse failure?

   

4.

Are procedures established to ensure that the integrity of the data warehouse can be restored after data warehouse failure?

   

5.

Has the sequence of actions necessary to restore applications after a data warehouse failure been documented?

   

6.

Have computer operations personnel been trained to data warehouse recovery procedures?

   

7.

Is sufficient backup data stored off-site to permit reconstruction of processing in the event of a disaster?

   

8.

Are records maintained on data warehouse failures so that specific analysis can be performed?

   
 

Percent of No responses

 

%

 

Worksheet Concern #13: Lack of Performance Criteria

Description of Concern:

Without established performance criteria, an organization cannot be assured that it is achieving data warehouse goals.

  

YES

NO

COMMENTS

1.

Have measurable objectives for data warehouse technology been established?

   

2.

Are those objectives monitored to determine whether they are achieved?

   

3.

Can the cost associated with data warehouse technology be identified?

   

4.

Can the benefits associated with data warehouse technology be identified?

   

5.

Was a cost/benefit analysis prepared for the installation and operation of data warehouse technology?

   

6.

Has the cost/benefit projection been monitored to measure whether those projections have been achieved?

   

7.

Is the achievement of the performance criteria evaluated by an independent group, such as EDP quality assurance?

   
 

Percent of No responses

 

%

 

Worksheet Concern #14: Lack of Management Support

Description of Concern:

Without adequate resources and “clout,” the advantages of data warehouse technology may not be achieved.

  

YES

NO

COMMENTS

1.

Has a member of senior management been appointed responsible for managing data for the organization?

   

2.

Was senior management involved in the selection of the organization’s data warehouse technology approach?

   

3.

Has a review board been established comprising users, EDP personnel, and senior managers to oversee the use of data warehouse technology?

   

4.

Has data processing management attended courses on the use of data warehouse technology?

   

5.

Has senior management requested regular briefing and/or reports on the implementation and use of data warehouse technology?

   

6.

Has senior management been involved in the preparation of a long-range plan for use of information in the organization?

   

7.

Is senior management involved in the settlement of disputes over the attributes or use of information in the organization?

   
 

Percent of No responses

 

%

 

To use Work Paper 21-1, a team of testers knowledgeable in both testing and the data warehouse activity should be assembled. For each concern, Work Paper 21-1 lists several criteria. The criteria should each be answered with a Yes or No response. The test team should have a consensus on the response. A Yes response means that the criterion has been met. Being met means that it is both in place and used. For example, Criterion 1 for Concern 1 asks whether a charter has been established for a data warehouse administration function. A Yes response means that the charter has been established and is, in fact, in place and used. A No response means that either the criterion has not been established or it is not being used. The Comments column is available to clarify the Yes and No responses.

At the conclusion of rating the criteria for each concern, the percent of No responses should be calculated. For example, the first concern lists seven criteria. If three of the seven criteria have a No response, then approximately 43 percent would have received a No response.

When Work Paper 21-1 has been completed, the results should be posted to Work Paper 21-2. For example, if Concern 1 received 43 percent of No responses, the bar on Work Paper 21-2 would be completed vertically above Concern 1 on Work Paper 21-2 to 43 percent. This would put that concern in the “medium” category. At the conclusion of this task, Work Paper 21-2 will show the magnitude of the data warehouse concerns.

Table 21-2. Magnitude of Data Warehouse Concerns

Field Requirements

FIELD

INSTRUCTIONS FOR ENTERING DATA

% of No Responses

This column divides the percentage of No responses into three categories: low, medium, and high.

Concern Ratings

The rating for the concerns listed on this work paper represents the percentages of No responses calculated for those concerns on Work Paper 21-1.

Data Warehouse Concerns

These are the 14 data warehouse concerns described earlier in this chapter.

Magnitude of Data Warehouse Concerns

% OF NO RESPONSES

CONCERN RATINGS

              

100%

High

              

68%

               

67%

Medium

              

34%

               

33%

Low

              

0%

               

Task 2: Identify Data Warehouse Activity Processes to Test

There are many ways organizations can establish a data warehouse activity. Associated with the data warehouse are a variety of processes. This section describes the more common processes associated with data warehouse activity.

Organizational Process

The data warehouse introduces a new function into the organization, and with that function comes a shifting of responsibilities. Much of this shifting involves a transfer of responsibilities from the application system development areas and the user areas to a centralized data warehouse administration function.

The introduction of the data warehouse is normally associated with the organization of a formal data warehouse administration group. This group usually reports within the data processing function and frequently directly to the data processing manager. The objective of the data warehouse administration function is to oversee and direct the installation and operation of the data warehouse.

The data warehouse administration function normally has line responsibilities for data documentation, system development procedures, and standards for those applications using data warehouse technology. The database administrator (DBA) function also has indirect or dotted-line responsibilities to computer operations and users of data warehouse technology through providing advice and direction. In addition, the data warehouse administrator should be alert to potential problems and actively involved in offering solutions.

Studies on the success of data warehouse technology strongly indicate the need for planning. A key part of this planning is the integration of the data warehouse into the organizational structure. This integration requires some reorganization within both the data processing and user areas.

Data Documentation Process

The transition to data warehouse technology involves the switching of information technology emphasis from processing to data. Many existing systems are process-driven, whereas data warehouse technology involves data-driven systems. This change in emphasis necessitates better data documentation.

If multiple users are using the same data, documentation should be easy-to-use and complete. Misunderstandings regarding the data’s content, reliability, consistency, and so on will lead to problems in the data’s interpretation and use. Clear, distinct documentation helps reduce this risk.

Many organizations use standardized methods of data documentation. The simplest method is to use forms and written procedures governing the method of defining data. More sophisticated installations use data dictionaries. The data dictionary can be used as a standalone automated documentation tool or integrated into the processing environment.

The data warehouse administrator normally oversees the use of the data dictionary. This involves determining what data elements will be documented, the type and extent of documentation requested, and assurance that the documentation is up-to-date and in compliance with the documentation quality standards.

The documentation requirement for data is a threefold responsibility. First, individuals must be educated in the type of documentation required and provide that documentation. Second, the documentation must be maintained to ensure its accuracy and completeness. Third, the data used in the operating environment must conform to the documentation. If the data in operation is different from the documentation specifications, the entire process collapses.

System Development Process

Data warehouse technology is designed to make system development easier; however, this occurs only when the application system fits into the existing data hierarchy. If the system requirements are outside the data warehouse structure, it may be more difficult and costly to develop that system by using the data warehouse than by using non–data warehouse methods.

One method of ensuring that applications effectively use data warehouse technology is to have data warehouse administration personnel involved in the development process. In other words, more front-end planning and assessment are required to ensure the effective use of data warehouse technology than when the data warehouse is not used. This front-end effort also ensures that the application project team understands the resources available through data warehouse technology.

The data warehouse is a continually changing grouping of data. Part of the data warehouse involvement in system development is to adjust and modify the structure continually to meet the changing needs of application systems. Thus, the development process for the data warehouse is twofold: to ensure that the applications effectively use the data warehouse, and to establish new data warehouse directions in order to keep the data warehouse in step with application needs.

The system development process in the data warehouse technology has the following three objectives:

  • To familiarize the system’s development people with the resources and capabilities available

  • To ensure that the proposed application system can be integrated into the existing data warehouse structure, and if not, to modify the application and/or the data warehouse structure

  • To ensure that application processing will preserve the consistency, reliability, and integrity of data in the data warehouse

A problem that often surfaces in the use of data warehouse technology is how to charge for data warehouse usage. Some applications operate more effectively using the data warehouse than others. Experience has shown that moving some applications from non–data warehouse to data warehouse technology substantially reduces the cost of processing, while in other instances the cost increases. The reason for this is that the data warehouse must be optimized toward a specific data usage. Obviously, the data warehouse administrator must attempt to optimize the data warehouse toward the high-usage applications and let the usage be less efficient in the small-volume applications. The costs associated with a data warehouse should not be allowed to discourage its use.

Access Control Process

One of the major concerns for management about the data warehouse is the ready accessibility of information. As more data is placed into a single repository, that repository becomes more valuable to perpetrators.

The access control function has two primary purposes. The first is to identify the resources requiring control and determine who should be given access to those resources. The second is to define and enforce the control specifications identified in the previous responsibility in the operating environment.

The access control function can be performed by the data warehouse administration function or an independent security officer. Obviously, an independent function is stronger than the same function that administers the data warehouse. The method selected will depend on the value of the information in the data warehouse and the size of the organization. The more valuable the data, or the larger the organization, the more likely it is that the function will be implemented through an independent security officer.

The enforcement of the security profile for the data warehouse in online systems is performed by security software. Some data warehouse management systems have security features incorporated in the data warehouse software, whereas others need to be supplemented by security packages. Many of the major hardware vendors, such as IBM, provide security software. In addition, several independent vendors provide general-purpose security software that interfaces with many data warehouse software systems.

The access control function has the additional responsibility of monitoring the effectiveness of security. Detecting and investigating potential access violations are important aspects of data warehouse access control. Unless the access control procedures are monitored, violators will not be detected; and if violators are not reprimanded or prosecuted, there will be little incentive for other involved parties to comply with access control rules.

Data Integrity Process

The integrity of the contents of the data warehouse is the joint responsibility of the users and the data warehouse administrator. The data warehouse administrator is concerned more about the integrity of the structure and the physical records, while the users are concerned about the contents or values contained in the data warehouse.

The integrity of dedicated files is primarily the responsibility of the user. The data processing department has a responsibility to use the correct version of the file and to add features that protect the physical integrity of the records on the file. However, the ultimate responsibility for the integrity resides with the user, and the application systems need to be constructed to ensure that integrity. This is usually accomplished by accumulating the values in one or more control fields and developing an independent control total that can be checked each time the file is used.

In a data warehouse environment, the traditional integrity responsibilities change. No longer does a single user have control over all the uses of data in a data warehouse. Several different application systems may be able to add, delete, or modify any single data element in the data warehouse. For example, in an airline reservation system, any authorized agent can commit or delete a reserved seat for a flight. On the other hand, the data warehouse administrator doesn’t have control over the uses of the data in the data warehouse. This means that the data integrity must be ensured through new procedures.

The data integrity process may involve many different groups within an organization. These groups, such as various users and the data warehouse administration function, will share parts of this data integrity responsibility. In fulfilling data integrity responsibility, the following tasks need to be performed:

  1. Identify the method of ensuring the completeness of the physical records in the data warehouse.

  2. Determine the method of ensuring the completeness of the logical structure of the data warehouse (i.e. schema).

  3. Determine which users are responsible for the integrity of which segments of the data warehouse.

  4. Develop methods to enable those users to perform their data integrity responsibilities.

  5. Determine the times at which the integrity of the data warehouse will be verified.

Operations Process

The normal evolution of data warehouse operations is from the data warehouse administration function to specialized operations personnel and then to regular computer operators. This evolution is necessary so that an organization can develop the appropriate skills and methods needed for training and monitoring regular computer operators. Without taking the appropriate time to develop skills, operators may be placed into a position where their probability of success is minimal.

Data warehouse technology is more complex than non–data warehouse technology. Normally, the data warehouse is coupled with communication technology. This means that two highly technical procedures are coupled, making the resulting technology more complex than either of the contributing technologies.

Most of the day-to-day operations are performed by users. The data warehouse provides the basis for a user-operated technology. One of the advantages of the data warehouse is the powerful utilities associated with the technology that are available to the users. One of the more powerful utilities is the query languages, which provide almost unlimited capabilities for users to prepare analyses and reports using the data within the data warehouse.

Computer operators face the following challenges when operating data warehouse technology:

  • Monitoring space allocation to ensure minimal disruptions because of space management problems

  • Understanding and using data warehouse software operating procedures and messages

  • Monitoring service levels to ensure adequate resources for users

  • Maintaining operating statistics so that the data warehouse performance can be monitored

  • Reorganizing the data warehouse as necessary (usually under the direction of the data warehouse administrator) to improve performance and add capabilities where necessary

Backup/Recovery Process

One of the most technically complex aspects of data processing is recovering a crashed data warehouse. Recovery can occur only if adequate backup data is maintained. The recovery procedure involves the following four major challenges:

  1. Verifying that the integrity of the data warehouse has been lost.

  2. Notifying users that the data warehouse is inoperable and providing them with alternate processing means. (Note: These means should be predetermined and may be manual.)

  3. Ensuring and having adequate backup data ready.

  4. Performing the necessary procedures to recover the integrity of the data warehouse.

Many data warehouses are operational around the clock during business days, and some, seven days a week. It is not uncommon for many thousands of transactions to occur in a single day. Thus, unless recovery operations are well planned, it may take many hours or even days to recover the integrity of the data warehouse. The complexity and planning that must go into data warehouse contingency planning cannot be overemphasized.

The responsibility for data warehouse recovery is normally that of computer operations. However, the users and data warehouse administrators must provide input into the recovery process. The data warehouse administrator usually develops the procedures and acquires the recovery tools and techniques. The user provides the specifications on time span for recovery and determines what alternate procedures are acceptable during this period.

One of the problems encountered is notifying users that the data warehouse is no longer operational. In larger organizations, there may be many users, even hundreds of users, connected to a single data warehouse. It may take longer to notify the users that the data warehouse is not operational than it will take to get the data warehouse back online. The group involved in the recovery may not have adequate resources to inform all of the users. Some of the procedures to inform users of the loss of data warehouse integrity include:

  • Sending messages to terminals if facilities to transmit are available

  • Having a telephone number that users can call to determine if the data warehouse is down

  • Providing users with the service expectations so that when those expectations are not met the users can assume a data warehouse problem has occurred

The backup/recovery process begins with determining what operations must be recovered and in what time frame. This provides the recovery specifications. From these specifications, the procedures are developed and implemented to meet the recovery expectations. Much of the process involves collecting and storing backup data; thus, it is very important for all involved parties to agree on what backup data is needed.

Performing Task 2

Two events are associated with this task. First, use Work Paper 21-3 to determine whether the preceding activities are appropriate to your data warehouse activity, supplementing or reducing the list of data warehouse activities, as necessary. In addition, you should change the process name to the specific vocabulary of your organizational culture.

Table 21-3. Data Warehouse Activity Process

  

APPROPRIATE

COMMENTS

YES

NO

N/A

1.

Organizational Process

    

2.

Data Documentation Process

    

3.

System Development Process

    

4.

Access Control Process

    

5.

Data Integrity Process

    

6.

Operations Process

    

7.

Backup/Recovery Process

    

Task 3: Test the Adequacy of Data Warehouse Activity Processes

This task is to evaluate that each of the seven identified processes contains controls that are adequate to reduce the concerns identified earlier in this chapter. A control is any means used to reduce the probability of a failure. The determination of whether the individual applications enter, store, and use the correct data is made using the seven-step process included in Part Three of this book.

Figure 21-2 indicates which activity processes should be tested. This figure is used by first identifying the significant data warehouse concerns, as determined in Task 1. The check marks in the “Data Warehouse Activity Processes” columns indicate which processes should reduce the concerns so that the probability of failure is minimized. For example, if a significant data warehouse concern is that “there is an inadequate assignment of responsibility,” then the three data activity processes of organization, system development, and access control should be tested.

Which data warehouse activities should be tested?

Figure 21- 2. Which data warehouse activities should be tested?

Figure 21-3 shows the types of tests that should be undertaken for each data warehouse activity process for the identified concerns. The tests are those focused on determining that specific controls exist. If those controls exist, then the testers can assume that the process is adequately controlled so that the probability of failure is minimized.

Table 21-3. Organizational control objectives.

ORGANIZATIONAL CONTROL OBJECTIVES

Concern Number

Concern

Test should determine that a control exists

1.

Inadequate assignment of responsibilities

1. To assign data warehouse responsibilities to individuals

  

2. To see that user retains organizational responsibility for the accuracy, completeness, and security of data

  

3. To perform independent reviews to ensure the adequate assignment of responsibilities

2.

Inaccurate or incomplete data in a data warehouse.

1. To see that the organizational structure is designed to ensure the adequate assignment of responsibilities

8.

Failure of the data warehouse software to function as specified

1. To see that the organizational structure is designed to ensure prompt detection and correction of data warehouse software errors

  

2. To document data warehouse expectations

9.

Fraud/embezzlement

1. To divide responsibilities so that an individual cannot perform and conceal a single event

10.

Lack of independent data warehouse reviews

1. To see that a data warehouse review group is established that is independent of the data warehouse function

  

2. To define review responsibilities

11.

Inadequate documentation

1. To document departmental data warehouse organizational responsibilities in the department charter

  

2. To document individual data warehouse responsibilities in their job description

13.

Lack of performance criteria

1. To define data warehouse expectations in measurable terms

14.

Lack of management support

1. To ensure that senior management defines and enforces data policy

  

2. To ensure that senior management participates in data warehouse decision making

  

3. To ensure that senior management supports independent data warehouse review groups

DATA DOCUMENTATION CONTROL OBJECTIVES

4.

Inadequate audit trail

1. To define data warehouse audit trail requirements

  

2. To divide requirements between the user and the DBA function

  

3. To document data warehouse deletions

7.

Placing data in the wrong calendar period

1. To define data accounting requirements

8.

Failure of data warehouse software to function as specified

1. To assign centralized control of external schema

  

2. To define data independently of the applications that use the data

11.

Inadequate documentation

1. To develop an inventory of data elements

  

2. To document data in accordance with documentation standards

  

3. To enforce the use of data as documented

SECURITY/ACCESS CONTROL OBJECTIVES

Concern Number

Concern

Test should determine that a control exists

1.

Inadequate assignment of responsibilities

1. To assign responsibility for security to a function independent of the one requiring security

5.

Unauthorized access in the data warehouse

1. To define access to each data warehouse resource

  

2. To include all individuals involved in data warehouse in the access control process

  

3. To ensure prompt punishment of violators

  

4. To create logs of security-related activities

9.

Fraud/embezzlement

1. To see that security measures address the common methods of fraud

12.

Continuity of processing

1. To ensure that visitors and service personnel are escorted

  

2. To assess the risks of security problems on disruptions to processing

14.

Lack of management support

1. To see that management establishes the desired level of security

  

2. To see that management supports punishment for security violations

COMPUTER OPERATIONS ACTIVITY CONTROL OBJECTIVES

2.

Inaccurate or incomplete data in a data warehouse

1. To ensure that data is not lost or changed due to improper operations

5.

Unauthorized access in a data warehouse

1. To physically protect the data warehouse from unauthorized access

6.

Inadequate service level

1. To minimize both the frequency and the impact of inadequate service level

  

2. To monitor service-level performance

8.

Failure of the data warehouse software to function as specified

1. To monitor data warehouse software failures to determine responsibility and implement fixes as appropriate

11.

Inadequate documentation

1. To document data warehouse software operating procedures and controls

12.

Continuity of processing

1. To plan for expected capacity requirements

  

2. To minimize data warehouse software downtime

13.

Lack of performance criteria

1. To establish data warehouse software expectations

DATA WAREHOUSE BACKUP/RECOVERY CONTROL OBJECTIVES

Concern Number

Concern

Test should determine that a control exists

2.

Inaccurate or incomplete

1. To verify controls after recovery to ensure the integrity of the recovered data warehouse

4.

Inadequate audit trail

1. To maintain records on the recovery process

6.

Inadequate service level

1. To include segments of the application in the recovery process

  

2. To specify assignments

  

3. To retain adequate backup data

8.

Failure of the data warehouse software to function as specified

1. To test the recovery process

11.

Inadequate documentation

1. To document recovery procedures

12.

Continuity of processing

1. To determine expected failure rates

  

2. To specify recovery requirements

  

3. To define alternate processing procedures

  

4. To inform users about service interruptions

DATA WAREHOUSE INTEGRITY CONTROL OBJECTIVES

2.

Inaccurate or incomplete data in a data warehouse

1. To verify the integrity of the initial population of the data warehouse

  

2. To validate conformance to data definition

  

3. To control access for data modification

  

4. To provide adequate backup and recovery methods

  

5. To preserve the integrity of the data warehouse

  

6. To preserve the consistency of data redundancy

  

7. To control the placement of data warehouse data on media and devices

  

8. To maintain independent data warehouse controls

  

9. To maintain data warehouse segment counts

3.

Losing an update to a single data item

1. To utilize concurrency and lockout controls

4.

Inadequate audit trail

1. To maintain adequate audit trails to permit reconstruction of processing

7.

Placing data in the wrong calendar period

1. To establish accounting controls to ensure that data is recorded in the proper calendar period

8.

Failure of data warehouse software to function as specified

1. To verify the proper functioning of the data warehouse software

  

2. To verify the correctness of the interface to the data warehouse software

11.

Inadequate documentation

1. To document the data definitions for creation of the data warehouse

SYSTEM DEVELOPMENT CONTROL OBJECTIVES

1.

Inadequate assignment of responsibilities

1. To divide system development responsibilities among the DBA function, the application project team, and the user

14.

Lack of management support

1. To ensure that senior management participates in system planning

  

2. To ensure that senior management approves data warehouse application proposals

13.

Lack of performance criteria

1. To establish performance criteria for all data warehouse applications

4.

Inadequate audit trail

1. To include the audit trail in the design specifications

2.

Inaccurate or incomplete data in the data warehouse

1. To include the methods of ensuring accurate and complete data in the design specifications

7.

Placing data in the wrong calendar period

1. To include the accounting requirements in the design specifications

11.

Inadequate documentation

1. To ensure that documentation conforms to data warehouse documentation standards

  

2. To ensure that documentation is up to date

3.

Losing an update to a single data item

1. To implement controls to ensure the proper sequencing of updates

10.

Lack of independent data warehouse reviews group

1. To establish a test plan

  

2. To have the test plan implemented or monitored by an independent group

9.

Fraud/embezzlement

1. To test the adequacy of of controls

8.

Failure of the data warehouse software to function as specified

1. To test to ensure that the system achieves specified performance criteria

6.

Inadequate service

1. To monitor the installed application to ensure that specified performance criteria are achieved

 

Check Procedures

Work Paper 21-4 is a quality control checklist for testing a data warehouse. It is designed so that Yes responses indicate good test practices and No responses warrant additional investigation. A Comments column is provided to explain No responses and to record results of investigation. The N/A response is used when the checklist item is not applicable to the test situation.

Table 21-4. Data Warehouse Quality Control Checklist

  

YES

NO

N/A

COMMENTS

1.

Does someone assigned to the test team have data warehouse skills?

    

2.

Does the tester understand the generic data warehouse concerns?

    

3.

Does the final list of data warehouse concerns represent the true concerns of your organization?

    

4.

Has the vocabulary in all of the work papers and figures been adjusted to the vocabulary in use in your organization?

    

5.

Does the test team understand the criteria that are used to determine the magnitude of the data warehouse concerns?

    

6.

Do the ratings of the magnitude of the concerns seem reasonable?

    

7.

Have the data warehouse activity processes been identified?

    

8.

Do the identified processes appear to represent the actual processes in use in the data warehouse activity?

    

9.

Does the test team understand the controls that are needed to minimize failure in each of the data warehouse activities processes?

    

10.

Does the final assessment of the test team regarding the data warehouse appear reasonable to the test team?

    

11.

Does the assessment report issued by the test team appear to represent the results of the test?

    

Output

The output from the data warehouse test process is an assessment of the adequacy of the data warehouse activity processes. The assessment report should indicate the concerns addressed by the test team, the processes in place in the data warehouse activity, and the adequacy of those processes.

Guidelines

The testing of the data warehouse activity as proposed in this chapter is one of risk assessments. It is not designed to ensure that the data warehouse will function properly for each use, but rather to apprise management of the probability that failures will be minimized or that additional management action should be taken. The actual determination of the correct processing of the warehouse should be done in conjunction with the application software that uses the data warehouse.

Summary

This chapter is designed to assist testers in evaluating the work processes associated with a data warehouse activity. It is designed to be used in conjunction with the test of application software that uses the data warehouse. The actual processing of data from the data warehouse should be tested using the seven-step process included in Part Three of this book. However, unless adequate control procedures are in place and working, the testers cannot rely on results of the one application software test to be applicable to other data warehouse applications.

If the data warehouse activity processes are adequate to address the concerns, the testers can assume that the results of testing one application will be similar to testing other applications using the data warehouse. On the other hand, if the processes do not adequately minimize the probability of failure in the data warehouse, more extensive testing may be required of all the individual applications that use the data warehouse.

 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset