Chapter 15.1

The System of Record

Abstract

At the heart of all data processing is the need to believe the data on which one is operating. There is no point in doing data processing if you can’t believe the data that are being operated on. Central to the believability of data is the concept known as the “system of record.” The system of record is the place where the one and only believable version of data resides. The system of record must be carefully constructed and protected.

Keywords

System of record; Cycle of awareness; End state architecture; Data lake; Data mart; Data warehouse; Flow of data; Auditing; Text

End users have always gone through a predictable cycle of awareness when it comes to the computer. This cycle of end user awareness is as old as the first computer, and it appears and reappears in many forms. The cycle starts when the end user first encounters the computer and says, “I want my data.”

The End User Cycle of Awareness

Fig. 15.1.1 shows the first thing the end user wants.

Fig. 15.1.1 Step 1—end user cycle of awareness.

The end user's instincts tell him/her that the most important thing is to get the data for whatever application the end user is building. But it takes a long time to get the data. So, the end user says, “Wait a minute. I don’t just want my data; I want them now. I don’t want to have to wait all day for my data. Give them to me quickly.”

Fig. 15.1.2 shows this next step in the end user cycle.

Fig. 15.1.2 Step 2—end user cycle of awareness.

So, the second step not only gets the end user his/her data but also reduces the amount of time it takes to get the data to the end user. So now, the end user gets the data quickly.

As soon as the end user discovers that it is possible to get the data quickly, the end user takes a closer look at the data. The end user decides the data can be simplified and organized differently. The data can be visualized. So now, the end user wants the data to be convenient to get, organized, and visualized.

Step 3 shows this part of the end user awareness cycle (Fig. 15.1.3).

Fig. 15.1.3 Step 3—end user cycle of awareness.

After the end user starts to get data that are quick, convenient, simplified, and organized, the end user then starts to pay attention to the accuracy and believability of the data. At this point, the end user discovers that he/she has been given incorrect data. The data are worthless because they are wrong. In a way, the data at this point are a liability. They LOOK LIKE the right data; they are just incorrect. If one were to believe the data, the wrong business decision could be made. And it takes a concerted effort to find out that the data are incorrect.

Now, the end user adds one more parameter to his/her awareness. Now, the end user wants accurate, believable data. Step 4 in the end user awareness cycle is shown in Fig. 15.1.4.

Fig. 15.1.4 Step 4—end user cycle of awareness.

The System of Record

In order to illustrate the insidious nature of step 4, suppose a spreadsheet is created with people's salaries on it. Anyone can create a spreadsheet, and you can put any data you want into the spreadsheet. Now, on the spreadsheet I put an entry for Bill Inmon's salary. I put into the spreadsheet that Bill Inmon makes $1,000,000 a month.

The spreadsheet looks good. It comes from the computer. It has a lot of information on it. The salaries all look to be OK. However, when we come to Bill Inmon, the spreadsheet says that Bill Inmon makes $1,000,000 a month. That information is inaccurate. If management were to act on this information, they might draw some very incorrect conclusions about Bill Inmon, because in fact, Bill Inmon DOES NOT make a million dollars a month.

When the discovery is made that the data are incorrect, the end user has just discovered the need for what is known as the “system of record.”

The system of record in computer systems is the designated guarantee that the data that have been accessed are certified, that is, guaranteed, to be accurate. It is still possible for there to be errors in the data found in the system of record. But if there are errors in the system of record, they have arrived there despite passing through rigorous audits and checks. Stated differently, the system of record is the best data that are available, and every effort possible has been made to ensure the accuracy of the data. If there are errors in the system of record, there aren’t many, and those errors are subject to correction when they are found.

Fig. 15.1.5 shows the system of record data.

Fig. 15.1.5 Data integrity.

The System of Record in the End State Architecture

The system of record is a living organism that is found in different places in the end-state architecture. Fig. 15.1.6 shows a simplified version of the end-state architecture.

Fig. 15.1.6 Simplified end state architecture.

While there are more components to the end-state architecture than those shown in Fig. 15.1.6, the major components of the architecture are depicted. The system of record exists throughout the different places in the end-state architecture.

Fig. 15.1.7 shows the system of record in the different components of the end-state architecture.

Fig. 15.1.7 Different structures of data.

Fig. 15.1.7 shows that part of the system of record exists in the application environment, part of the system of record exists in the data warehouse, part of the system of record exists in the data marts, and part of the system of record exists in the big data environment.

The Role of Age in the System of Record

The type of system of record data that exists in each environment depends entirely on the age of the data.

Fig. 15.1.8 shows the different types of data that exist in the different places.

Fig. 15.1.8 The system of record changes location as data ages.

In the application environment are found the current value data. In current value data, data are accurate as of the moment of access. Only a limited amount of history is found in the application environment.

In the data warehouse are found early historic data. For most organizations, early historic data are data that are from 1 to 5 years old.

In the data mart environment is found departmental customized system of record data. In the departmental customized data are found data that are customized for each department, such as marketing, sales, and finance.

In the big data environment are found deep historic data. Deep historic data are system of record data that are 6 years old or older.

A Simple Example

As a simple example of system of record data, suppose you want to find out what your account balance is right now. You go to the application environment to find out what your account balance is. Now, you are doing your income taxes, and you need to find a check that you wrote 13 months ago. You go to the data warehouse to find that check.

Now, suppose you want to find a marketing analysis of your account along with other similar accounts. You look to the system of record in the marketing data mart. Now, suppose you are being audited by the IRS and you need to go find a check that was written 10 years ago. You go to the system of record in big data.

At every point along the line, you can find reliable, accurate data by looking in the system of record.
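
The lookups in this example can be sketched as a small routing function. This is only an illustration, not a prescribed implementation: the function name is hypothetical, the boundary between the application environment and the data warehouse (1 year) and the handling of data between 5 and 6 years old are assumptions, and the data marts are left out because they are organized by department rather than by age.

```python
from datetime import date


def system_of_record_for(record_date: date, today: date) -> str:
    """Return the component expected to hold system of record data of this age.

    The chapter places data that are 1 to 5 years old in the data warehouse
    and data that are 6 years and older in big data. Treating anything
    younger than 1 year as application data, and anything younger than
    6 years as warehouse data, is an assumption made for this sketch.
    """
    age_in_years = (today - record_date).days / 365.25
    if age_in_years < 1:
        return "application"
    if age_in_years < 6:
        return "data warehouse"
    return "big data"


today = date(2024, 6, 1)  # illustrative "now"
print(system_of_record_for(date(2024, 6, 1), today))  # current account balance
print(system_of_record_for(date(2023, 5, 1), today))  # check written 13 months ago
print(system_of_record_for(date(2014, 6, 1), today))  # check written 10 years ago
```

With these assumed boundaries, the three calls route to the application environment, the data warehouse, and big data, in that order.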

The Flow of Data in the System of Record

When looking at the mapping of the system of record to the end-state architecture, it is seen that there is a flow of data from one component to another. In some cases, the data simply flow from one component to another. For example, data flow from the data warehouse to big data. But in other cases, the flow of data takes place in the form of a transformation. Data within the system of record are transformed as they move from the application component to the data warehouse component, and data are transformed as they move from the data warehouse to the data mart.

Fig. 15.1.9 shows that transformation takes place.

Fig. 15.1.9 Transformation.

The transformation of data as they move from the application environment to the data warehouse environment is one where the measurement and definitions of data are altered. As a simple example, data in the application environment are measured in inches, while data in the data warehouse are measured in centimeters. As data pass from the application environment to the data warehouse, each measurement is converted from inches to centimeters (1 inch equals 2.54 centimeters).
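
A minimal sketch of this kind of transformation is shown here. The record layout, field names, and function name are hypothetical and chosen only for illustration; the conversion factor is the standard definition of the inch.

```python
CM_PER_INCH = 2.54  # exact by definition of the inch


def to_warehouse_record(app_record: dict) -> dict:
    """Convert a hypothetical application record from inches to centimeters."""
    return {
        "part_id": app_record["part_id"],
        # Rounding to two decimals is a choice made for this sketch.
        "length_cm": round(app_record["length_in"] * CM_PER_INCH, 2),
    }


app_record = {"part_id": "A-100", "length_in": 10.0}  # illustrative data
warehouse_record = to_warehouse_record(app_record)
print(warehouse_record)
```

The application record is left untouched, and a new record in the data warehouse's unit of measure is produced.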

The transformation of data from the data warehouse to the data mart is a different kind of transformation. Typically, this transformation is made in the selection and calculation of data. As a simple example, the data warehouse holds data about all customers. Only the customers from Missouri are selected, and of those, only the customers who have spent more than $1000 a month. Data about these selected customers are then added together and placed in the data mart.
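
The selection and calculation in this example might look something like the following sketch. The class, field names, and sample rows are invented for illustration, and a real data mart load would normally run against database tables rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass
class CustomerMonth:
    """One hypothetical warehouse row: a customer's spend for a month."""
    customer_id: str
    state: str
    monthly_spend: float


def build_mart_summary(rows, state="MO", threshold=1000.0):
    """Select customers by state and spend, then add their data together."""
    selected = [r for r in rows if r.state == state and r.monthly_spend > threshold]
    return {
        "state": state,
        "customer_count": len(selected),
        "total_spend": sum(r.monthly_spend for r in selected),
    }


warehouse_rows = [  # illustrative data only
    CustomerMonth("C1", "MO", 1500.0),
    CustomerMonth("C2", "MO", 800.0),    # Missouri, but under the threshold
    CustomerMonth("C3", "KS", 2500.0),   # over the threshold, but not Missouri
    CustomerMonth("C4", "MO", 1200.0),
]
mart_summary = build_mart_summary(warehouse_rows)
print(mart_summary)
```

Only C1 and C4 satisfy both conditions, so the summary placed in the data mart covers two customers.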

Other Data Than the System of Record

Another interesting question is as follows: are there data in the application environment, the data warehouse, or the big data environment that are not part of the system of record? The answer is absolutely yes. There are data in those places that are not part of the system of record, and it is perfectly all right to access and use those data. However, for extreme confidence in the data, data from the system of record should be chosen.

To use an analogy, suppose you wanted to enter a car race. You have two choices: drive a Porsche or drive a Volkswagen. Your chances of winning the race are probably improved by choosing the Porsche. But you can choose the Volkswagen and enter the race. And who is to say that you might not win with the Volkswagen? However, to improve your odds of success, you probably would be better off choosing the Porsche.

Fig. 15.1.10 shows the data outside the system of record.

Fig. 15.1.10 Data outside the system of record.

Is Data Updated in the System of Record?

Another interesting issue arises—can data inside the system of record be updated? The answer is yes—of course, it can. However, there are some considerations when it comes to the issue of update.

Suppose you have a bank account. Suppose you go and look at your bank account balance as of 10:13 am. Then suppose at 10:45 am, you make a withdrawal from the bank account. The withdrawal is transacted immediately. When you reference the amount of money you have in your bank account, the value can change on a moment-by-moment basis. Therefore, you have to reference not only the amount of money in the bank account but also the moment in time at which that amount was accurate.

So, update can be done in the application environment.

The data warehouse environment is different. New records can be placed in the data warehouse. For example, there may be a record of your daily activity. As of July 15, you had $nnnn in your account. And as of August 3, you had $yyyy in your account. So, data in the data warehouse are constantly being updated in the sense that new records are added. However, rather than overwriting the earlier value, the data warehouse keeps a historical record of the changing values over time.
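
One way to picture this is an append-only history of dated snapshots. The sketch below is an assumption about how such a history could be modeled, not a description of any particular product; the class name, method names, and dollar amounts are invented for illustration.

```python
from bisect import bisect_right
from datetime import date


class BalanceHistory:
    """Append-only list of (as-of date, balance) snapshots."""

    def __init__(self):
        self._dates = []
        self._balances = []

    def add_snapshot(self, as_of: date, balance: float) -> None:
        # New records are added; existing snapshots are never overwritten.
        if self._dates and as_of <= self._dates[-1]:
            raise ValueError("snapshots must be added in chronological order")
        self._dates.append(as_of)
        self._balances.append(balance)

    def balance_as_of(self, when: date) -> float:
        # Find the most recent snapshot taken on or before the requested date.
        i = bisect_right(self._dates, when)
        if i == 0:
            raise LookupError("no snapshot on or before that date")
        return self._balances[i - 1]


history = BalanceHistory()
history.add_snapshot(date(2023, 7, 15), 2500.00)  # illustrative amounts
history.add_snapshot(date(2023, 8, 3), 1800.00)
print(history.balance_as_of(date(2023, 7, 20)))  # value from the July 15 snapshot
print(history.balance_as_of(date(2023, 8, 3)))   # value from the August 3 snapshot
```

Every answer is tied to a moment in time, which is the point made above about referencing both the amount and the moment at which the amount was accurate.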

Fig. 15.1.11 shows that data in the system of record can certainly be updated.

Fig. 15.1.11 Can data inside the system of record be updated?

Detailed and Summary Data in the System of Record

Another important issue is whether the system of record can hold both detailed and summary data. Of course, the system of record can hold primitive, detailed data. There is no question about that. But the real question is whether the system of record can hold summary data.

The answer is yes—of course, the system of record can hold summary data, as seen in Fig. 15.1.12.

Fig. 15.1.12 What about detailed and summary data?

However, when summary data are held in the system of record, there is an extra, compulsory component not seen elsewhere. When summary data are held in the system of record, documentation of how the summary was made must be included as well. The documentation of the summarization needs to include at least the following:

  • What data were included in the summarization?
  • What data were excluded from the summarization?
  • When was the summarization made?
  • What formula was used for the summarization?
  • What program(s) made the summarization?
  • Where were the results of the summarization sent?
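
The six questions above can be carried alongside each summary value as a small, mandatory structure. The following is a sketch under that assumption; the class names, field names, and sample values (including the program name) are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SummaryDocumentation:
    """Answers to the six documentation questions for one summarization."""
    included_data: str        # what data were included
    excluded_data: str        # what data were excluded
    summarized_at: datetime   # when the summarization was made
    formula: str              # what formula was used
    programs: tuple           # what program(s) made the summarization
    results_sent_to: tuple    # where the results were sent


@dataclass(frozen=True)
class SummaryRecord:
    """A summary value that cannot be created without its documentation."""
    name: str
    value: float
    documentation: SummaryDocumentation


doc = SummaryDocumentation(
    included_data="Missouri customers spending more than $1000 a month",
    excluded_data="all other customers",
    summarized_at=datetime(2024, 1, 31, 23, 0),
    formula="sum(monthly_spend)",
    programs=("monthly_rollup.py",),  # hypothetical program name
    results_sent_to=("marketing data mart",),
)
summary = SummaryRecord("mo_high_spend_total", 2700.0, doc)
print(summary.documentation.formula)
```

Because `documentation` has no default value, constructing a `SummaryRecord` without it raises a `TypeError`, which mirrors the compulsory nature of the documentation.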

Fig. 15.1.13 shows that summarization in the system of record requires special documentation.

Fig. 15.1.13 The role of documentation.

Auditing Data and the System of Record

The system of record is useful for many purposes. The primary purpose of the system of record is to establish a foundation for the making of business decisions with confidence.

It goes without saying that data found in the system of record are ideal for the purpose of auditing. Conversely, it would be very dangerous (and probably misleading) to conduct an audit of data outside the system of record.

Fig. 15.1.14 shows that the system of record supports auditing.

Fig. 15.1.14 Auditing.

Text and the System of Record

An interesting issue is as follows: where does text fit into the system of record? The answer is that any text used for entry into the corporate decision-making environment becomes part of the system of record. Text is a special case of data. Text cannot be changed once the author has written the text. It is legally and ethically improper to take a written document and alter the document. For that reason, text that is used in decision-making becomes an essential part of the system of record.

This applies EVEN IF THE TEXT IS NOT CORRECT. Suppose someone writes “Bill Inmon makes a million dollars a month.” Certainly, the information portrayed in the text is incorrect. But if that is what the author wrote, then the text cannot be changed, even though the text conveys an incorrect idea.

Of course, all sorts of text are written that are never used in the corporate decision-making process. This text is NOT part of the system of record. Only text that is included in the database infrastructure becomes part of the system of record.
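
The rule that text cannot be changed after it is written can be supported mechanically. The sketch below stores a text together with a SHA-256 digest so that any later alteration can be detected. This is one possible safeguard offered as an assumption, not something the chapter prescribes, and the class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRecord:
    """A piece of text stored exactly as the author wrote it."""
    author: str
    text: str
    sha256: str  # digest computed when the text entered the system of record


def store_text(author: str, text: str) -> TextRecord:
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return TextRecord(author, text, digest)


def is_unaltered(record: TextRecord) -> bool:
    """Check that the stored text still matches its original digest."""
    return hashlib.sha256(record.text.encode("utf-8")).hexdigest() == record.sha256


record = store_text("some author", "Bill Inmon makes a million dollars a month.")
print(is_unaltered(record))
```

Note that the check says nothing about whether the text is correct. It only confirms that the text is what the author wrote.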

Fig. 15.1.15 shows the relationship of text in the system of record.

Fig. 15.1.15 Where does text fit?