This chapter provides an overview from DAMA-DMBOK2 on data warehousing and business intelligence (excerpted from pages 381-384), and then covers the additional data warehousing and business intelligence responsibilities needed for blockchain to work well within our organizations.

Overview from DAMA-DMBOK2

The concept of the data warehouse emerged in the 1980s as technology enabled organizations to integrate data from a range of sources into a common data model. Integrated data promised to provide insight into operational processes and open up new possibilities for leveraging data to make decisions and create organizational value. As importantly, data warehouses were seen as a means to reduce the proliferation of decision support systems (DSS), most of which drew on the same core enterprise data. The concept of an enterprise warehouse promised a way to reduce data redundancy, improve the consistency of information, and enable an enterprise to use its data to make better decisions.

In the 1990s, we began to build data warehouses in earnest. Since then (and especially with the co-evolution of business intelligence as a primary driver of business decision-making), data warehouses have become “mainstream.” Most enterprises have data warehouses; warehousing is the recognized core of enterprise data management.32 Even though well-established, the data warehouse continues to evolve. As new forms of data are created with increasing velocity, new concepts (like data lakes) are constantly emerging that will influence the future of the data warehouse.

The primary driver for data warehousing is to support operational functions, compliance requirements, and Business Intelligence (BI) activities (though not all BI activities depend on warehouse data). Increasingly, organizations are asked to provide data as evidence that they have complied with regulatory requirements. Because they contain historical data, warehouses are often the means to respond to such requests. Nevertheless, business intelligence support continues to be the primary reason for a warehouse. BI promises insight about the organization, its customers, and its products. An organization that acts on knowledge gained from BI can improve operational efficiency and competitive advantage. As more data has become available at a greater velocity, BI has evolved from retrospective assessment to predictive analytics.

The term Business Intelligence (BI) has two meanings. First, it refers to a type of data analysis aimed at understanding organizational activities and opportunities. Results of such analysis are used to improve organizational success. When people say that data holds the key to competitive advantage, they are articulating the promise inherent in business intelligence activity: that if an organization asks the right questions of its own data, it can gain insights (about its products, services, and customers) that enable it to make better decisions about how to fulfill its strategic objectives.

Secondly, business intelligence refers to a set of technologies that support this kind of data analysis. An evolution of decision support tools, BI tools enable querying, data mining, statistical analysis, reporting, scenario modeling, data visualization, and dashboarding. They are used for everything from budgeting to advanced analytics.

A Data Warehouse (DW) is a combination of two primary components: An integrated decision support database and the related software programs used to collect, cleanse, transform, and store data from a variety of operational and external sources. To support historical, analytical, and BI requirements, a data warehouse may also include dependent data marts, which are subset copies of data from the warehouse. In its broadest context, a data warehouse includes any data stores or extracts used to support the delivery of data for BI purposes.

Data Warehousing describes the operational extract, cleansing, transformation, control, and load processes that maintain the data in a data warehouse. The data warehousing process focuses on enabling an integrated and historical business context on operational data by enforcing business rules and maintaining appropriate business data relationships.

Additional responsibilities due to blockchain

The data warehouse project team will find additional challenges with extracting data from and loading data to the blockchain ledger.

Removing the hub

I was a data architect for many years on a data warehouse team, and my daily goal was to do whatever I could to centralize the data and provide one trusted point for reporting.

When thinking about this centralized hub structure, I picture a bicycle wheel:

All data is centralized in one database and all interfaces pass data to or extract data from this hub.

With blockchain, instead of a traditional hub and spoke architecture for applications, we have a completely decentralized architecture. Recall this figure:

If the data warehouse is built using blockchain, the data warehouse project team will need to work closely with data governance to ensure that the data is understood and that its lineage is accurate. In addition, future analytics uses need to be understood and documented.

Extracting (the “E” in ETL)

ETL stands for Extract, Transform, and Load; this is the process of extracting data from a source system, transforming it into something useful for business intelligence, and loading it into a data warehouse for reporting. ETL from or to a blockchain application could be complex.

As with extracting data from any NoSQL database, developers will need to learn how to parse a ledger and integrate its contents with the rest of the data in the data warehouse.
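To make the parsing step concrete, here is a minimal sketch in Python. It assumes the ledger has been exported as a JSON array of blocks, each holding a list of transactions; the field names (index, timestamp, transactions, sender, recipient, amount) are hypothetical, and a real blockchain platform will have its own export format and client library.

```python
import json
from datetime import datetime, timezone

# Hypothetical ledger export: a JSON array of blocks, each holding transactions.
SAMPLE_LEDGER = """
[
  {"index": 1, "timestamp": 1700000000, "transactions": [
    {"sender": "pubkey_a", "recipient": "pubkey_b", "amount": 12.5}
  ]},
  {"index": 2, "timestamp": 1700000600, "transactions": [
    {"sender": "pubkey_b", "recipient": "pubkey_c", "amount": 4.0},
    {"sender": "pubkey_a", "recipient": "pubkey_c", "amount": 1.25}
  ]}
]
"""

def flatten_ledger(ledger_json):
    """Turn nested blocks into one flat row per transaction."""
    rows = []
    for block in json.loads(ledger_json):
        # Convert the block's epoch timestamp to an explicit UTC datetime.
        block_time = datetime.fromtimestamp(block["timestamp"], tz=timezone.utc)
        for position, tx in enumerate(block["transactions"]):
            rows.append({
                "block_index": block["index"],   # lineage back to the block
                "tx_position": position,         # order within the block
                "block_time_utc": block_time.isoformat(),
                "sender": tx["sender"],
                "recipient": tx["recipient"],
                "amount": tx["amount"],
            })
    return rows

rows = flatten_ledger(SAMPLE_LEDGER)
```

Each resulting row keeps the block index and the transaction's position within the block, so a warehouse record can be traced back to its place in the ledger when lineage questions arise.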

An important challenge will be understanding the mapping between private and public keys and ensuring that the private key is not stored in the data warehouse (or, if it is stored there, that it is well protected). Coordination with the security team will be required.
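One simple safeguard is to strip sensitive fields from each record before it is loaded. The sketch below is illustrative only: the field names in SENSITIVE_FIELDS are hypothetical, and a name-based filter cannot detect key material embedded inside other values, so it complements rather than replaces a review with the security team.

```python
# Hypothetical names for fields that must never reach the warehouse.
SENSITIVE_FIELDS = {"private_key", "secret_key", "seed_phrase"}

def strip_sensitive(record):
    """Return a copy of the record without sensitive fields, plus the names removed."""
    removed = sorted(k for k in record if k.lower() in SENSITIVE_FIELDS)
    clean = {k: v for k, v in record.items() if k.lower() not in SENSITIVE_FIELDS}
    return clean, removed

record = {"public_key": "pubkey_a", "private_key": "do-not-load", "amount": 12.5}
clean, removed = strip_sensitive(record)
```

Returning the list of removed field names lets the load job log that something sensitive appeared in the source, which is itself worth reporting to the security team.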

Broadening scope of the data warehouse

A common theme in the challenges across the data management disciplines is the broadening of application scope beyond the organization. In data warehousing and business intelligence, it is already extremely challenging to design and report on structures that cross departmental and functional boundaries. Blockchain applications can take this a step further, requiring design and reporting across organizations. This is possible, but it demands heavy reliance on master data and reference data standards.
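As a small illustration of why shared reference data matters, the sketch below assumes two organizations that record the same country under different local codes. The organization names, the codes, and the mapping table are all hypothetical; in practice the mapping would come from an agreed reference data standard rather than a hard-coded dictionary.

```python
# Hypothetical shared reference standard: each organization's local code -> common code.
COUNTRY_CODE_MAP = {
    "org_a": {"USA": "US", "GER": "DE"},
    "org_b": {"United States": "US", "Deutschland": "DE"},
}

def to_shared_code(org, local_code):
    """Translate an organization's local code to the shared code, or None if unmapped."""
    return COUNTRY_CODE_MAP.get(org, {}).get(local_code)

records = [
    {"org": "org_a", "country": "USA", "amount": 10},
    {"org": "org_b", "country": "United States", "amount": 5},
    {"org": "org_b", "country": "Deutschland", "amount": 7},
]

# Aggregate amounts by the shared code so a report spans both organizations.
totals = {}
for r in records:
    code = to_shared_code(r["org"], r["country"])
    totals[code] = totals.get(code, 0) + r["amount"]
```

Without the shared mapping, the two organizations' records for the same country would land in separate report rows.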
