This chapter provides an overview from DAMA-DMBOK2 on data quality (excerpted from pages 449-452), and then covers the additional data quality responsibilities needed for blockchain to work well within our organizations.

Overview from DAMA-DMBOK2

Effective data management involves a set of complex, interrelated processes that enable an organization to use its data to achieve strategic goals. Data management includes the ability to design data for applications, store and access it securely, share it appropriately, learn from it, and ensure it meets business needs. One assumption underlying assertions about the value of data is that the data itself is reliable and trustworthy (which we describe as being “high-quality”).

However, many factors can undermine that assumption by contributing to poor quality data: lack of understanding about the effects of poor quality data on organizational success, bad planning, ‘siloed’ system design, inconsistent development processes, incomplete documentation, a lack of standards, or a lack of governance. Many organizations fail to define what makes data fit for purpose.

All data management disciplines contribute to the quality of data, and high-quality data that supports the organization should be the goal of all data management disciplines. Because uninformed decisions or actions by anyone who interacts with data can result in poor data quality, producing high-quality data requires cross-functional commitment and coordination. Organizations and teams should be aware of this and should plan for high-quality data by executing processes and projects in ways that account for risk related to unexpected or unacceptable conditions in the data.

Because no organization has perfect business processes, perfect technical processes, or perfect data management practices, all organizations experience problems related to the quality of their data. Organizations that formally manage the quality of data have fewer problems than those that leave data quality to chance.

Formal data quality management is similar to continuous quality management for other products. It includes managing data through its lifecycle by setting standards; building quality into the processes that create, transform, and store data; and measuring data against standards.
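To make "measuring data against standards" concrete, here is a minimal sketch in Python. The field names, rules, and sample records are all hypothetical illustrations, not part of DMBOK; the idea is simply that each standard becomes a testable rule and quality is reported as the fraction of values that pass.

```python
import re

# Hypothetical standards, expressed as testable rules per field
RULES = {
    "email": lambda v: bool(re.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$", v or "")),
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
}

def quality_scores(records):
    """Return, per field, the fraction of values that meet the standard."""
    scores = {}
    for field_name, rule in RULES.items():
        values = [r.get(field_name) for r in records]
        passed = sum(1 for v in values if rule(v))
        scores[field_name] = passed / len(values) if values else 1.0
    return scores

records = [
    {"email": "a@example.com", "age": 34},
    {"email": "not-an-email", "age": 41},   # fails the email standard
    {"email": "b@example.com", "age": 180}, # fails the age standard
]
scores = quality_scores(records)
```

Reporting scores like these over time is one way a data quality program can track whether data continues to meet the standards it has set.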

Managing data to this level usually requires a data quality program team. The data quality program team is responsible for engaging both business and technical data management professionals and for driving the process of applying quality management techniques to data, to ensure that data is fit for consumption for a variety of purposes. The team will likely be involved with a series of projects, through which it can establish processes and best practices while addressing high-priority data issues.

Because managing the quality of data involves managing the data lifecycle, a data quality program will also have operational responsibilities related to data usage. For example, these responsibilities can include reporting on data quality levels and engaging in the analysis, quantification, and prioritization of data issues.

The team is also responsible for working with those who need data to do their jobs, ensuring that the data meets their needs. The team will also work with those who create, update, or delete data in the course of their jobs, to ensure they are properly handling the data. Data quality depends on all who interact with the data—not just data management professionals.

Additional responsibilities due to blockchain

Data quality experts will need to work closely with the data governance team, to clearly determine what data quality will mean in a blockchain environment.

Fixing data quality in immutable data

Once data is written to a blockchain, it is there forever. So what if the data was written with a mistake? She ordered a coffee without sugar but it came with sugar. He mistyped the product category. He was born in the 22nd century. How do we handle such data quality issues when the data cannot be changed?

Data quality experts must work with data governance to put rules and processes in place that detect and denote a data quality issue, rather than fixing the issue by correcting the data in place.
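One common pattern for "denoting rather than correcting" is to append a new record that flags or supersedes the erroneous one, leaving the original untouched. The sketch below, with hypothetical class and field names, shows the idea on a simplified append-only store (a real blockchain adds consensus, hashing, and signatures on top).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LedgerRecord:
    record_id: int
    payload: dict
    corrects: Optional[int] = None  # id of the record this entry supersedes

class AppendOnlyLedger:
    """Minimal append-only store: records are never changed or deleted."""

    def __init__(self):
        self._records = []

    def append(self, payload, corrects=None):
        rec = LedgerRecord(len(self._records), payload, corrects)
        self._records.append(rec)
        return rec.record_id

    def current_view(self):
        """Latest accepted payload per original record, after corrections."""
        view = {}
        for rec in self._records:
            key = rec.corrects if rec.corrects is not None else rec.record_id
            view[key] = rec.payload  # later entries override earlier ones
        return view

ledger = AppendOnlyLedger()
# The mistaken order is recorded permanently...
bad = ledger.append({"order": "coffee", "sugar": True})
# ...but a correction record points back at it instead of overwriting it.
ledger.append({"order": "coffee", "sugar": False}, corrects=bad)
view = ledger.current_view()
```

The full history of the mistake and its correction remains auditable; only the derived "current view" reflects the fix. The governance question is then who is allowed to append correction records and under what rules.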

Redefining data quality

The bounds of “data quality” might need to be extended to include attempts to exploit the system. Recall our example of someone buying 100 shares of IBM but tricking the system by not paying for those shares. For those recordkeepers that are misled and store this erroneous information, should this be considered a data quality issue or a security issue, or both?

Just like when fixing any data quality issues, data quality experts will need to work closely with data governance to define rules and guidelines.
