Implementing a Data Quality Model


Data quality is likely the single most important reason why companies tackle MDM. Trusted data delivered in a timely manner is any company's ultimate objective.

However, data quality has many aspects, including the source of bad data, the definition of what bad data really is, and how that data is represented. Let's take a look at a couple of examples to illustrate this further:

  • There is no contention about what the two acceptable denominations are for the attribute gender. However, one system may represent it as M/F, another as 1/0, another as a mix of Male/Female/MALE/FEMALE, and yet another may apply no validation whatsoever, leaving a multitude of manually entered values that could be missing, correct, or incorrect. Furthermore, it is possible to have a correct gender value improperly assigned to a given person. In the end, this information can be critical to companies selling gender-specific products. Others, however, may not be impacted much if a direct mail letter is incorrectly addressed to Mr. instead of Mrs.
  • Some data elements may not even have an obvious definition, or their definition may depend on another element. An expiration date, for example, has to be a valid calendar date as well as later than the corresponding effective date. In another scenario, customers are eligible for a certain service discount only if they have a gold account.
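The two examples above can be expressed as simple checks. The following is a minimal sketch rather than a prescribed implementation: the function names, the canonical codes M/F, the meaning assigned to 1 and 0, and the "gold" tier label are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional

# Hypothetical lookup from source-system spellings to one canonical code.
# Assumption: the numeric system uses 1 for male and 0 for female; the
# real meaning must be confirmed for each source system.
CANONICAL_GENDER = {
    "m": "M", "male": "M", "1": "M",
    "f": "F", "female": "F", "0": "F",
}


def normalize_gender(raw: Optional[object]) -> Optional[str]:
    """Return 'M' or 'F', or None when the value is missing or unrecognized."""
    if raw is None:
        return None
    return CANONICAL_GENDER.get(str(raw).strip().lower())


def is_valid_date_range(effective: date, expiration: date) -> bool:
    """Dependent rule: the expiration date must be later than the effective date."""
    return expiration > effective


def is_eligible_for_discount(account_tier: str) -> bool:
    """Conditional rule: only gold accounts qualify (tier label is assumed)."""
    return account_tier.strip().lower() == "gold"
```

Checks like these catch representation and rule violations only. They cannot detect a valid gender value that has been assigned to the wrong person, which requires comparison against a trusted source.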

The previous examples show just one facet of data quality, or the lack of it. Data suffers from a multitude of problems, including fragmentation, duplication, business rule violation, lack of standardization, incompleteness, poor categorization and cataloging, synchronization failures, missing lineage, and deficient metadata documentation.

One may wonder why companies get into such a mess. It is caused by a multitude of factors, some more avoidable than others.

Certain companies grow at an incredible and sometimes unpredictable pace. Mergers and acquisitions are very common vehicles to increase market share or to tap into new business endeavors. Every time a new company is acquired, it is necessary to integrate its data. That means more low-quality data is added to the pile. As discussed in Chapter 3, companies usually don't have time to cleanse the new data coming in as part of the migration process, except when cleansing is absolutely required to make the data fit into the existing structure. "We'll cleanse the data later" is a common motto, and one that is rarely fulfilled.

Additionally, software applications have historically been developed to solve a particular aspect of the business problem. That has led to years of multiple distributed software and business applications with disparate rules. Different systems might contain multiple instances of a customer record, each with different details and transactions linked to it. For these reasons, most companies suffer from fragmented and inconsistent data. The net effect is that companies face unnecessary operational inefficiencies, inconsistent or inaccurate reporting, and ultimately incorrect business decisions. Even enterprise applications, such as ERP and CRM, have remained silos of information, with data being constantly duplicated and laden with errors.

The business process is also part of the equation. It is human nature for people to look for creative ways to solve their problems. That means when users have technical problems or run into business limitations during data entry, they will find a way to get the job done, even if it means breaking business rules or overriding well-defined processes. From a data quality perspective, this is not a good thing, but should the users be blamed? After all, they may be facing a particular customer need that doesn't fit an existing business process, or a system bug that is delaying a high-profit transaction.

Let's assume a company does have all the correct elements in place, such as data governance, data stewardship, data quality, IT support, and so on. Users are less likely to engage the proper teams if their confidence in the support process is low. They may think: “By the time I get this problem resolved through the proper mechanisms, I'll have a customer satisfaction issue beyond repair.” Therefore, for the benefit of the company, they act with imagination and solve the immediate problem with non-approved solutions. Making matters worse, these out-of-spec practices and associated data issues are usually difficult to monitor, detect, and correct.

With that said, the primary goal of a company should be not only to have the proper elements of a well-governed entity, but also to have them working effectively. This comes with maturity and a constant focus on process improvement. Improving the data entry process alone is not enough. It is necessary to improve the support process around it. Just about everything is constantly changing: business needs, business landscape, technology, people, and so on. The only hope is to have an efficiently adaptive model that, in spite of all these changes, can continue to deliver results quickly.

The bottom line: Data quality is both a technical and a business issue, and it requires tackling all three elements of the people/process/technology triangle. From a people standpoint, it necessitates a combined effort from both business and IT when addressing the myriad quality issues throughout the enterprise. From a process perspective, it demands that IT and business engage in a truly collaborative effort, without so much of the politics and obstacles commonly seen in their relationship. Finally, technology, when applied properly, can expedite problem resolution as well as make it viable to establish a mature and repeatable process.

This chapter discusses in detail the need for business and IT engagement. To start, a data quality process is presented to show how to connect the many players inside the enterprise through an overarching and repeatable methodology that fosters recognition of, and action upon, data-driven issues.

Described later in this chapter is a methodology to assess the current state of the quality of the data and establish a baseline to define the need for immediate actions as well as to gauge future improvements.

When reading through this chapter, it is important to take into account the type of MDM approach being implemented. For the most part, the discussions will assume an enterprise MDM solution, since that is the most encompassing of them all. Therefore, adjustments will be necessary when implementing a different solution. For example, chances are an analytical MDM will not impact as many lines of business (LOBs) in the company as an operational or enterprise MDM, which may lead to a different decision-making process from the one presented here. However, readers should be able to make the proper adjustments according to their particular situations.
