Chapter 12

Continuous Improvement

Abstract

This chapter covers the need and opportunities for continuous improvement in a multi-domain Master Data Management (MDM) program. It discusses the reasons that MDM and governance maturity can run into problems, what the maturity inhibitors are, and how to achieve or regain momentum. In addition, it notes that improvement targets should relate to the program roadmap and be part of annual budget-planning reviews, and discusses how multi-domain MDM practices can be applied more generically across data management practices.

Keywords

Improvement, maturity, planning, momentum, change, goal, target, roadmap, governance

This chapter discusses the need and opportunities for continuous improvement in a multi-domain Master Data Management (MDM) program. While not always addressed as a specific program initiative within a company, continuous improvement is an implied ongoing focus expressed in the MDM program’s roadmap and maturity goals. Continuous improvement requires many enabling factors involving people, process, and technology that need to be examined across each MDM discipline to determine what improvement opportunities can most benefit the program and where lack of improvement will be an inhibiting factor. Without an understanding of, and foresight about, continuous improvement needs, an MDM program’s momentum can slow or even disappear altogether if enabling factors, needed capabilities, and improvement opportunities are missed because they were not forecast and positioned as critical dependencies and program success factors.

Continuous improvement cannot just be one-time recommendations waiting for positioning and recognition. It needs to represent an ongoing focus that can be translated into reasonable and applicable improvement targets needed to support the program’s sustainability and maturity. These targets need to be clearly related to the program roadmap and part of annual budget-planning decisions. Chapter 5 discussed how to define and establish a cross-domain maturity model; this chapter addresses how a continuous improvement focus throughout the MDM program is a key enabling factor for achieving these roadmap and maturity goals.

In order for the MDM domains and the overall program to move forward in the maturity model, continuous improvement is assumed, but not a given. Achieving many of the maturity milestones will require specific capabilities and process improvements to occur. These capabilities and improvement needs should be examined in relation to each of the five MDM disciplines, along with how and where such improvement opportunities exist within each domain. Figure 12.1 is an illustration of the relationship that continuous improvement has with these MDM disciplines and the maturity of the MDM program.

Figure 12.1 Continuous improvement in a multi-domain MDM

Improvement opportunities will vary with each domain, but the end goal of the Program Management Office (PMO) is to continue to move each domain forward in the maturity model to eventually achieve a highly managed and optimized state. But let’s first step back to better understand continuous improvement as a general methodology, and then examine continuous improvement opportunities in relation to each of the five MDM discipline areas.

Continuous Improvement in a Nutshell

Many companies already employ some type of method to constantly evaluate and improve their processes, services, and products. They just need to look at expanding those methods to support and mature the functions behind MDM. It is not the intent of this chapter to specifically describe or recommend any particular continuous improvement approach. Rather, continuous improvement should be understood as a general concept to apply to the MDM program scope, regardless of the specific method used.

There are semantic differences between the terms continuous improvement and continual improvement. Using the more particular definition, continuous improvement presupposes a more linear and incremental improvement within an existing area. Conversely, continual improvement is broader, encompassing multiple continuous improvement programs. For the purpose of this book, those two terms are used interchangeably to mean the broader scope, where companies are constantly seeking incremental improvement over time or breakthrough improvements to many processes, services, and products in many areas.

Among the most widely used tools for continuous improvement is a four-step quality model—the plan-do-check-act (PDCA) cycle, also known as the Deming Cycle or Shewhart Cycle—whose steps can be summarized as follows:

1. Plan: Identify an opportunity and plan for change.

2. Do: Implement the change on a small scale.

3. Check: Use data to analyze the results of the change and determine whether it made a difference.

4. Act: If the change was successful, implement it on a wider scale and continuously assess your results. If the change did not work, begin the cycle again.

Other widely used methods of continuous improvement—such as Six Sigma, Lean Six Sigma, and Total Quality Management—emphasize employee involvement and teamwork; measuring and systematizing processes; and reducing variation, defects, and cycle times.

The use of a defined approach along with some consulting services can significantly aid in the preparation and analysis work needed for continuous improvement initiatives. In other words, often the MDM program team itself will not have sufficient bandwidth to conduct an effective or objective continuous improvement assessment. With a good approach, and after an initial assessment is conducted, the format and approach can be reused and updated each year to provide year-to-year continuity in improvement evaluation. Also, if a company has an internal audit department, there may be opportunities to apply that process each year to evaluate program progress and improvement.

Let’s take a closer look at continuous improvement for multi-domain MDM.

Continuous Improvement in Multi-Domain MDM

Continuous improvement for multi-domain MDM is multifaceted. There is the natural expansion related to incrementally adding domains to an MDM hub, which, due to the many variables within each domain, causes both leading and lagging conditions. At a minimum, an annual assessment should be conducted to align requirements, roadmap, priority, and maturity-tracking processes.

But in addition to continually expanding the management of master data for more and more domains, or even expanding the number of master attributes for a given domain, it is also important to apply continuous improvement to the five primary focus areas surrounding MDM: data governance, data stewardship, data integration, Data Quality Management (DQM), and metadata management. This is covered in the next sections.

Continuous Improvement in Data Governance

In many respects, the definition and charter of data governance reflect a continuous improvement focus in itself. In Chapter 6, this definition of data governance was presented (from DMBOK 2010):

Data governance is the exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets.

We know that applying and achieving a successful degree of data governance requires a long-term commitment to execute a governance framework that can drive policies, standards, and decisions aimed at the effective management and quality control of data assets. A data governance charter typically has a broader focus than just master data. Data governance authority will usually span many types of data and various data environments. A data governance program will often have its own maturity model that covers the broader scope. Throughout this book, we have focused on the relationship and alignment needed between data governance and an MDM program. This also assumes alignment of continuous improvement efforts.

Because of the broad use and importance of master data within a company, master data improvement needs are likely to command more priority and attention during the data governance process than improvement needs for other types of data. The MDM PMO owns much of the responsibility for defining and reviewing these improvement needs with the data governance process. From an MDM program perspective, continuous improvement should in many respects align with the broader data management improvement goals that an enterprise data governance program also has in scope. In fact, effective governance of master data may be the single most important objective of an enterprise data governance scope. For example, in Chapter 5, these data governance–oriented milestones were presented as part of an MDM program’s maturity model:

 Domain-based data governance charters, owners, teams, and the decision authority have been identified.

 The master data associated with the domains have been identified and approved by data governance from an owning and using perspective.

 Data management policies and standards have been defined and implemented.

 Measurements and dashboards are in place to measure and drive master data quality and control.

Achieving each of these milestones will not only improve the management and control of master data, but also greatly contribute to improving the roles and effectiveness of data governance as a whole across the company. The ability to align and organize data governance focus with master data domains to improve MDM and quality-control focus will provide the foundation and capabilities that can be extended to other types of data and data governance needs.

All this suggests that continuous improvement of the data governance discipline in the MDM program will strongly influence the recognition and success of a broader enterprise data governance program. If master data governance is inadequate, the perception of data governance in general will be negatively affected. Therefore, the MDM PMO and a data governance council need to share common direction and continuous improvement goals. Here are a few specific recommendations of where the MDM PMO and a data governance council can collaborate to drive continuous improvement that supports both the MDM program and enterprise data governance objectives:

 Help sponsor and support an Enterprise Data Governance model. The MDM PMO should have an active and influential role in a data governance council to help build a comprehensive enterprise data governance model into which MDM data governance needs can feed. Often, a data governance initiative starts from requirements associated with a specific program or functional area and then tries to expand outward or upward from there. A good practice in one area can certainly get attention in another area where similar data governance interest exists, but without a more formal enterprisewide program charter and a broad level of executive sponsorship, not all business areas will respond, because data governance is not part of their business plan and priorities. In this situation, developing an MDM and data governance footprint across multiple domains will be very difficult, and it will certainly inhibit the ability of a multi-domain MDM program to gain sufficient penetration into domain areas where data governance does not have sponsorship. Therefore, the ability to influence and help support enterprise data governance growth should be a key aspect of an MDM program’s continuous improvement plans.

 Ensure an ongoing alignment of the data domain definition. Be sure that the data domain definition is aligned or has a clear mapping across MDM, data governance, and enterprise data architecture models. Chapter 2 indicated that data domain definition can vary depending on the company’s industry orientation and business model. Within a company, the definition of data domains and data subject areas can be different or conflicting across a company’s information, system, and functional architectures if there are no standards that apply to this. The MDM PMO and data governance program should work with data architect teams to agree on how data domain definitions and structure should align. Creating an aligned and commonly recognized data domain structure will greatly simplify the ability to focus data management, data governance, and data-quality-improvement initiatives.

 Ensure that data governance maturity milestones are actionable and achievable. Where MDM program maturity milestones represent data governance capability improvement expectations, be sure that these milestones align to data governance council or other steering committee priorities and budget-planning activities that will influence or affect the achievement of the milestone. For example, if a key data governance milestone is to ensure that data quality measurements and dashboards are in place to measure and drive master data quality and control, the MDM PMO needs to be engaged with the information technology (IT) and data governance planning activities, where decisions will occur about the technology and support capabilities needed for data-quality measurement and reporting. Having an influence on data governance investments and delivery of needed capabilities will help the MDM program meet improvement targets that advance the program’s maturity.

 Forge a strong, collaborative relationship with corporate functions. The MDM PMO needs to maintain a strong relationship with corporate functions such as Legal, Compliance, Information Security, and Human Resources (HR) to regularly evaluate corporate issues and risk factors that the MDM program can assist with. Aside from being engaged in or responsible for specific issue mitigation activities where master data quality and management issues are involved, the MDM program can proactively work with these corporate functions on other general data governance or risk avoidance opportunities, such as alignment of data management policies, employee training, data steward support, and monitoring of conditions that can create risk and compliance issues. From an MDM maturity perspective, these are collaborative opportunities that will directly contribute to achieving a highly managed and optimized state.

As stated previously, continuous improvement cannot just be one-time recommendations waiting for positioning and recognition. It needs to reflect an ongoing focus that can be translated into reasonable and applicable improvement targets that a data-governance program will need to support the MDM program’s sustainability and maturity.

Continuous Improvement in Data Stewardship

The build-out and positioning of data steward roles and practices may be the most challenging components of an MDM program. Continuous improvement in data stewardship is a constant factor in the MDM maturity model. In a multi-domain MDM model, data stewardship largely relies on the enlistment of personnel acting as data and process area experts who can truly embrace the data steward concept and focus on specific data governance and data management initiatives. The difficulty with this approach is that a data steward role is often not a formally defined role within a company. In such cases, data steward enlistment, positioning, and recognition can be affected when that role overlaps or conflicts with other roles and titles and there is no visible job ladder or career path for a data steward within the company. As was pointed out in Chapter 7, the purpose and function of data stewards are vital to maintaining good, trusted master data. Therefore, the data steward model and any improvement opportunities need continual evaluation to ensure that the MDM program goals are achieved. Here are some recommendations for how to keep improving data stewardship across the MDM program:

 Create recognition and rewards. People who are doing a good job and meeting goals in a data steward function need to receive recognition and rewards. Whether these employees are participating in a data steward team that provides perspective or makes decisions about important MDM concepts, issues, and solutions, or performing very specific data management or quality control tasks, the PMO and data governance leaders should coordinate on offering recognition and reward opportunities for them. This helps retain the focus and talent needed to fill these roles effectively and, from an MDM program maturity perspective, reflects a clear intent to build and maintain a high-quality culture.

 Identify the enabling and constraining factors in a data steward role. Like almost every job role, data stewards can be happy about where they are positioned but frustrated by the processes and tools that keep them from being as successful and effective as they would like to be. In most cases, the person in the data steward role has some insight or recommendations for how those processes and tools could be improved. The domain data governance teams and lead data stewards should periodically assess their data steward processes to identify improvement needs and opportunities. Obviously, not all process or tool improvements are feasible, but simply having a continuous improvement focus that results in at least some beneficial change will help increase data steward effectiveness and job satisfaction.

 Keep pursuing opportunities to make “data steward” a formal job title. If “data steward” is not a formal job title, but there are roles that resemble the data steward support role, and there is growing recognition and evidence of the value of these roles, the MDM PMO should keep demonstrating its appreciation. This should be demonstrated to the program steering committee, governance council, and HR leadership. If the appreciation is there and consistently demonstrated, executive leaders will recognize the need to support the creation of an actual data steward position as a necessary component of the enterprise data management strategy and goals. As the data management field focuses increasingly on the data steward role and career path, being able to formally post open data steward positions will greatly improve a company’s ability to target and recruit qualified candidates. Having formal data steward job titles, levels, and salary ranges will enable the MDM PMO, the data governance program, or both to more specifically address resource and budget forecasting for program improvement needs.

As was also pointed out in Chapter 7, data stewardship should be a major component throughout all of MDM. Therefore, a multi-domain plan needs to develop a firm concept of how a data steward will look and function, where the data steward role is best positioned, and how the right resources can be identified and engaged for optimum performance. As the program evolves, so should the data steward model and focus. Data stewards are critical to the improvement of master data quality and control, so the program’s continuous improvement plans need to ensure that data steward needs and capabilities are factored in.

Continuous Improvement in Data Integration

Data integration involving master data can happen in many different phases and at many levels. And if a company is expanding, data integration plans and activities may also have to expand over time, such as in the following ways:

 Manage more master data from additional domains. It is very common for companies to start with one or two domains and add more as their MDM practice matures. This can potentially require the integration of additional sources, and consequently the possibility of different technologies and processing types (batch versus real time). It might also require the integration of additional external sources for reference data management.

 Integrate more sources feeding data into an MDM hub. Companies might choose to minimize risk by integrating only a couple of sources at a time. Instead of integrating master data for a particular domain from all existing sources, it might be more advantageous to start with one or two data sources, and in incremental phases, add more as needed. Of course, the actual MDM hub style and architecture chosen will have a direct impact on this. The registry-style hub, for example, does not affect existing data sources, but it obviously requires a data integration effort in the hub itself. Nevertheless, in addition to the typical technical challenges of data integration, it is necessary to review the roadmap, priority, and viability of integrating more sources.

 Integrate more data sources due to mergers and acquisitions. Many companies grow by acquiring other companies. This represents a great data integration challenge because the acquired company might have completely different systems, differing master data attributes and definitions, uneven levels of data quality, and varying levels of existing integration. Furthermore, the integration of the newly acquired company might have to be completed very quickly, which will certainly add to the challenge. If mergers and acquisitions are common, a company must focus on maturing its data integration practice, and methods of continuous improvement are certainly helpful to accomplish this challenging task.

 Integrate more sources consuming data from the MDM hub. Just as it can be less risky to start with fewer sources feeding master data into the MDM hub, it might make sense to have fewer sources use data from the hub at first. Downstream systems might have to be redesigned and rebuilt to accommodate a newly developed master data model. Depending on the overall architecture and how the MDM hub fits into the existing technological and business roadmap, downstream systems such as operational data stores (ODSs) and enterprise data warehouses (EDWs) will not necessarily be integrated with the MDM hub in its initial deployment.

 Add more attributes to existing master data sets. The more sources are integrated, the higher the probability that additional master attributes will be identified. Certain master data attributes from previously integrated sources may not have been in the initial scope but now become important to capture in the MDM hub as well. In either event, when a new master attribute is considered for inclusion in the MDM hub, an upgrade will be necessary to integrate the newly identified attributes. If the new attribute does not affect the identity of the master entity, the change is much easier. If it does affect the identity of the master entity, it will require a revision of the entity resolution process (see the next point).

 Make additions and changes to attributes related to entity resolution. This topic is a special case of the previous point. New attributes are added to the scope of an existing master entity, either because of newly integrated sources or due to increased scope of previously integrated sources. If new attributes affect the identity of an entity or the survivorship rules, it is necessary to review the entity resolution logic (see the sketch after this list). This is certainly more time-consuming than adding other kinds of attributes, which can simply be appended to the existing ones without affecting clustering and survivorship.

 Upgrade from batch to real-time data updates. Real-time MDM is typically more difficult to implement and maintain than batch-mode MDM due to the complex nature of keeping data in sync at all times. Granted, certain domains might not require real-time integration because business requirements can be met even with delayed processing. But there are cases where companies decide to start with batch mode to get the MDM hub operating quickly and opt to convert to real-time processing as their MDM system matures.

 Change business rules. Businesses evolve and change over time, and so do their data definitions, contexts, regulations, processes, and procedures. Data integration can be affected depending on the extent of those changes.
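To make the entity resolution point above concrete, here is a minimal sketch, assuming a hypothetical customer domain in which normalized name and email define identity and the most recently updated record survives. Real MDM hubs use far richer probabilistic matching and survivorship rules; this only illustrates why identity-affecting attributes force a review of clustering logic.

```python
from collections import defaultdict

# Hypothetical source records for a customer domain; all fields are illustrative.
records = [
    {"source": "CRM",     "name": "Ana Silva", "email": "ana@example.com", "updated": "2024-03-01"},
    {"source": "Billing", "name": "Ana Silva", "email": "ana@example.com", "updated": "2024-05-15"},
    {"source": "CRM",     "name": "Bo Chen",   "email": "bo@example.com",  "updated": "2024-01-10"},
]

def match_key(rec):
    # Identity rule (assumption): normalized name + email define the entity.
    # Adding a new identity attribute (e.g., phone) changes this key and
    # forces re-clustering of all previously matched records.
    return (rec["name"].strip().lower(), rec["email"].strip().lower())

# Cluster records that resolve to the same entity.
clusters = defaultdict(list)
for rec in records:
    clusters[match_key(rec)].append(rec)

# Survivorship rule (assumption): the most recently updated record wins.
golden = {key: max(recs, key=lambda r: r["updated"]) for key, recs in clusters.items()}

for key, rec in golden.items():
    print(key, "->", rec["source"], rec["updated"])
```

By contrast, a non-identity attribute could simply be appended to the surviving record without touching match_key at all, which is why such changes are far cheaper.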

All these subjects can greatly benefit from a continuous improvement program applied to data integration. An MDM PMO would provide a roadmap for how some of those areas are expected to evolve, but a continuous improvement program might be better equipped to review, reprioritize, plan, and execute eventual changes over time. As always, data governance participation is critical. A data governance program in place can certainly facilitate and expedite steps when business engagement is required to review and approve modifications.

A continuous improvement program can document processes, guidelines, checklists, and other artifacts to increase the maturity of the company with regard to implementing each of the previously mentioned MDM-related data integration requirements. Using the addition of more data sources over time as an example, a well-documented set of guidelines will certainly help ensure that all necessary steps are completed quickly and successfully. If a company goes through frequent mergers and acquisitions, creating a step-by-step recipe for the process would save a lot of time and money. To be sure, a process to revise and update those guidelines should be in place as well.

Continuous Improvement in DQM

MDM is obviously about the management, governance, and quality control of master data. But the charter and scope of data governance and DQM can be greater than the master data scope. Therefore, continuous improvement in data quality will occur beyond the boundaries of the master data program.

Maturity of DQM is a constant focus. Companies should strive to move from being reactive to data-quality issues to being proactive. Error prevention is preferred over data correction, and such practices are likely to be less costly if they can be feasibly implemented. The following sections list areas that require constant innovation and should be the focus of a continuous data quality improvement program.

Data analysis and profiling. Data analysis and profiling can and should be approached methodically. Therefore, it is important to create a repeatable and efficient process to explore, analyze, and profile data sources and potential issues. But a high degree of tailoring is also required when performing certain types of data analysis and profiling due to the uniqueness of some data scenarios. Both methodical and specific aspects of data analysis and profiling are discussed next.

From a methodical point of view, a data-quality program should establish certain standard types of data-quality checks to follow when analyzing a data source. These types of activities are usually more repeatable and require less resource specialization. Examples of elements to check include the following (a brief code sketch follows the list):

 Completeness of key attributes

 Uniqueness of potential primary key candidates

 Highest and lowest values of numeric attributes

 Frequency distribution of certain types of attributes

 Pattern analysis of attributes that are candidates for standardization

 Data match analysis across sources
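As an illustration of how these standard checks can be scripted, here is a hedged sketch using pandas against a hypothetical customer extract; the column names and rules are assumptions, not a prescribed profiling suite.

```python
import pandas as pd

# Hypothetical extract of a customer source table.
df = pd.DataFrame({
    "customer_id": [101, 102, 103, 103],
    "name":        ["Ana Silva", "Bo Chen", None, "Dee Patel"],
    "balance":     [250.0, -40.0, 1200.5, 99.9],
    "phone":       ["555-0100", "(555) 0101", "555 0102", "5550103"],
})

# Completeness of key attributes.
print("Null rate per column:\n", df.isna().mean())

# Uniqueness of a potential primary key candidate.
print("customer_id unique:", df["customer_id"].is_unique)

# Highest and lowest values of numeric attributes.
print("balance min/max:", df["balance"].min(), df["balance"].max())

# Pattern analysis of an attribute that is a candidate for standardization:
# masking digits reveals the frequency distribution of formats in use.
print("Phone format frequencies:\n",
      df["phone"].str.replace(r"\d", "9", regex=True).value_counts())
```

A reusable script like this covers the repeatable portion of profiling; the more convoluted, source-specific analyses discussed next still require expert tailoring.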

More specific data scenarios and types of analysis are likely to require a high level of expertise and tailoring of data profiling techniques to find the root cause of data issues. Certain issues can be very convoluted and entail a very specific analysis that will go above and beyond any preestablished, step-by-step methods. Quite often, this requires great expertise, not only about data analysis techniques, but also about the existing data and their relationships. Improvement in this area must occur by properly training individuals in data analysis techniques, how the business uses data, and the structure of data at their respective sources.

A great enabler of data analysis and profiling is technology. Technology is constantly evolving and should be regularly evaluated for improving existing practices, as well as for expanding to new topics. For example, technology designed to explore and profile unstructured data is a lot less mature than technology designed to do similar tasks with structured or semistructured data. Therefore, it is important to regularly evaluate what is new in a given area of interest, whether or not it is related to a master data domain. A great deal of data-quality capability is required for the governance of multiple types of data, not only master data.

As companies expand their multi-domain MDM disciplines to other subject areas, it is advantageous to research what data-quality tools are available to enhance and expedite data analysis and profiling of any newly added entities. For example, data profiling of customer master data can be quite different from data profiling of product master data. Certain tools might offer better capabilities in one area than another.

Another important aspect of this idea is the usage of reference data to assess and validate the quality of the existing data. Reference data can be very specific to domains and industries. Here are some examples:

 Credit bureau data can be used to validate a person’s identity.

 U.S. Postal Service data can be used to validate U.S. addresses.

 Industry-specific catalogs can be used for validation. For instance, in the automotive industry, vehicle catalogs are vital to efforts to validate and standardize a vehicle’s make, model, year, and trim.

 Catalogs of companies can be used to validate information about a company, including its legal name, industry classification, and company hierarchy and subsidiaries.

Therefore, regularly researching and evaluating reference data offerings is highly recommended for enhancing and accelerating data-quality-maintenance capabilities.

Finally, the ongoing profiling of production MDM data is also important. The intent is to use data profiling to continue to measure and analyze master data attributes to ensure that previously established rules and the resulting values continue to meet expectations. Details such as which sources are most frequently contributing new values, which sources are being refused, and which fields are becoming more or less distributed in value can help determine whether any adjustments are necessary.

Error prevention and data validation. A lot of times, data-quality issues have a ripple effect, making it difficult to truly measure the total cost of a problem that could have been avoided altogether if proper measures had been taken to prevent it from happening in the first place. Error prevention is generally the most desirable approach, but clearly that cannot be achieved in all situations. Understanding those situations is important to properly plan a continuous improvement program in this area. Here are some situations to consider:

 Technological limitations may prevent certain real-time data validations from occurring due to the inability to collect and present valid options to the user in a timely manner. For example, as users make consecutive selections from multiple interrelated drop-down lists, it might not be viable to access one or more remote systems to dynamically filter invalid values. Data might have to be duplicated locally to allow the timely population of valid options, but this increases maintenance costs and the risk of inconsistencies. Certain system designs and architectures can be more conducive to performing data validation than others. Therefore, as system design and architecture evolve, opportunities to improve error prevention should be sought.

 Reference data for different domains and industries are constantly evolving. Vendors keep advancing their expertise and data offerings in many areas, which can and should be considered when validating data and preventing data issues.

 Mergers and acquisitions require data from one company to be integrated with another. It could entail many systems within a single merger or acquisition. If a company is constantly acquiring other companies, it should work on creating a reliable process to efficiently integrate data, while at the same time validating what is coming in.

 Evolving business rules sometimes require system changes to fulfill new needs. Implementing new error prevention mechanisms may take time. If a company is highly susceptible to these types of changes, it should explore options to improve the speed and efficiency of any change processes associated with them.

Most of the time, there is little dispute that preventing a data-quality issue is the best remedy. But sometimes it is not possible to do that due to high associated costs, technological constraints, highly demanding schedules, or some combination of these. In these situations, it is important to capture the rationale used in the decision process and revisit decisions on a regular basis to evaluate whether conditions have changed and implementation has become viable.
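To illustrate the validation side of error prevention, here is a minimal sketch that checks an incoming record against locally cached reference sets before accepting it. The field names and reference values are hypothetical, and caching reference data locally carries the maintenance and consistency trade-offs noted above.

```python
# Hypothetical reference data cached locally to keep validation fast
# (the trade-off discussed above for real-time scenarios).
VALID_STATES = {"CA", "NY", "TX"}
VALID_SEGMENTS = {"retail", "wholesale"}

def validate_customer(rec):
    """Return a list of validation errors; an empty list means the record is accepted."""
    errors = []
    if rec.get("state") not in VALID_STATES:
        errors.append(f"unknown state: {rec.get('state')!r}")
    if rec.get("segment") not in VALID_SEGMENTS:
        errors.append(f"unknown segment: {rec.get('segment')!r}")
    if not rec.get("name", "").strip():
        errors.append("name is required")
    return errors

print(validate_customer({"name": "Ana Silva", "state": "CA", "segment": "retail"}))  # []
print(validate_customer({"name": "", "state": "ZZ", "segment": "retail"}))           # two errors
```

Rejecting or flagging the record at the point of entry is what makes this prevention rather than correction.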

Data cleansing or data scrubbing. Recall from Chapter 9 that the terms data cleansing and data scrubbing are often used as a catch-all for all sorts of transformations to data to improve their quality. Of course, a company looking to mature its DQM process will continuously look for opportunities to improve its data. It is true that the requirement to correct a certain piece of information will often be directly stated. However, in some cases, cleansing opportunities will arise as a byproduct of other activities. For example, when a bug affecting data quality is found, it is necessary to fix it, both to correct the existing issue and to prevent it from happening in the future. In another example, if a business rule changes, it is necessary to implement one or more changes to support the new rule, and sometimes to modify any existing data to comply with the new rule.

Therefore, from a continuous improvement point of view, it is important to identify situations when data cleansing can potentially arise as a subsequent or indirect requirement. This is part of being a proactive organization. Furthermore, regular data profiling can also help raise awareness of required data cleansing activities that the business may not have noticed before. Mature IT organizations are capable of identifying certain types of data issues before the business does. Keep in mind that data quality is not maintained just for its own sake—it needs to fulfill a purpose. And businesses should always have the final word regarding the level of quality required. But an IT organization that can point out potential problems will go a long way toward meeting quality standards.

Data standardization. Data are distributed across heterogeneous systems, leading to an ongoing battle to consistently represent the same information in the same format. New sources of information are constantly being integrated into existing ones. To require data elements from every system to conform to a particular standard is too much of a stretch. Therefore, data transformations are unavoidable when data are moving across systems.

This integration typically happens in one of two ways:

 Data are moved permanently from one system to another.

 An interface is created to move data regularly from one system to another.

When data is permanently moved from one system to another, data from the source system should be transformed to conform to the standards required by the target system. When an interface is added to move data from one system to another, the logic to conform the data across them needs to be added as data are moved. Refer to Chapter 9 for more details on this.
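As a simple illustration of such conforming logic, here is a sketch of two standardization transforms applied while data move from a source system to a target system; the target conventions shown (digits-only phone numbers, title-cased names) are assumptions chosen for the example.

```python
import re

def standardize_phone(raw):
    # Conform assorted source formats ("(555) 010-0199", "555 0102") to digits only.
    return re.sub(r"\D", "", raw or "")

def standardize_name(raw):
    # Conform casing and whitespace to a single target convention.
    return " ".join((raw or "").split()).title()

# Applied as records move from the source system to the target system.
source_rows = [{"name": "  ana   SILVA ", "phone": "(555) 010-0199"}]
conformed = [
    {"name": standardize_name(r["name"]), "phone": standardize_phone(r["phone"])}
    for r in source_rows
]
print(conformed)  # [{'name': 'Ana Silva', 'phone': '5550100199'}]
```

In a permanent migration, this logic runs once over the moved data; in an ongoing interface, the same transforms run on every transfer, which is why they are worth documenting as reusable standards.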

From a continuous improvement perspective, companies should tackle data standardization from multiple fronts:

 Constantly evolve data standardization as part of data migration projects. In general, data need to be permanently moved from one system to another due to data consolidation, system upgrades, and mergers and acquisitions. Data migration efforts require a repeatable process that can deliver successful and predictable results. Data standardization should be a permanent item in the list of activities required as part of this exercise, with enough details to correctly guide teams on how to identify and conduct data standardization activities during the migration process. Furthermore, this documentation should be regularly updated as the process matures.

 Companies should start and progress their data standardization efforts at their core systems, prioritized by the data elements that bring the most value to the business. Data governance should assist with the identification and prioritization processes. As standards evolve, they become the foundation of a full-fledged enterprise standard catalog, captured via a metadata management tool and published for general usage.

 Any new interface integrated should be evaluated from a data standardization perspective. Data elements not following a standard should either be corrected or properly defended in terms of why they are exempt from conformity. That could be so for a multitude of reasons, such as cost, technology, and risk. Regularly revisiting those decisions is wise because the landscape is constantly changing.

 Technology and reference data improvements. Many standardization efforts are dropped due to a lack of proper technology or reference sources to reliably standardize certain data elements. As technology matures and reference data are expanded, more can be improved in this area. Regular evaluation should be conducted to identify new opportunities.

Data enrichment. In general, data enrichment is accomplished by integrating trusted reference data sources. For example, integrating D&B allows augmentation of customer information; integrating Vertex allows augmentation of tax IDs; integrating the Postal Service allows the addition of the 4-digit extension to the standard 5-digit ZIP code; integrating Chrome allows augmentation of vehicle catalog information; and so on. As stated previously, the number of vendors providing reference data is growing. To continuously improve in this area means to regularly assess what is offered and what areas of the business can be improved by augmented information.
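As a minimal sketch of enrichment through a trusted reference source, consider the following; the reference table is a hypothetical stand-in for a purchased feed such as ZIP+4 data.

```python
import pandas as pd

# Master records before enrichment.
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "zip5":        ["94103", "10001"],
})

# Hypothetical stand-in for a vendor reference feed (e.g., ZIP+4 data).
zip_reference = pd.DataFrame({
    "zip5":   ["94103", "10001"],
    "zip4":   ["1234", "5678"],
    "county": ["San Francisco", "New York"],
})

# Enrichment: augment master records with attributes from the reference source.
enriched = customers.merge(zip_reference, on="zip5", how="left")
print(enriched)
```

The left join preserves every master record even when the reference source has no match, leaving the enrichment columns empty for follow-up rather than silently dropping rows.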

New business opportunities may arise by creatively tapping into these new sources of information. Marketing and sales campaigns can tremendously benefit from additional information about customers and products to improve their predictive analytics and their up-sell and cross-sell techniques.

Data monitoring, scorecards, and dashboards. It is probably easy to see how this category needs continuous improvement. The catalog of data items in need of metrics—monitoring, scorecards, or both—is bound to grow. As companies mature in DQM and become more proactive, they will identify more items as needing regular assessment. As companies mature their data error prevention practices, one may wonder if they will need less monitoring. Typically, though, it is more likely that more data items will require monitoring as time goes by than that data items will become completely error-free. Even as anomalies decrease due to better error prevention, chances are they won’t be completely eliminated. Therefore, the number of anomalies will still need to be measured in many cases. However, while the number of items in need of metrics won’t necessarily go down, the number of violations should.
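To show the difference between the number of monitored items and the number of violations, here is a minimal scorecard sketch; the rules and fields are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "email": ["ana@example.com", None, "bo@example.com"],
    "zip5":  ["94103", "10001", "1000"],  # the last value violates the 5-digit rule
})

# Each monitored item is a named rule; the violation *rate* is what should trend down.
rules = {
    "email_present": df["email"].notna(),
    "zip5_is_5_digits": df["zip5"].str.fullmatch(r"\d{5}").fillna(False),
}

for name, passed in rules.items():
    violation_rate = 1 - passed.mean()
    print(f"{name}: {violation_rate:.1%} violations")
```

Over time, the dictionary of rules grows as more items need metrics, while the printed violation rates are what error prevention practices should drive toward zero.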

Continuous Improvement in Metadata Management

The collection of new metadata and maintenance of existing metadata is a never-ending activity. The key is to properly prioritize sources where metadata would be most valuable, enhance the collection and maintenance process through automation and process improvement, constantly search for new ways to effectively integrate metadata management within other existing processes, and efficiently distribute metadata for better understanding and usage of information throughout the company.

Metadata management is perhaps the least-explored data management capability within a multi-domain MDM. Because companies often start their MDM plans with just a single domain, the number of master elements can be relatively small, and this may not justify having dedicated metadata management. But as the number of domains increases, there are more master data elements that need control and more data integration activities occurring, so the need for a more formalized and centralized metadata management approach also increases.

The bottom line is that the core functions of multi-domain MDM—data governance, data stewardship, data integration, data quality, and metadata management—are all critical for the better usage of information. Furthermore, those five key components feed off each other, improving each other’s efficiency and efficacy.

In the next several sections, let’s look at some specific areas of metadata management that can benefit from a constant focus on improvement.

Sources of metadata. Recall from Chapter 10 the many islands of metadata throughout the company, such as internal and external data sources (structured, semistructured, and unstructured), data models, interfaces and transformations, business applications, business definitions and processes, analytics and reporting, correspondence and legal documents, and business and data-quality rules. Collecting business, technical, and operational metadata related to these sources can tremendously improve the ability to use these sources of information correctly and effectively. Companies are constantly making incorrect decisions because they lack a full understanding of their data assets.

Completing a full picture of all these sources of metadata is a painstaking process. Depending on the size of a company, each of those islands of metadata can have thousands of data attributes, and documenting all the metadata will take a long time. Therefore, it is important to have an efficient process to recognize which sources would most benefit the company if their metadata were exposed. Prioritizing future sources requires understanding what has worked in the past.

Metadata management can suffer quite a bit of resistance from existing organizations because they cannot quite see an immediate benefit of metadata management to what they are doing, or they see metadata management as additional overhead for their already understaffed teams. Therefore, forward-thinking companies may start their metadata management program tied to a data governance office, but initially without full engagement from the business, or even from IT application owners. That will require the metadata team to build a case by exploring islands of metadata on their own to show their value. This will result in hit-or-miss scenarios where certain metadata groupings will offer more benefits than others. Of course, that is not an ideal scenario.

The ultimate goal is to have a fully integrated metadata management discipline, with priorities driven by the added value that they bring to the many organizations within the enterprise. A continuous improvement process should make sure that this happens and metadata management does not become a fad. It is important to learn from what has worked in the past and adjust to what has not. Regular evaluations should be conducted to ensure that proper adjustments are made.

Collection and maintenance of metadata. Metadata is data, and as such, it can suffer from typical data-quality issues, such as duplication, inconsistency, incompleteness, and inaccuracy. It is necessary to realize how important it is to create a sustainable and reliable process to collect and maintain metadata. It is not unusual to see thousands of metadata items become practically unusable because they have become obsolete.

Companies capture quite a bit of metadata on a regular basis, with or without a formal metadata management function. But there is a problem: This information is distributed in many forms. Manually captured metadata is typically stored in unstructured or semistructured form, such as Microsoft Word documents or Microsoft Excel spreadsheets. These documents will usually contain business term definitions, data dictionaries, data mappings, business rules, data-quality requirements, and other elements. In addition, metadata also exists in technological components, such as databases, data-modeling tools, extract, transform, and load (ETL) tools, reporting tools, business rule engines, and so on. Examples of metadata in those sources include data models, data types, data structures, attribute names, data mapping, transformations, business rules, and calculations.

A mature company will have a repository that can capture metadata from all these channels and store them in a single and integrated location for easy retrieval. However, the collection of metadata is truly twofold. The first part is the initial collection of metadata; the second is the continual update of what has been collected. If the collection of metadata can be automated, the ongoing activity is simply a repetition of the initial load process for the purpose of regularly refreshing the metadata repository. But if the process is manual, it is necessary to decide if the maintenance after initial load will be done directly into the metadata repository, or if it will continue at the original source and require constant synchronization.

Let’s use an example to illustrate this point. Assume that before a metadata repository tool is acquired, a team of users maintains data definitions and mappings between systems in spreadsheets. Once a tool is acquired, it is only logical to load the metadata from those spreadsheets into the tool. When the metadata is in the repository, the maintenance team needs to decide if it will make changes directly in the metadata repository or continue to use spreadsheets. There are pros and cons to both options. Using spreadsheets is a known process, but it requires regular refreshes of the repository. Using the metadata tool will require training and availability, but the most up-to-date and unique metadata will be available immediately for distribution.
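A hedged sketch of that initial load might look like the following, using SQLite as a stand-in repository; a real metadata tool would provide its own import interface, and the spreadsheet layout here is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a spreadsheet of business-term definitions.
spreadsheet = io.StringIO(
    "term,definition,steward\n"
    "Customer,An individual or organization that purchases goods,Jane Doe\n"
    "ZIP+4,Five-digit ZIP code plus a four-digit extension,John Roe\n"
)

# Minimal metadata repository: a single table in an in-memory SQLite database.
repo = sqlite3.connect(":memory:")
repo.execute("CREATE TABLE business_terms (term TEXT PRIMARY KEY, definition TEXT, steward TEXT)")

# Initial load; if the team keeps maintaining the spreadsheets, a scheduled
# re-run of this step (INSERT OR REPLACE) becomes the constant synchronization.
rows = [(r["term"], r["definition"], r["steward"]) for r in csv.DictReader(spreadsheet)]
repo.executemany("INSERT OR REPLACE INTO business_terms VALUES (?, ?, ?)", rows)

print(repo.execute("SELECT term, steward FROM business_terms").fetchall())
```

Either maintenance option leads back to this load logic: maintain in the tool and the load runs once, or maintain in spreadsheets and it becomes a recurring refresh job.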

Continuous improvement in collection and maintenance of metadata is about addressing the following main issues:

 Continue to explore tools that can automate the process of collecting metadata. Metadata tools typically offer capabilities to import metadata from certain technologies, but vendors will continuously improve and expand their features. Stay abreast of newly added options.

 Evolve the relationship with producers of metadata. Have them seek to use an enterprise metadata tool in their metadata-collecting process. For metadata management to be most effective, companies need to approach it similarly to the way Wikipedia content is created: multiple people contribute content. Sometimes it is necessary to confirm that the content is right. In those cases, an arbitration process can be led by a data governance program.

Metadata within other processes. There needs to be a metadata team to establish the foundation, manage the metadata repository, define standards, provide expert guidance, and ultimately be accountable for the health and proper delivery of metadata throughout the enterprise. But metadata is documentation, and as such, it needs to be close to the experts generating it and collected within the existing processes that these experts already perform to avoid duplication and rework. For metadata management to be most effective, metadata documentation should be a byproduct of already-existing processes, but captured in a more formal way, following predefined standards and techniques. The resulting metadata can be utilized as a self-feeding artifact within existing processes to improve itself. In addition, the exposure of metadata related to data elements within a process can greatly benefit other interested parties in the company.

Some teams may not agree that the activity of collecting metadata is part of their responsibilities, so they may resist or refuse to become engaged in these efforts. A key element of continuous improvement is finding ways to overcome this resistance. The following actions should be considered to increase adherence to metadata collaboration:

 Work with the data governance council to define and implement an enterprisewide metadata management policy that will help drive specific metadata management expectations and requirements across the enterprise.

 Improve the metadata repository constantly. The more the repository is populated, the more teams will want to be part of it. It is a chicken-and-egg situation. The repository needs contribution from subject matter experts, but these people will resist doing so if they do not see potential return. Metadata teams will have to look for the low-hanging fruit, which means to find areas where metadata can be easily extracted to sell the metadata management idea to other teams.

 Improve training. Much of the time, teams resist certain changes because they are not properly educated about them. Training sessions with resisting teams can help overcome their reluctance and make them more welcoming. Experiences from other metadata user teams can be presented as case histories to show value.

 Propose that a metadata management team member work more closely with subject matter experts in that group to aid the process of collecting metadata information and provide training at the same time.

 Collect metrics from contributing teams to use them as proof of value added, and publish them accordingly.

Metadata management has to become part of the culture. To achieve that goal, it is important to embed metadata practices into existing processes. Organizations are constantly capturing metadata, but they need to formalize it and gather it collectively in a shared repository to be most effective.

Metadata distribution. If metadata is not used, there should be little or no reason to collect it. Companies might sell the idea of the importance of metadata management and allocate the proper resources for collecting metadata, but if consumers do not buy into it, that completely defeats the purpose. Companies will have to go to great lengths to effectively collect and maintain metadata. But the effort cannot end there. They need to make sure the published metadata reaches the right audience.

A metadata repository with an easy-to-use and easily accessible interface is a great start. Training in its use is also very necessary, and it needs to be tailored to specific audiences due to the wide range of user skills. Remember that a metadata repository can have a large number of metadata types and categories. It can include many business, technical, and operational types of metadata, and it will be used by business and technical teams. Therefore, to get the most out of it, its users must know how to find the information they are looking for. If the repository’s usability is poor, they will tend to stop using it, hence compromising the value of the metadata management program. From a continuous improvement point of view, it is important to understand the issues that users are having with the repository and find ways to make it more user-friendly.

Another path to pursue is to integrate the metadata repository with applications or other technological components within the company. For example, some metadata tools can associate a Uniform Resource Locator (URL) address with a particular piece of metadata. This URL can be used within the user interface (UI) screen of another application. When users are navigating through the application, they have the option to click a link associated with a UI element, which will take them to a metadata page with descriptions of that element and other metadata associated with it. This approach will increase the usage of existing metadata and will create demand for more. A continuous improvement process must explore and test alternatives to make metadata information more pervasive and seamless across the enterprise.
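The URL-linking approach can be as simple as the following sketch; the repository URL scheme and element identifiers are hypothetical.

```python
# Hypothetical URL scheme exposed by a metadata repository.
METADATA_BASE_URL = "https://metadata.example.com/elements"

def metadata_link(element_id):
    """Build the deep link a UI screen would attach to one of its elements."""
    return f"{METADATA_BASE_URL}/{element_id}"

# A UI field definition carrying its metadata link alongside its label,
# so the application can render "what does this field mean?" navigation.
ui_field = {
    "label": "Customer Segment",
    "element_id": "customer.segment",
    "help_url": metadata_link("customer.segment"),
}
print(ui_field["help_url"])  # https://metadata.example.com/elements/customer.segment
```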

Conclusion

This chapter discussed the need and opportunities for continuous improvement in a multi-domain MDM program, indicating that continuous improvement is an implied (if not explicit) activity in the MDM program’s roadmap and maturity goals. Continuous improvement is implied because the overall MDM program is expected to mature. However, this is not a given, as continuous improvement requires many enabling factors involving people, process, and technology that need to be examined across each MDM discipline to determine what improvement opportunities can most benefit the program and where lack of improvement will be an inhibiting factor. This chapter pointed out that an MDM program’s momentum is likely to slow or stop if certain enabling factors, capabilities, and improvement opportunities are not achieved because they were not forecast or coordinated well with appropriate planning processes and decision-making groups.

Continuous improvement needs to represent an ongoing focus that can be translated into reasonable and applicable improvement targets that will support the program’s sustainability and maturity. Improvement targets should relate to the program roadmap and be part of annual budget-planning reviews.

Because this chapter concludes this book, it is important to reiterate a few key points about planning and implementing a multi-domain MDM program:

 A multi-domain MDM strategy requires patience, persistence, adaptation, maturity, and the ability to act on opportunities as they emerge across the enterprise.

 Multi-domain MDM can be very much like a jigsaw puzzle—a big picture that you have to piece together. These pieces can come together from various locations in the puzzle, but it should not be a random process.

 The strategy has to consider how to address and improve certain parts of the MDM plan first and others later.

A multi-domain MDM program requires a constant focus on planning, prioritization, and improvement. When these activities are well positioned and well orchestrated, the puzzle pieces will continue to come together to form a big picture that will represent achievement of the program goals and objectives.
