Meta-data Management is the ninth Data Management Function in the data management framework shown in Figures 1.3 and 1.4. It is the eighth data management function that interacts with and is influenced by the Data Governance function. Chapter 11 defines the meta-data management function and explains the concepts and activities involved in meta-data management.
11.1 Introduction
Meta-data is “data about data”, but what exactly does this commonly used definition mean? Meta-data is to data what data is to real life. Data reflects real-life transactions, events, objects, relationships, etc. Meta-data reflects data transactions, events, objects, relationships, etc.
Meta-data Management is the set of processes that ensure proper creation, storage, integration, and control to support associated usage of meta-data.
To understand meta-data’s vital role in data management, draw an analogy to a card catalog in a library. The card catalog identifies what books are stored in the library and where they are located within the building. Users can search for books by subject area, author, or title. Additionally, the card catalog shows the author, subject tags, publication date, and revision history of each book. The card catalog information helps to determine which books will meet the reader’s needs. Without this catalog resource, finding books in the library would be difficult, time-consuming, and frustrating. A reader might search through many wrong books before finding the right one.
Meta-data management, like the other data management functions, is represented in a context diagram. The context diagram for meta-data management, shown in Figure 11.1, is a short-hand representation of the functions described in this chapter. Meta-data management activities are in the center, surrounded by the relevant environmental aspects. Key definitional concepts in meta-data management are at the top of the diagram.
Leveraging meta-data in an organization can provide benefits in the following ways:
Figure 11.1 Meta-data Management Context Diagram
11.2 Concepts and Activities
Meta-data is the card catalog in a managed data environment. Abstractly, meta-data is the descriptive tags or context on the data (the content) in a managed data environment. Meta-data shows business and technical users where to find information in data repositories. Meta-data also provides details on where the data came from, how it got there, any transformations, and its level of quality; and it provides assistance with what the data really means and how to interpret it.
11.2.1 Meta-data Definition
Meta-data is information about the physical data, technical and business processes, data rules and constraints, and logical and physical structures of the data, as used by an organization. These descriptive tags describe data (e.g. databases, data elements, data models), concepts (e.g. business processes, application systems, software code, technology infrastructure), and the connections (relationships) between the data and concepts.
Meta-data is a broad term that includes many potential subject areas. These subject areas include:
11.2.1.1 Types of Meta-data
Meta-data is classified into four major types: business, technical and operational, process, and data stewardship.
Business meta-data includes the business names and definitions of subject and concept areas, entities, and attributes; attribute data types and other attribute properties; range descriptions; calculations; algorithms and business rules; and valid domain values and their definitions. Business meta-data relates the business perspective to the meta-data user.
Examples of business meta-data include:
Technical and operational meta-data provides developers and technical users with information about their systems. Technical meta-data includes physical database table and column names, column properties, other database object properties, and data storage. The database administrator needs to know users’ patterns of access, frequency, and report / query execution time. Capture this meta-data using routines within a DBMS or other software.
Operational meta-data is targeted at IT operations users’ needs, including information about data movement, source and target systems, batch programs, job frequency, schedule anomalies, recovery and backup information, archive rules, and usage.
Examples of technical and operational meta-data include:
Process meta-data is data that defines and describes the characteristics of other system elements (processes, business rules, programs, jobs, tools, etc.).
Examples of process meta-data include:
Data stewardship meta-data is data about data stewards, stewardship processes, and responsibility assignments. Data stewards assure that data and meta-data are accurate, with high quality across the enterprise. They establish and monitor sharing of data.
Examples of data stewardship meta-data include:
11.2.1.2 Meta-data for Unstructured Data
All data is somewhat structured, so the notion of unstructured meta-data is a misnomer. A better term is “meta-data for unstructured data.” Unstructured data is highly structured, although using differing methods. Generally, consider unstructured data to be any data that is not in a database or data file, including documents or other media data. See Chapter 10 for more information on this topic.
Meta-data describes both structured and unstructured data. Meta-data for unstructured data exists in many formats, responding to a variety of different requirements. Examples of meta-data repositories describing unstructured data include content management applications, university websites, company intranet sites, data archives, electronic journals collections, and community resource lists. A common method for classifying meta-data in unstructured sources is to describe them as descriptive meta-data, structural meta-data, or administrative meta-data.
Examples of descriptive meta-data include:
Examples of structural meta-data include:
Examples of administrative meta-data include:
Bibliographic meta-data, record-keeping meta-data, and preservation meta-data are all meta-data schemes applied to documents, but from different focuses. Bibliographic meta-data is the library card of the document. Record-keeping meta-data is concerned with validity and retention. Preservation meta-data is concerned with storage, archival condition, and conservation of material.
11.2.1.3 Sources of Meta-data
Meta-data is everywhere in every data management activity. The identification information on any data is meta-data that is of potential interest to some user group. Meta-data is integral to all IT systems and applications. Use these sources to meet technical meta-data requirements. Create business meta-data through user interaction, definition, and analysis of data. Add quality statements and other observations on the data to the meta-data repository or to source meta-data in IT systems through some support activity. Identify meta-data at an aggregate (such as subject area, system characteristic) or detailed (such as database column characteristic, code value) level. Proper management and navigation between related meta-data is an important usage requirement.
Primary sources of meta-data are numerous—virtually anything named in an organization. Secondary sources are other meta-data repositories, accessed using bridge software. Many data management tools create and use repositories for their own use. Their vendors also provide additional software to enable links to other tools and meta-data repositories, sometimes called bridge applications. However, this functionality mostly enables replication of meta-data between repositories, not true linkages.
11.2.2 Meta-data History 1990 - 2008
In the 1990s, some business managers finally began to recognize the value of meta-data repositories. Newer tools expanded the scope of the meta-data they addressed to include business meta-data. Some of the potential benefits of business meta-data identified in the industry during this period included:
The mid-to-late 1990s saw meta-data become more relevant to corporations that were struggling to understand their information resources. This was mostly due to the pending Y2K deadline, emerging data warehousing initiatives, and a growing focus on the World Wide Web. Efforts began to standardize meta-data definition and exchange between applications in the enterprise.
Examples of standardization include the CASE Definition Interchange Facility (CDIF) developed by the Electronic Industries Alliance (EIA) in 1995, and the Dublin Core Metadata Elements developed by the Dublin Core Metadata Initiative (DCMI) in 1995 in Dublin, Ohio. The first parts of the ISO 11179 standard for Specification and Standardization of Data Elements were published from 1994 through 1999. The Object Management Group (OMG) developed the Common Warehouse Metamodel (CWM) in 1998. Rival Microsoft supported the Meta Data Coalition’s (MDC) Open Information Model in 1995. By 2000, the two standards merged into CWM. Many of the meta-data repositories began promising adoption of the CWM standard.
The early years of the 21st century saw the update of existing meta-data repositories for deployment on the web. Products also introduced some level of support for CWM. During this period, many data integration vendors began focusing on meta-data as an additional product offering. However, relatively few organizations actually purchased or developed meta-data repositories, let alone achieved the ideal of implementing an effective enterprise-wide Managed Meta-data Environment, as defined in Universal Meta-data Models, for several reasons:
As the current decade proceeds, companies are beginning to focus more on the need for, and importance of, meta-data. Focus is also expanding on how to incorporate meta-data beyond the traditional structured sources and include unstructured sources. Some of the factors driving this renewed interest in meta-data management are:
The history of meta-data management tools and products seems to be a metaphor for the lack of a methodological approach to enterprise information management that is so prevalent in organizations. The lack of standards and the proprietary nature of most managed meta-data solutions cause many organizations to avoid focusing on meta-data, limiting their ability to develop a true enterprise information management environment. Increased attention given to information and its importance to an organization’s operations and decision-making will drive meta-data management products and solutions to become more standardized. This driver gives more recognition to the need for a methodological approach to managing information and meta-data.
11.2.3 Meta-data Strategy
A meta-data strategy is a statement of direction in meta-data management by the enterprise. It is a statement of intent and acts as a reference framework for the development teams. Each user group has its own set of needs from a meta-data application. Working through a meta-data requirements development process provides a clear understanding of expectations and the reasons for the requirements.
Build a meta-data strategy from a set of defined components. The primary focus of the meta-data strategy is to gain an understanding of and consensus on the organization’s key business drivers, issues, and information requirements for the enterprise meta-data program. The objective is to understand how well the current environment meets these requirements, both now and in the future.
The objectives of the strategy define the organization’s future enterprise meta-data architecture. They also recommend the logical progression of phased implementation steps that will enable the organization to realize the future vision. Business objectives drive the meta-data strategy, which defines the technology and processes required to meet these objectives. The result of this process is a list of implementation phases driven by business objectives and prioritized by the business value they bring to the organization, combined with the level of effort required to deliver them. The phases include:
11.2.4 Meta-data Management Activities
Effective meta-data management depends on data governance (see Chapter 3) to enable business data stewards to set meta-data management priorities, guide program investments, and oversee implementation efforts within the larger context of government and industry regulations.
11.2.4.1 Understand Meta-data Requirements
A meta-data management strategy must reflect an understanding of enterprise needs for meta-data. These requirements are gathered to confirm the need for a meta-data management environment; to set scope and priorities; to educate and communicate; to guide tool evaluation and implementation, meta-data modeling, internal meta-data standards, and the services that rely on meta-data; and to estimate and justify staffing needs. Obtain these requirements from both business and technical users in the organization. Distill them from an analysis of the roles, responsibilities, challenges, and information needs of selected individuals in the organization, not from asking directly for meta-data requirements.
11.2.4.1.1 Business User Requirements
Business users require improved understanding of the information from operational and analytical systems. Business users require a high level of confidence in the information obtained from corporate data warehouses, analytical applications, and operational systems. They need tailored access per their role to information delivery methods, such as reports, queries, push (scheduled), ad-hoc, OLAP, dashboards, with a high degree of quality documentation and context.
For example, the business term royalty is negotiated by the supplier and is factored into the amount paid by the retailer and, ultimately, by the consumer. These values represent data elements that are stored in both operational and analytical systems, and they appear in key financial reports, OLAP cubes, and data mining models. The definitions, usage, and algorithms need to be accessible when using royalty data. Any meta-data on royalty that is confidential or might be considered competitive information requires controlled use by authorized user groups.
Business users must understand the intent and purpose of meta-data management. To provide meaningful business requirements, users must be educated about the differences between data and meta-data. It is a challenge to keep business users’ focus limited to meta-data requirements versus other data requirements. Facilitated meetings (interviews and / or JAD sessions) with other business users with similar roles (e.g., the finance organization) are a very effective means of identifying requirements and maintaining focus on the meta-data and contextual needs of the user group.
Also critical to meta-data management success is the establishment of a data governance organization. The data governance organization is responsible for setting the direction and goals of the initiative and for making the best decisions regarding products, vendor support, technical architectures, and general strategy. Frequently, the Data Governance Council serves as the governing body for data and meta-data direction and requirements.
11.2.4.1.2 Technical User Requirements
High-level technical requirement topics include:
Technical users include Database Administrators (DBAs), Meta-data Specialists and Architects, IT support staff, and developers. Typically, these are the custodians of the corporate information assets. These users must understand the technical implementation of the data thoroughly, including atomic-level details, data integration points, interfaces, and mappings. Additionally, they must understand the business context of the data at a sufficient level to provide the necessary support, including implementing the calculations or derived data rules and integration programs that the business users specify.
11.2.4.2 Define the Meta-data Architecture
Conceptually, all meta-data management solutions or environments consist of the following architectural layers: meta-data creation / sourcing, meta-data integration, one or more meta-data repositories, meta-data delivery, meta-data usage, and meta-data control / management.
A meta-data management system must be capable of extracting meta-data from many sources. Design the architecture to be capable of scanning the various meta-data sources and periodically updating the repository. The system must support the manual updates of meta-data, requests, searches, and lookups of meta-data by various user groups.
A managed meta-data environment should isolate the end user from the various and disparate meta-data sources. The architecture should provide a single access point for the meta-data repository. The access point must supply all related meta-data resources transparently to the user. Transparent means that the user can access the data without being aware of the differing environments of the data sources.
Design of the architecture of the above components depends on the specific requirements of the organization. Three technical architectural approaches to building a common meta-data repository mimic the approaches to designing data warehouses: centralized, distributed, and hybrid. These approaches all take into account implementation of the repository and how the update mechanisms operate. Each organization must choose the architecture that best suits its needs.
11.2.4.2.1 Centralized Meta-data Architecture
A centralized architecture consists of a single meta-data repository that contains copies of the live meta-data from the various sources. Organizations with limited IT resources, or those seeking to automate as much as possible, may choose to avoid this architecture option. It requires monitoring the new processes and creating a new set of roles in IT to support them. Organizations that prioritize a high degree of consistency and uniformity within the common meta-data repository can benefit from a centralized architecture.
Advantages of a centralized repository include:
Some limitations of the centralized approach include:
11.2.4.2.2 Distributed Meta-data Architecture
A completely distributed architecture maintains a single access point. The meta-data retrieval engine responds to user requests by retrieving data from source systems in real time; there is no persistent repository. In this architecture, the meta-data management environment maintains the necessary source system catalogs and lookup information needed to process user queries and searches effectively. A common object request broker or similar middleware protocol accesses these source systems.
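The distributed approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference design: the class name DistributedMetadataPortal, the adapter functions, and the sample records are all hypothetical, and plain function calls stand in for the middleware protocol. The portal keeps only a catalog of sources and queries each one at request time.

```python
from typing import Callable, Dict, List

# Hypothetical adapter type: given a search term, return matching records.
SourceAdapter = Callable[[str], List[dict]]


class DistributedMetadataPortal:
    """Single access point that queries source systems at request time.

    No meta-data is persisted here; only a catalog of sources is kept.
    """

    def __init__(self) -> None:
        self._catalog: Dict[str, SourceAdapter] = {}

    def register_source(self, name: str, adapter: SourceAdapter) -> None:
        self._catalog[name] = adapter

    def search(self, term: str) -> List[dict]:
        results: List[dict] = []
        for name, adapter in self._catalog.items():
            try:
                for record in adapter(term):
                    results.append({"source": name, **record})
            except ConnectionError:
                # Report an unavailable source instead of failing the search.
                results.append({"source": name, "error": "unavailable"})
        return results


def modeling_tool(term: str) -> List[dict]:
    """Assumed stand-in for a data modeling tool's catalog."""
    entities = [{"name": "Customer", "kind": "entity"}]
    return [e for e in entities if term.lower() in e["name"].lower()]


def warehouse_catalog(term: str) -> List[dict]:
    """Assumed stand-in for a source system that is currently offline."""
    raise ConnectionError("warehouse catalog offline")


portal = DistributedMetadataPortal()
portal.register_source("modeling_tool", modeling_tool)
portal.register_source("warehouse_catalog", warehouse_catalog)
hits = portal.search("customer")
```

Because nothing is stored centrally, a result is only as available as the source behind it, which the offline adapter in the sketch makes visible.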
Advantages of distributed meta-data architecture include:
In addition, the following limitations exist for distributed architectures:
11.2.4.2.3 Hybrid Meta-data Architecture
A combined alternative is the hybrid architecture. Meta-data still moves directly from the source systems into a repository. However, the repository design only accounts for the user-added meta-data, the critical standardized items, and the additions from manual sources.
The architecture benefits from the near-real-time retrieval of meta-data from its source and enhanced meta-data to meet user needs most effectively, when needed. The hybrid approach lowers the effort for manual IT intervention and custom-coded access functionality to proprietary systems. The meta-data is as current and valid as possible at the time of use, based on user priorities and requirements. Hybrid architecture does not improve system availability.
The availability of the source systems is a limit, because the distributed nature of the back-end systems handles processing of queries. Additional overhead is required to link those initial results with meta-data augmentation in the central repository before presenting the result set to the end user.
Organizations that have rapidly changing meta-data, a need for meta-data consistency and uniformity, and a substantial growth in meta-data and meta-data sources, can benefit from a hybrid architecture. Organizations with more static meta-data and smaller meta-data growth profiles may not see the maximum potential from this architecture alternative.
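As a rough sketch of the hybrid lookup described above, the following Python fragment retrieves technical meta-data from a source at request time and then links it with the user-added items held in the repository. The column name, the sample values, and the functions fetch_from_source and hybrid_lookup are assumptions made for illustration only.

```python
from typing import Callable, Dict

# Hypothetical central repository: holds only user-added and standardized items.
repository_augmentation: Dict[str, dict] = {
    "CUST_NM": {
        "business_name": "Customer Name",
        "steward": "Sales Data Steward",
    }
}


def fetch_from_source(column: str) -> dict:
    """Stand-in for a near-real-time lookup against a source system catalog."""
    live_catalog = {"CUST_NM": {"data_type": "VARCHAR(80)", "nullable": False}}
    return live_catalog[column]


def hybrid_lookup(
    column: str, source: Callable[[str], dict] = fetch_from_source
) -> dict:
    """Combine live source meta-data with repository augmentation."""
    live = source(column)  # depends on the source system being available
    extra = repository_augmentation.get(column, {})
    return {"column": column, **live, **extra}


record = hybrid_lookup("CUST_NM")
```

The merge step is the additional overhead the text mentions: the live result must be joined with the repository content before it is presented to the user.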
Another advanced architectural approach is the Bi-Directional Meta-data Architecture, which allows meta-data to change in any part of the architecture (source, ETL, user interface) and then feed back from the repository into its original source. The repository is a broker for all updates. Commercial software packages are in development to include this internal feature, but the standards are still developing.
Various challenges are apparent in this approach. The design forces the meta-data repository to contain the latest version of the meta-data source and forces it to manage changes to the source, as well. Changes must be trapped systematically, then resolved. Additional sets of program / process interfaces to tie the repository back to the meta-data source(s) must be built and maintained.
11.2.4.3 Meta-data Standards Types
Two major types of meta-data standards exist: industry or consensus standards, and international standards. Generally, the international standards are the framework from which the industry standards are developed and executed. A dynamic framework for meta-data standards, courtesy of Ashcomp.com is available on the DAMA International website, www.dama.org. The high-level framework in Figure 11.2 shows how standards are related and how they rely on each other for context and usage. The diagram also gives a glimpse into the complexity of meta-data standards and serves as a starting point for standards discovery and exploration.
Figure 11.2 High Level Standards Framework
11.2.4.3.1 Industry / Consensus Meta-data Standards
Understanding the various standards for the implementation and management of meta-data in industry is essential to the appropriate selection and use of a meta-data solution for an enterprise. One area where meta-data standards are essential is in the exchange of data with operational trading partners. The establishment of the electronic data interchange (EDI) format represents an early meta-data format standard included in EDI tools. Companies realize the value of information sharing with customers, suppliers, partners, and regulatory bodies. Therefore, the need for sharing common meta-data to support the optimal usage of shared information has spawned sector-based standards.
Vendors provide XML support for their data management products for data exchange. They use the same strategy to bind their tools together into suites of solutions. Technologies, including data integration, relational and multidimensional databases, requirements management, business intelligence reporting, data modeling, and business rules, offer import and export capabilities for data and meta-data using XML. While XML support is important, the lack of XML schema standards makes it a challenge to integrate the required meta-data across products. Vendors maintain their proprietary XML schemas and document type definitions (DTD). These are accessed through proprietary interfaces, so integration of these tools into a meta-data management environment still requires custom development.
Some noteworthy industry meta-data standards are:
Figure 11.3 CWM Metamodel
Based on OMG’s standards, Model Driven Architecture (MDA) separates business and application logic from platform technology. A platform-independent model of an application or system’s business functionality and behavior can be realized on virtually any platform using UML and MOF (Meta-Object Facility) technology standards. In this architectural approach, there is a framework for application package vendors to adopt that permits flexibility in the package implementation, so that the product can meet varied market needs. The MDA has less direct impact on an organization’s particular implementation of a package.
Organizations planning for meta-data solution deployment should adopt a set of established meta-data standards early in the planning cycle that are industry-based and sector-sensitive. Use the adopted standard in the evaluation and selection criteria for all new meta-data management technologies. Many leading vendors support multiple standards, and some can assist in customizing industry-based and / or sector-sensitive standards.
11.2.4.3.2 International Meta-data Standards
A key international meta-data standard is the International Organization for Standardization’s ISO / IEC 11179, which describes the standardizing and registering of data elements to make data understandable and shareable.
The purpose of ISO / IEC 11179 is to give concrete guidance on the formulation and maintenance of discrete data element descriptions and semantic content (meta-data) that is useful in formulating data elements in a consistent, standard manner. It also provides guidance for establishing a data element registry.
The standard is important guidance for industry tool developers but is unlikely to be a concern for organizations that implement using commercial tools, since the tools should meet the standards. However, portions of each part of ISO / IEC 11179 may be useful to organizations that want to develop their own internal standards, since the standard contains significant details on each topic.
Relevant parts of the International Standard ISO / IEC 11179 are:
11.2.4.4 Standard Meta-data Metrics
Controlling the effectiveness of the meta-data deployed environment requires measurements to assess user uptake, organizational commitment, and content coverage and quality. Metrics should be primarily quantitative rather than qualitative in nature.
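One way to make content coverage quantitative is to compute the share of repository entries that carry a business definition. The sketch below is only an illustration: the ColumnEntry structure, the sample entries, and the choice of this particular ratio are assumptions, not a metric prescribed by this chapter.

```python
from typing import List, NamedTuple, Optional


class ColumnEntry(NamedTuple):
    """Hypothetical repository entry for one database column."""

    table: str
    column: str
    definition: Optional[str]


def definition_coverage(entries: List[ColumnEntry]) -> float:
    """Share of entries that carry a non-empty business definition."""
    if not entries:
        return 0.0
    documented = sum(1 for e in entries if e.definition and e.definition.strip())
    return documented / len(entries)


entries = [
    ColumnEntry("customer", "cust_id", "Unique customer identifier"),
    ColumnEntry("customer", "cust_nm", "Customer legal name"),
    ColumnEntry("customer", "seg_cd", None),   # no definition recorded
    ColumnEntry("order", "ord_dt", "   "),     # blank text counts as missing
]
coverage = definition_coverage(entries)
```

Tracking such a ratio over time gives a simple trend line for content coverage; it says nothing about whether the definitions are correct, which remains a qualitative review.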
Some suggested metrics on meta-data environments include:
11.2.4.5 Implement a Managed Meta-data Environment
Implement a managed meta-data environment in incremental steps in order to minimize risks to the organization and to facilitate acceptance.
Often, the first implementation is a pilot to prove concepts and learn about managing the meta-data environment. A pilot project has the added complexity of a requirements assessment, strategy development, technology evaluation and selection, and initial implementation cycle that subsequent incremental projects will not have. Subsequent cycles will have roadmap planning, staff training and organization changes, and an incremental rollout plan with assessment and re-assessment steps, as necessary. Integration of meta-data projects into current IS / IT development methodology is necessary.
Topics for communication and planning for a meta-data management initiative include discussions and decisions on the strategies, plans, and deployment, including:
11.2.4.6 Create and Maintain Meta-data
Use of a software package means the data model of the repository does not need to be developed, but it is likely to need tailoring to meet the organization’s needs. If a custom solution is developed, creating the data model for the repository is one of the first design steps after the meta-data strategy is complete and the business requirements are fully understood.
The meta-data creation and update facility provides for the periodic scanning and updating of the repository, in addition to the manual insertion and manipulation of meta-data by authorized users and programs. An audit process validates activities and reports exceptions.
If meta-data is a guide to the data in an organization, then its quality is critical. If data anomalies exist in the organization sources, and if these appear correctly in the meta-data, then the meta-data can guide the user through that complexity. Doubt about the quality of meta-data in the repository can lead to total rejection of the meta-data solution, and the end of any support for continued work on meta-data initiatives. Therefore, it is critical to deal with the quality of the meta-data, not only its movement and consolidation. Of course, quality is also subjective, so business involvement in establishing what constitutes quality in their view is essential.
Low-quality meta-data creates:
High quality meta-data creates:
11.2.4.7 Integrate Meta-data
Integration processes gather and consolidate meta-data from across the enterprise, including meta-data from data acquired outside the enterprise. Integrate extracted meta-data from a source meta-data store with other relevant business and technical meta-data into the meta-data storage facility. Meta-data can be extracted using adaptors / scanners, bridge applications, or by directly accessing the meta-data in a source data store. Adaptors are available with many third-party vendor software tools, as well as from the meta-data integration tool selected. In some cases, adaptors must be developed using the tool APIs.
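Directly accessing the meta-data in a source data store can be as simple as reading the database catalog. The following sketch assumes a SQLite source purely for illustration, and the function name scan_sqlite_catalog and the sample table are hypothetical. It uses SQLite's sqlite_master table and the PRAGMA table_info statement to collect table and column properties.

```python
import sqlite3
from typing import Dict, List


def scan_sqlite_catalog(conn: sqlite3.Connection) -> Dict[str, List[dict]]:
    """Read table and column meta-data directly from SQLite's own catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    scanned: Dict[str, List[dict]] = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        scanned[table] = [
            {
                "column": c[1],
                "data_type": c[2],
                "nullable": not c[3],
                "primary_key": bool(c[5]),
            }
            for c in columns
        ]
    return scanned


# Assumed sample source: an in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_nm TEXT NOT NULL)"
)
catalog = scan_sqlite_catalog(conn)
```

Other database products expose similar catalog views under different names, so a real adaptor would need one such routine per source type.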
Challenges arise in integration that will require some form of appeal through the governance process for resolution. Integrating internal data sets, external data such as Dow Jones or government statistics organizations, and data sourced from non-electronic forms, such as white papers, articles in magazines, or reports, can raise numerous questions on quality and semantics.
Accomplish repository scanning in two distinct manners.
A scanning process produces and leverages several types of files during the process.
Use a non-persistent meta-data staging area to store temporary and backup files. The staging area supports rollback and recovery processes, and provides an interim audit trail to assist repository managers when investigating meta-data source or quality issues. The staging area may take the form of a directory of files or a database. Truncate staging area database tables prior to a new meta-data feed that utilizes the staging table, or timestamp versions of the same storage format.
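The timestamped-file option can be sketched as follows. This is an assumed layout for illustration, not a prescribed one: the file naming pattern, the JSON format, and the functions stage_feed and previous_feed are all hypothetical. Each feed is written under a fixed-width UTC timestamp so that sorting file names also sorts them by time.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


def stage_feed(
    staging_dir: Path, source: str, records: List[dict], stamp: datetime
) -> Path:
    """Write one meta-data feed as a timestamped file in the staging directory."""
    name = f"{source}_{stamp.strftime('%Y%m%dT%H%M%SZ')}.json"
    path = staging_dir / name
    path.write_text(json.dumps(records, indent=2))
    return path


def previous_feed(staging_dir: Path, source: str) -> Optional[Path]:
    """Return the second-newest staged file for a source, for rollback."""
    # Fixed-width timestamps make lexical order equal chronological order.
    files = sorted(staging_dir.glob(f"{source}_*.json"))
    return files[-2] if len(files) >= 2 else None


staging = Path(tempfile.mkdtemp(prefix="md_staging_"))
first = stage_feed(
    staging, "crm", [{"column": "cust_nm"}],
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
second = stage_feed(
    staging, "crm", [{"column": "cust_nm"}, {"column": "seg_cd"}],
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)
rollback_target = previous_feed(staging, "crm")
```

Keeping earlier versions side by side provides the interim audit trail; the alternative of truncating a staging table keeps storage small but gives up that history.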
ETL tools used for data warehousing and Business Intelligence applications are often used effectively in meta-data integration processes.
11.2.4.8 Manage Meta-data Repositories
Implement a number of control activities in order to manage the meta-data environment. Control of repositories is control of meta-data movement and repository updates performed by the meta-data specialist. These activities are administrative in nature and involve monitoring and responding to reports, warnings, job logs, and resolving various issues in the implemented repository environment. Many of the control activities are standard for data operations and interface maintenance.
Control activities include:
11.2.4.8.1 Meta-data Repositories
Meta-data repository refers to the physical tables in which the meta-data are stored. Implement meta-data repositories using an open relational database platform. This allows development and implementation of various controls and interfaces that may not be anticipated at the start of a repository development project.
The repository contents should be generic in design, not merely reflecting the source system database designs. Design contents in alignment with the enterprise subject area experts, and based on a comprehensive meta-data model. The meta-data should be as integrated as possible, since integration will be one of the most direct value-added elements of the repository. It should house current, planned, and historical versions of the meta-data.
For example, the business meta-data definition for Customer could be “Anyone that has purchased a product from our company within one of our stores or through our catalog”. A year later, the company adds a new distribution channel. The company constructs a Web site to allow customers to order products. At that point, the business meta-data definition for customer changes to “Anyone that has purchased a product from our company within one of our stores, through our mail order catalog or through the web.”
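The Customer example can be modeled by keeping each definition with an effective date, so that the repository can answer what a term meant at any point in time. The sketch below is one possible shape, with hypothetical class names and assumed dates (the text says only that the change came a year later).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DefinitionVersion:
    term: str
    text: str
    effective_from: date


class DefinitionHistory:
    """Keeps every version of a business definition, ordered by date."""

    def __init__(self) -> None:
        self._versions: List[DefinitionVersion] = []

    def add(self, version: DefinitionVersion) -> None:
        self._versions.append(version)
        self._versions.sort(key=lambda v: v.effective_from)

    def as_of(self, term: str, when: date) -> Optional[DefinitionVersion]:
        """Return the latest version of the term in effect on the given date."""
        current = None
        for v in self._versions:
            if v.term == term and v.effective_from <= when:
                current = v
        return current


history = DefinitionHistory()
# The effective dates below are assumed for illustration.
history.add(DefinitionVersion(
    "Customer",
    "Anyone that has purchased a product from our company within one of "
    "our stores or through our catalog",
    date(2007, 1, 1),
))
history.add(DefinitionVersion(
    "Customer",
    "Anyone that has purchased a product from our company within one of "
    "our stores, through our mail order catalog or through the web.",
    date(2008, 1, 1),
))
before_web = history.as_of("Customer", date(2007, 6, 30))
after_web = history.as_of("Customer", date(2008, 3, 1))
```

A report built on older data can then be read against the definition that applied when the data was captured, not the current one.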
11.2.4.8.2 Directories, Glossaries and Other Meta-data Stores
A Directory is a type of meta-data store that limits the meta-data to the location or source of data in the enterprise. Tag sources as system of record (it may be useful to use symbols such as “gold”) or other level of quality. Indicate multiple sources in the directory. A directory of meta-data is particularly useful to developers and data super users, such as data stewardship teams and data analysts.
A Glossary typically provides guidance for use of terms, and a thesaurus can direct the user through structural choices involving three kinds of relationships: equivalence, hierarchy, and association. These relationships can be specified against both intra- and inter-glossary source terms. The terms can link to additional information stored in a meta-data repository, synergistically enhancing usefulness.
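The three kinds of thesaurus relationships can be illustrated with a small data structure. This is a hedged sketch, not a reference design: the `Thesaurus` class, the `Relation` names, and the convention of prefixing each term with its glossary source (for example `sales:`) are assumptions made for the example.

```python
from collections import defaultdict
from enum import Enum
from typing import DefaultDict, List, Set, Tuple


class Relation(Enum):
    EQUIVALENT = "equivalent"  # equivalence: the terms mean the same thing
    BROADER = "broader"        # hierarchy: the target is the more general term
    RELATED = "related"        # association: linked, but not synonymous


# Equivalence and association hold in both directions; hierarchy does not.
SYMMETRIC = {Relation.EQUIVALENT, Relation.RELATED}


class Thesaurus:
    def __init__(self) -> None:
        self._links: DefaultDict[str, Set[Tuple[Relation, str]]] = defaultdict(set)

    def relate(self, source: str, relation: Relation, target: str) -> None:
        self._links[source].add((relation, target))
        if relation in SYMMETRIC:
            self._links[target].add((relation, source))

    def lookup(self, term: str, relation: Relation) -> List[str]:
        """Terms that `term` points to through `relation`."""
        return sorted(t for r, t in self._links.get(term, set()) if r is relation)

    def narrower(self, term: str) -> List[str]:
        """Terms that name `term` as their broader term."""
        return sorted(s for s, links in self._links.items()
                      if (Relation.BROADER, term) in links)


# Hypothetical terms drawn from three glossary sources.
thesaurus = Thesaurus()
thesaurus.relate("sales:Customer", Relation.EQUIVALENT, "billing:Client")
thesaurus.relate("sales:Customer", Relation.BROADER, "enterprise:Party")
thesaurus.relate("sales:Customer", Relation.RELATED, "sales:Order")
```

Because the terms are qualified by their source glossary, the same structure covers both intra-glossary links (`sales:Customer` to `sales:Order`) and inter-glossary links (`sales:Customer` to `billing:Client`).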
A multi-source glossary should be capable of the following:
Other meta-data stores include specialized lists such as source lists or interfaces, code sets, lexicons, spatial and temporal schema, spatial reference, distribution of digital geographic data sets, repositories of repositories, and business rules.
11.2.4.9 Distribute and Deliver Meta-data
The meta-data delivery layer delivers meta-data from the repository to end users and to any applications or tools that require meta-data feeds.
Some delivery mechanisms:
The meta-data solution often links to a Business Intelligence solution, so that both the universe and the currency of meta-data in the solution synchronize with the BI contents. The link provides a means of integrating meta-data into the delivery of BI to the end user. Similarly, some CRM or other ERP solutions may require meta-data integration at the application delivery layer.
Occasionally, meta-data is exchanged with external organizations through flat files; however, it is more common for companies to use XML as transportation syntax through proprietary solutions.
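As a rough illustration of XML as a transport syntax, the sketch below serializes a few glossary entries to XML and reads them back, using only the Python standard library. The element and attribute names (`metadataExchange`, `term`, `definition`, `source`) are invented for this example; they do not follow any published exchange schema, and a real exchange would use whatever format the two parties agree on.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def to_xml(entries: List[Dict[str, str]]) -> str:
    """Serialize meta-data entries into a single XML document string."""
    root = ET.Element("metadataExchange")  # hypothetical root element
    for entry in entries:
        term = ET.SubElement(root, "term", name=entry["name"])
        ET.SubElement(term, "definition").text = entry["definition"]
        ET.SubElement(term, "source").text = entry["source"]
    return ET.tostring(root, encoding="unicode")


def from_xml(document: str) -> List[Dict[str, str]]:
    """Parse a document produced by to_xml back into entries."""
    root = ET.fromstring(document)
    return [
        {
            "name": term.get("name"),
            "definition": term.findtext("definition"),
            "source": term.findtext("source"),
        }
        for term in root.findall("term")
    ]


# Hypothetical entries, for illustration only.
entries = [
    {"name": "Customer",
     "definition": "Anyone that has purchased a product from our company",
     "source": "Sales glossary"},
    {"name": "Product",
     "definition": "An item offered for sale",
     "source": "Catalog system"},
]

document = to_xml(entries)
received = from_xml(document)
```

The receiving organization recovers the same entries that the sender exported, which is the minimum that any exchange format, flat file or XML, has to guarantee.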
11.2.4.10 Query, Report and Analyze Meta-data
Meta-data guides how we use data assets. We use meta-data in business intelligence (reporting and analysis), business decisions (operational, tactical, strategic), and in business semantics (what we say, what we mean - ‘business lingo’).
Meta-data guides how we manage data assets. Data governance processes use meta-data to control and govern. Information system implementation and delivery uses meta-data to add, change, delete, and access data. Data integration (operational systems, DW / BI systems) refers to data by its tags or meta-data to achieve that integration. Meta-data controls and audits data, process, and system integration. Database administration is an activity that controls and maintains data through its tags or meta-data layer, as does system and data security management. Some quality improvement activities are initiated through inspection of meta-data and its relationship to associated data.
A meta-data repository must have a front-end application that supports the search-and-retrieval functionality required for all this guidance and management of data assets. The interface provided to business users may have a different set of functional requirements than the one provided to technical users and developers. Some reports, such as change impact analysis, facilitate future development; others, such as data lineage reports, help troubleshoot varying definitions in data warehouse and business intelligence projects.
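Both report types can be viewed as traversals of the same lineage information. The sketch below assumes, purely for illustration, that the repository can supply lineage as a mapping from each data object to the objects it feeds; the object names and function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage: each key feeds the objects listed against it.
LINEAGE: Dict[str, List[str]] = {
    "crm.customer.id": ["staging.customer.cust_id"],
    "staging.customer.cust_id": ["dw.dim_customer.customer_key"],
    "dw.dim_customer.customer_key": ["report.sales_by_customer", "report.churn"],
}


def downstream_impact(lineage: Dict[str, List[str]], start: str) -> Set[str]:
    """Change impact analysis: everything fed, directly or not, by `start`."""
    impacted: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in lineage.get(node, []):
            if target not in impacted:  # visit each object only once
                impacted.add(target)
                queue.append(target)
    return impacted


def upstream_lineage(lineage: Dict[str, List[str]], end: str) -> Set[str]:
    """Data lineage: every object that `end` is derived from."""
    reversed_edges: Dict[str, List[str]] = {}
    for source, targets in lineage.items():
        for target in targets:
            reversed_edges.setdefault(target, []).append(source)
    # Walking the reversed graph downstream is walking the original upstream.
    return downstream_impact(reversed_edges, end)


impact = downstream_impact(LINEAGE, "staging.customer.cust_id")
origins = upstream_lineage(LINEAGE, "report.churn")
```

A change to the staging column affects the warehouse key and both reports, while the lineage of the churn report traces back through the warehouse and staging layers to the CRM source. The front-end application would present these sets as reports rather than raw collections.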
11.3 Summary
The guiding principles for implementing meta-data management in an organization, a summary table of the roles for each meta-data management activity, and the organizational and cultural issues that may arise during meta-data management are summarized below.
11.3.1 Guiding Principles
The guiding principles for establishing a meta-data management function are listed below.
11.3.2 Process Summary
The process summary for the meta-data management function is shown in Table 11.1. The deliverables, responsible roles, approving roles, and contributing roles are shown for each activity in the meta-data management function. The Table is also shown in Appendix A9.
Activities | Deliverables | Responsible Roles | Approving Roles | Contributing Roles |
9.1 Understand Meta-data Requirements (P) | Meta-data requirements | Meta-data Specialists, Data Stewards, Data Architects and Modelers, Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Other IT Professionals, Other DM Professionals |
9.2 Define the Meta-data Architecture (P) | Meta-data architecture | Meta-data Architects, Data Integration Architects | Enterprise Data Architect, DM Leader, CIO, Data Stewardship Committee, Database Administrators | Meta-data Specialists, Other Data Mgmt. Professionals, Other IT Professionals |
9.3 Develop and Maintain Meta-data Standards (P) | Meta-data standards | Meta-data and Data Architects, Data Stewards, Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Other IT Professionals, Other DM Professionals |
9.4 Implement a Managed Meta-data Environment (D) | Meta-data metrics | Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Other IT Professionals |
9.5 Create and Maintain Meta-data (O) | Updated: Reference and Master Data Management Tools | Meta-data Specialists, Data Stewards, Data Architects and Modelers, Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Other IT Professionals |
9.6 Integrate Meta-data (C) | Integrated Meta-data repositories | Integration Data Architects, Meta-data Specialists, Data Stewards, Data Architects and Modelers, Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Other IT Professionals |
9.7 Manage Meta-data Repositories (C) | Managed Meta-data repositories; Administration Principles, Practices, Tactics | Meta-data Specialists, Data Stewards, Data Architects and Modelers, Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Other IT Professionals |
9.8 Distribute and Deliver Meta-data (O) | Distribution of Meta-data; Meta-data Models and Architecture | Database Administrators | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Meta-data Architects |
9.9 Query, Report and Analyze Meta-data (O) | Quality Meta-data; Meta-data Management Operational Analysis; Meta-data Analysis; Data Lineage; Change Impact Analysis | Data Analysts, Meta-data Analysts | Enterprise Data Architect, DM Leader, Data Stewardship Committee | Business Intelligence Specialists, Data Integration Specialists, Database Administrators, Other Data Mgmt. Professionals |
Table 11.1 Meta-data Management Process Summary
11.3.3 Organizational and Cultural Issues
Many organizational and cultural issues exist for a meta-data management initiative. Organizational readiness is a major concern, as are methods for governance and control.
Q1: Meta-data Management is a low priority in many organizations. What are the core arguments or value-add statements for Meta-data management?
A1: Every organization has an essential set of meta-data that needs coordination. Examples include the structures of employee identification data, insurance policy numbers, vehicle identification numbers, or product specifications; changing any of these would require major overhauls of many enterprise systems. Look for a good example where control will yield immediate quality benefits for data in the company, and build the argument from concrete, business-relevant examples.
Q2: How does Meta-data Management relate to Data Governance? Don’t we govern through meta-data rules?
A2: Yes! Meta-data is governed much as data is governed, through principles, policies and effective and active stewardship. Read up on Data Governance in Chapter 3.
11.4 Recommended Reading
The references listed below provide additional reading that supports the material presented in Chapter 11. These recommended readings are also included in the Bibliography at the end of the Guide.
11.4.1 General Reading
Brathwaite, Ken S. Analysis, Design, and Implementation of Data Dictionaries. McGraw-Hill Inc., 1988. ISBN 0-07-007248-5. 214 pages.
Collier, Ken. Executive Report, Business Intelligence Advisory Service, Finding the Value in Metadata Management (Vol. 4, No. 1), 2004. Available only to Cutter Consortium Clients, http://www.cutter.com/bia/fulltext/reports/2004/01/index.html.
Hay, David C. Data Model Patterns: A Metadata Map. Morgan Kaufmann, 2006. ISBN 0-120-88798-3. 432 pages.
Hillmann, Diane I. and Elaine L. Westbrooks, editors. Metadata in Practice. American Library Association, 2004. ISBN 0-838-90882-9. 285 pages.
Inmon, William H., Bonnie O’Neil and Lowell Fryman. Business Metadata: Capturing Enterprise Knowledge. Morgan Kaufmann, 2008. ISBN 978-0-12-373726-7. 314 pages.
Marco, David. Building and Managing the Meta Data Repository: A Full Life-Cycle Guide. John Wiley & Sons, 2000. ISBN 0-471-35523-2. 416 pages.
Marco, David and Michael Jennings. Universal Meta Data Models. John Wiley & Sons, 2004. ISBN 0-471-08177-9. 478 pages.
Poole, John, Dan Chang, Douglas Tolbert and David Mellor. Common Warehouse Metamodel: An Introduction to the Standard for Data Warehouse Integration. John Wiley & Sons, 2001. ISBN 0-471-20052-2. 208 pages.
Poole, John, Dan Chang, Douglas Tolbert and David Mellor. Common Warehouse Metamodel Developer’s Guide. John Wiley & Sons, 2003. ISBN 0-471-20243-6. 704 pages.
Ross, Ronald. Data Dictionaries And Data Administration: Concepts and Practices for Data Resource Management. New York: AMACOM Books, 1981. ISBN 0-814-45596-4. 454 pages.
Tannenbaum, Adrienne. Implementing a Corporate Repository. John Wiley & Sons, 1994. ISBN 0-471-58537-8. 441 pages.
Tannenbaum, Adrienne. Metadata Solutions: Using Metamodels, Repositories, XML, And Enterprise Portals to Generate Information on Demand. Addison Wesley, 2001. ISBN 0-201-71976-2. 528 pages.
Wertz, Charles J. The Data Dictionary: Concepts and Uses, 2nd edition. John Wiley & Sons, 1993. ISBN 0-471-60308-2. 390 pages.
11.4.2 Meta-data in Library Science
Baca, Murtha, editor. Introduction to Metadata: Pathways to Digital Information. Getty Information Institute, 2000. ISBN 0-892-36533-1. 48 pages.
Hillmann, Diane I., and Elaine L. Westbrooks. Metadata in Practice. American Library Association, 2004. ISBN 0-838-90882-9. 285 pages.
Karpuk, Deborah. METADATA: From Resource Discovery to Knowledge Management. Libraries Unlimited, 2007. ISBN 1-591-58070-6. 275 pages.
Liu, Jia. Metadata and Its Applications in the Digital Library. Libraries Unlimited, 2007. ISBN 1-291-58306-6. 250 pages.
11.4.3 Geospatial Meta-data Standards
http://www.fgdc.gov/metadata/geospatial-metadata-standards.
11.4.4 ISO Meta-data Standards
ISO Standards Handbook 10, Data Processing—Vocabulary, 1982.
ISO 704:1987, Principles and methods of terminology.
ISO 1087, Terminology—Vocabulary.
ISO 2382-4:1987, Information processing systems—Vocabulary part 4.
ISO/IEC 10241:1992, International Terminology Standards—Preparation and layout.
FCD 11179-2, Information technology—Specification and standardization of data elements - Part 2: Classification for data elements.
ISO/IEC 11179-3:1994, Information technology—Specification and standardization of data elements - Part 3: Basic attributes of data elements.
ISO/IEC 11179-4:1995, Information technology—Specification and standardization of data elements - Part 4: Rules and guidelines for the formulation of data definitions.
ISO/IEC 11179-5:1995, Information technology—Specification and standardization of data elements - Part 5: Naming and identification principles for data elements.
ISO/IEC 11179-6:1997, Information technology—Specification and standardization of data elements - Part 6: Registration of data elements.