4 Data Architecture Management

Data Architecture Management is the second data management function in the Data Management Framework shown in Figures 1.3 and 1.4. It is the first data management function that interacts with and is influenced by the data governance function. Chapter 4 defines the data architecture management function and explains the concepts and activities involved in data architecture management.

4.1 Introduction

Data Architecture Management is the process of defining and maintaining specifications that:

  • Provide a standard common business vocabulary,
  • Express strategic data requirements,
  • Outline high level integrated designs to meet these requirements, and
  • Align with enterprise strategy and related business architecture.

Data architecture is an integrated set of specification artifacts used to define data requirements, guide integration and control of data assets, and align data investments with business strategy. It is also an integrated collection of master blueprints at different levels of abstraction. Data architecture includes formal data names, comprehensive data definitions, effective data structures, precise data integrity rules, and robust data documentation.

Data architecture is most valuable when it supports the information needs of the entire enterprise. Enterprise data architecture enables data standardization and integration across the enterprise. This chapter will focus on enterprise data architecture, although the same techniques apply to the more limited scope of a specific function or department within an organization.

Enterprise data architecture is part of the larger enterprise architecture, where data architecture integrates with other business and technology architecture. Enterprise architecture integrates data, process, organization, application, and technology architecture. It helps organizations manage change and improve effectiveness, agility, and accountability.

The context of the Data Architecture Management function is shown in the diagram in Figure 4.1.

Figure 4.1 Data Architecture Management Diagram

Enterprise data architecture is an integrated set of specifications and documents. It includes three major categories of specifications:

  1. The enterprise data model: The heart and soul of enterprise data architecture,
  2. The information value chain analysis: Aligns data with business processes and other enterprise architecture components, and
  3. Related data delivery architecture: Including database architecture, data integration architecture, data warehousing / business intelligence architecture, document content architecture, and meta-data architecture.

Enterprise data architecture is really a misnomer. It is about more than just data; it is also about terminology. Enterprise data architecture defines standard terms for the things that are important to the organization: things important enough to the business that data about them is necessary to run the business. These things are business entities. Perhaps the most important and beneficial aspect of enterprise data architecture is establishing a common business vocabulary of business entities and the data attributes (characteristics) that matter about these entities. Enterprise data architecture defines the semantics of an enterprise.

4.2 Concepts and Activities

Chapter 1 stated that data architecture management is the function of defining the blueprint for managing data assets. Data architects play a key role in the critical function of data architecture management. The concepts and activities related to data architecture management and the roles of data architects are presented in this section.

4.2.1 Architecture Overview

Architecture is an organized arrangement of component elements, which optimizes the function, performance, feasibility, cost, and / or aesthetics of the overall structure or system. The word “architecture” is one of the most widely used terms in the information technology field. “Architecture” is a very evocative word: the analogy between designing buildings and designing information systems is extremely useful. Architecture is an integrated set of closely related views reflecting the issues and perspectives of different stakeholders. Understanding the architecture of something enables people to make some limited sense of something very complex, whether it is a natural thing (geological formations, mathematics, living organisms) or a human-made thing (including buildings, music, machines, organizations, processes, software, and databases).

Understanding building blueprints helps contractors build safe, functional, and aesthetically pleasing buildings within cost and time constraints. Studying anatomy (the architecture of living things) helps medical students learn how to provide medical care. People and organizations benefit from architecture when structures and systems become complex. The more complex the system, the greater the benefit derived from architecture.

Architecture may exist at different levels, from the macro-level of urban planning to the micro-level of creating machine parts. At each level, standards and protocols help ensure components function together as a whole. Architecture includes standards and their application to specific design needs.

In the context of information systems, architecture is “the design of any complex technical object or system”.

Technology is certainly complex. The field of information technology greatly benefits from architectural designs that help manage complexity in hardware and software products. Technology architecture includes both “closed” design standards specific to a particular technology vendor and “open” standards available to any vendor.

Organizations are also complex. Integrating the disparate parts of an organization to meet strategic enterprise goals often requires an overall business architecture, which may include common designs and standards for business processes, business objectives, organizational structures, and organizational roles. For organizations, architecture is all about integration. Organizations that grow by acquisition face significant integration challenges and so greatly benefit from effective architecture.

Information systems are certainly very complex. Adding more and more relatively simple isolated applications, and building tactical approaches to moving and sharing data between these “silo” applications has made the application system portfolio of most organizations resemble a plate of spaghetti. The cost of understanding and maintaining this complexity grows, and the benefits of restructuring applications and databases according to an overall architecture become more and more attractive.

4.2.1.1 Enterprise Architecture

Enterprise Architecture is an integrated set of business and IT specification models and artifacts reflecting enterprise integration and standardization requirements. Enterprise architecture defines the context for business integration of data, process, organization, and technology, and the alignment of enterprise resources with enterprise goals. Enterprise architecture encompasses both business architecture and information systems architecture.

Enterprise architecture provides a systematic approach to managing information and systems assets, addressing strategic business requirements, and enabling informed portfolio management of the organization’s projects. Enterprise architecture supports strategic decision-making by helping manage change and by tracing both the impact of organizational change on systems and the business impact of changes to systems.

Enterprise architecture includes many related models and artifacts:

  • Information architecture: Business entities, relationships, attributes, definitions, reference values.
  • Process architecture: Functions, activities, workflow, events, cycles, products, procedures.
  • Business architecture: Goals, strategies, roles, organization structures, locations.
  • Systems architecture: Applications, software components, interfaces, projects.
  • Technology architecture: Networks, hardware, software platforms, standards, protocols.
  • Information value chain analysis artifacts: Mapping the relationships between data, process, business, systems, and technology.

Enterprise models generate most of the related artifacts from integrated specifications. Artifacts include graphical diagrams, tables, analysis matrices, and textual documents. These artifacts describe how the organization operates and what resources are required, in varying degrees of detail. Specifications should be traceable to the goals and objectives they support, and should conform to content and presentation standards. Few, if any, organizations have a comprehensive enterprise architecture including every potential component model and artifact.

Enterprise Architecture often distinguishes between the current “as-is” and the target “to be” perspectives, and sometimes includes intermediate stages and migration plans. Some enterprise architecture attempts to identify an ideal state as a reference model, and the target model is defined as a pragmatic, attainable step towards the ideal state. Keep enterprise architecture specifications of present state and future state current, in order to stay relevant and useful. No organization is ever completely done maintaining and enriching its enterprise architecture.

Each organization invests in developing and maintaining enterprise architecture based on its understanding of business need and business risk. Some organizations elect to define enterprise architecture in detail in order to manage risks better.

Enterprise architecture is a significant knowledge asset providing several benefits. It is a tool for planning, IT governance, and portfolio management. Enterprise architecture can:

  • Enable integration of data, processes, technologies, and efforts.
  • Align information systems with business strategy.
  • Enable effective use and coordination of resources.
  • Improve communication and understanding across the organization.
  • Reduce the cost of managing the IT infrastructure.
  • Guide business process improvement.
  • Enable organizations to respond effectively to changing market opportunities, industry challenges, and technological advances.

Enterprise architecture helps evaluate business risk, manage change, and improve business effectiveness, agility, and accountability.

Methods for defining enterprise architecture include IBM’s Business Systems Planning (BSP) method and the Information Systems Planning (ISP) from James Martin’s information engineering method.

4.2.1.2 Architectural Frameworks

Architectural frameworks provide a way of thinking about and understanding architecture, and the structures or systems requiring architecture. Architecture is complex, and architectural frameworks provide an overall “architecture for architecture.”

There are two different kinds of architectural frameworks:

  • Classification frameworks organize the structure and views that encompass enterprise architecture. Frameworks define the standard syntax for the artifacts describing these views and the relationships between these views. Most artifacts are diagrams, tables, and matrices.
  • Process frameworks specify methods for business and systems planning, analysis, and design processes. Some IT planning and software development lifecycle (SDLC) methods include their own composite classifications. Not all process frameworks specify the same set of things, and some are highly specialized.

The scope of architectural frameworks is not limited to information systems architecture. Architectural frameworks help define the logical, physical, and technical artifacts produced in software analysis and design, which guide the solution design for specific information systems. Organizations adopt architectural frameworks for IT governance and architecture quality control. Organizations may mandate delivery of certain artifacts before approval of a system design.

Many frameworks are in existence, such as:

  • TOGAF: The Open Group Architecture Framework is a process framework and standard software development lifecycle (SDLC) method developed by The Open Group, a vendor and technology neutral consortium for defining and promoting open standards for global interoperability. TOGAF Version 8 Enterprise Edition (TOGAF 8) may be licensed by any organization, whether a member or a non-member of The Open Group.
  • ANSI / IEEE 1471-2000: A Recommended Practice for Architecture Description of Software-Intensive Systems, on track to become the ISO / IEC 25961 standard, defines solution design artifacts.

Some consulting firms have developed useful proprietary architectural frameworks. Several governments and defense departments have also developed architectural frameworks.

4.2.1.3 The Zachman Framework for Enterprise Architecture

The Zachman Enterprise Framework²™ is the most widely known and adopted architectural framework. Enterprise data architects, in particular, have accepted and used this framework since Zachman’s first published description of the Framework in an IBM Systems Journal article in 1987.

The Zachman Enterprise Framework²™, shown in Figure 4.2, has oriented the terminology towards business management, while retaining the elaborations used by the data and information systems communities. The terms for the perspective contributors (right hand row labels), the affirmation of the perspective content (left hand row labels), and the identification of the generic answers to each of the Questions (column footer labels) bring a level of clarification and understanding for each simple classification.

Figure 4.2 The Zachman Enterprise Framework²™

(Licensed for use by DAMA International in the DAMA-DMBOK Guide)

Modeling the enterprise architecture is a common practice within the U.S. Federal Government to inform its Capital Planning and Investment Control (CPIC) process. The Clinger-Cohen Act (CCA, or the Information Technology Management Reform Act of 1996) requires all U.S. federal agencies to have and use formal enterprise architecture.

Access to the new Enterprise Architecture Standards and the Zachman Enterprise Framework²™ graphics is available at no cost via registration at www.ZachmanInternational.com. A Concise Definition of the Framework, written by John Zachman, is also on that site.

According to its creator, John Zachman, the Framework is a logical structure for identifying and organizing descriptive representations (models) used to manage enterprises and develop systems. In fact, the Zachman Framework is a generic classification schema of design artifacts for any complex system. The Zachman Framework is not a method defining how to create the representations of any cell. It is a structure for describing enterprises and architectural models.

To understand systems architecture, Zachman studied how the fields of building construction and aerospace engineering define complex systems, and mapped information systems artifacts against these examples. The Zachman Framework is a 6 by 6 matrix representing the intersection of two classification schemas, which form the two dimensions of systems architecture.

In the first dimension, Zachman recognized that in creating buildings, airplanes, or systems, there are many stakeholders, and each has different perspectives about “architecture”. The planner, owner, designer, builder, implementer, and participant each have different issues to identify, understand, and resolve. Zachman depicted these perspectives as rows.

  • The planner perspective (Scope Contexts): Lists of business elements defining scope identified by Strategists as Theorists.
  • The owner perspective (Business Concepts): Semantic models of the business relationships between business elements defined by Executive Leaders as Owners.
  • The designer perspective (System Logic): Logical models detailing system requirements and unconstrained design represented by Architects as Designers.
  • The builder perspective (Technology Physics): Physical models optimizing the design for implementation for specific use under the constraints of specific technology, people, costs, and timeframes specified by Engineers as Builders.
  • The implementer perspective (Component Assemblies): A technology-specific, out-of-context view of how components are assembled and operate configured by Technicians as Implementers.
  • The participant perspective (Operations Classes): Actual functioning system instances used by Workers as Participants.

For the second dimension, each perspective’s issues required different ways to answer the fundamental questions posed by the basic interrogatives of communication: who, what, why, when, where and how. Each question required answers in different formats. Zachman depicted each fundamental question as a column.

The revised labels for each column are in parentheses:

  • What (the data column): Materials used to build the system (Inventory Sets).
  • How (the function column): Activities performed (Process Transformations).
  • Where (the network column): Locations, topography, and technology (Network Nodes).
  • Who (the people column): Roles and organizations (Organization Groups).
  • When (the time column): Events, cycles, and schedules (Time Periods).
  • Why (the goal column): Goals, strategies, and initiatives (Motivation Reasons).

Each cell in the Zachman Framework represents a unique type of design artifact, defined by the intersection of its row and column.

While the columns in the Framework are not in any order of importance, the order of the rows is significant. Within each column, the contents of each cell constrain the contents of the cells below it. The transformation from perspective to perspective ensures alignment between the intentions of enterprise owners and subsequent decisions.
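As a rough illustration of this structure, the sketch below models the Framework as a grid keyed by row and column, using the row and column labels quoted above. The function names (`all_cells`, `cells_constrained_by`) are hypothetical and exist only for this example; the sketch says nothing about what the model in each cell should contain.

```python
from itertools import product

# Row labels (perspectives) from top to bottom, as listed in the text above.
ROWS = [
    "Scope Contexts",
    "Business Concepts",
    "System Logic",
    "Technology Physics",
    "Component Assemblies",
    "Operations Classes",
]

# Column interrogatives mapped to their revised labels, as listed above.
COLUMNS = {
    "What": "Inventory Sets",
    "How": "Process Transformations",
    "Where": "Network Nodes",
    "Who": "Organization Groups",
    "When": "Time Periods",
    "Why": "Motivation Reasons",
}


def all_cells():
    """Return every (row, interrogative) pair; each pair identifies one cell."""
    return list(product(ROWS, COLUMNS))


def cells_constrained_by(row, interrogative):
    """Return the cells below the given cell in the same column.

    Row order is significant: a cell constrains the cells beneath it.
    """
    if interrogative not in COLUMNS:
        raise KeyError(interrogative)
    start = ROWS.index(row) + 1
    return [(lower, interrogative) for lower in ROWS[start:]]


if __name__ == "__main__":
    print(len(all_cells()))  # 36 cells in a 6 by 6 grid
    for cell in cells_constrained_by("System Logic", "What"):
        print(cell)
```

Keeping the rows in an ordered list and the columns in a mapping mirrors the point made above: row order carries meaning, while column order does not.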

Each cell describes a primitive model, limited in focus to the column’s single perspective. The granularity of detail in the Zachman Framework is a property of any individual cell regardless of the row. Depending on the need, each cell model may contain relatively little detail or an “excruciating” level of detail. The greater the integration needs, the more detail is needed in order to remove ambiguity.

No architectural framework is inherently correct or complete, and adopting any architectural framework is no guarantee of success. Some organizations and individuals adopt the Zachman Framework as a “thinking tool”, while others use it as the Engineering Quality Assurance mechanism for solutions implementation.

There are several reasons why the Zachman Framework has been so widely adopted:

  • It is relatively simple since it has only two dimensions and is easy to understand.
  • It both addresses the enterprise in a comprehensive manner, and manages architecture for individual divisions and departments.
  • It uses non-technical language to help people think and communicate more precisely.
  • It can be used to frame and help understand a wide array of issues.
  • It helps solve design problems, focusing on details without losing track of the whole.
  • It helps teach many different information systems topics.
  • It is a helpful planning tool, providing the context to guide better decisions.
  • It is independent of specific tools or methods. Any design tool or method can map to the Framework to see what the tool or method does and does NOT do.

4.2.1.4 The Zachman Framework and Enterprise Data Architecture

The enterprise data architecture is an important part of the larger enterprise architecture that includes process, business, systems, and technology architecture. Data architects focus on the enterprise data architecture, working with other enterprise architects to integrate data architecture into a comprehensive enterprise architecture.

Enterprise data architecture typically consists of three major sets of design components:

  1. An enterprise data model, identifying subject areas, business entities, the business rules governing the relationships between business entities, and at least some of the essential business data attributes.
  2. The information value chain analysis, aligning data model components (subject areas and / or business entities) with business processes and other enterprise architecture components, which may include organizations, roles, applications, goals, strategies, projects, and / or technology platforms.
  3. Related data delivery architecture, including data technology architecture, data integration architecture, data warehousing / business intelligence architecture, enterprise taxonomies for content management, and meta-data architecture.
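One simple way to picture the information value chain analysis in item 2 is as a matrix relating business entities to business processes. The sketch below is only an illustration under assumed data: the process names, entity names, and the helper functions `build_matrix` and `unaligned_entities` are all hypothetical, and a real analysis may relate entities to many other architecture components as well.

```python
# Hypothetical mapping of business processes to the business entities they use.
PROCESS_ENTITIES = {
    "Order Fulfillment": {"Customer", "Order", "Product"},
    "Billing": {"Customer", "Invoice", "Order"},
    "Catalog Maintenance": {"Product"},
}

# Hypothetical list of entities defined in the enterprise data model.
MODEL_ENTITIES = ["Customer", "Invoice", "Order", "Product", "Supplier"]


def build_matrix(process_entities, entities):
    """Return {entity: {process: bool}} marking which process uses which entity."""
    return {
        entity: {
            process: entity in used
            for process, used in process_entities.items()
        }
        for entity in entities
    }


def unaligned_entities(matrix):
    """Return entities that no process uses; these are candidates for review."""
    return [entity for entity, row in matrix.items() if not any(row.values())]


if __name__ == "__main__":
    matrix = build_matrix(PROCESS_ENTITIES, MODEL_ENTITIES)
    for entity, row in matrix.items():
        marks = ["X" if used else "." for used in row.values()]
        print(f"{entity:<10} {' '.join(marks)}")
    print("Not aligned with any process:", unaligned_entities(matrix))
```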

The cells in the first “data” column, now known as “Inventory Sets”, represent familiar data modeling and database design artifacts (see Chapter 5 for more detail).

  • Planner View (Scope Contexts): A list of subject areas and business entities.
  • Owner View (Business Concepts): Conceptual data models showing the relationships between entities.
  • Designer View (System Logic): Fully attributed and normalized logical data models.
  • Builder View (Technology Physics): Physical data models optimized for constraining technology.
  • Implementer View (Component Assemblies): Detailed representations of data structures, typically in SQL Data Definition Language (DDL).
  • Functioning Enterprise: Actual implemented instances.
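To make the last two views in the list above concrete, the sketch below runs a small DDL statement and then stores one instance. It is a minimal illustration, not a recommended design: the `customer` table, its columns, and the choice of SQLite (via Python's standard `sqlite3` module) are assumptions made only for this example.

```python
import sqlite3

# Hypothetical DDL for a Customer entity (Implementer View).
# Table name, column names, and types are illustrative assumptions.
CUSTOMER_DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    status_code   TEXT NOT NULL CHECK (status_code IN ('A', 'I'))
)
"""


def create_schema(connection):
    """Apply the DDL to an open database connection."""
    connection.execute(CUSTOMER_DDL)


def table_names(connection):
    """Return the names of all tables in the database, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    # An actual instance (Functioning Enterprise view).
    conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd', 'A')")
    print(table_names(conn))
    print(conn.execute("SELECT * FROM customer").fetchall())
```

The `CHECK` constraint stands in for the kind of data integrity rule that a physical design carries down from the logical model.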

The Zachman Framework enables concentration on selected cells without losing sight of the “big picture.” It helps designers focus on details while still seeing the overall context, thereby building the “big picture” piece by piece.

4.2.2 Activities

The data architecture management function contains several activities related to defining the blueprint for managing data assets. An overview of each of these activities is presented in the following sections.

4.2.2.1 Understanding Enterprise Information Needs

In order to create an enterprise data architecture, the enterprise needs to first define its information needs. An enterprise data model is a way of capturing and defining enterprise information needs and data requirements. It represents a master blueprint for enterprise-wide data integration. The enterprise data model is therefore a critical input to all future systems development projects and the baseline for additional data requirements analysis and data modeling efforts undertaken at the project level.

Project conceptual and logical data models are based on the applicable portions of the enterprise data model. Some projects will benefit more from the enterprise data model than others will, depending on the project scope. Virtually every important project will benefit from, and affect, the enterprise data model.

One way of determining enterprise information needs is to evaluate the current inputs and outputs required by the organization, both from and to internal and external sources and targets. Use actual system documentation and reports, and interview the participants. This material provides a list of important data entities, data attributes, and calculations. Organize these items by business unit and subject area. Review the list with the participants to ensure proper categorization and completeness. The list then becomes the basic requirements for an enterprise data model.
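The organizing step described above can be sketched as a simple grouping. The collected items, business units, and subject areas below are invented for illustration, and the function name `organize` is hypothetical; in practice the list would come from system documentation, reports, and interviews.

```python
from collections import defaultdict

# Hypothetical items gathered from reports and interviews.
COLLECTED_ITEMS = [
    {"item": "Customer Name", "business_unit": "Sales", "subject_area": "Customer"},
    {"item": "Order Total", "business_unit": "Sales", "subject_area": "Order"},
    {"item": "Invoice Amount", "business_unit": "Finance", "subject_area": "Invoice"},
    {"item": "Customer Credit Limit", "business_unit": "Finance", "subject_area": "Customer"},
    {"item": "Order Date", "business_unit": "Sales", "subject_area": "Order"},
]


def organize(items):
    """Group item names by business unit, then by subject area."""
    grouped = defaultdict(lambda: defaultdict(list))
    for entry in items:
        grouped[entry["business_unit"]][entry["subject_area"]].append(entry["item"])
    # Convert to plain dictionaries with sorted item lists for stable review output.
    return {
        unit: {area: sorted(names) for area, names in areas.items()}
        for unit, areas in grouped.items()
    }


if __name__ == "__main__":
    for unit, areas in organize(COLLECTED_ITEMS).items():
        print(unit)
        for area, names in areas.items():
            print(f"  {area}: {', '.join(names)}")
```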

4.2.2.2 Develop and Maintain the Enterprise Data Model

Business entities are classes of real business things and concepts. Data is the set of facts we collect about business entities. Data models define these business entities and the kinds of facts (data attributes) needed about these entities to operate and guide the business. Data modeling is an analysis and design method used to:

  1. Define and analyze data requirements, and
  2. Design logical and physical data structures that support these requirements.

A data model is a set of data specifications and related diagrams that reflect data requirements and designs. An enterprise data model (EDM) is an integrated, subject-oriented data model defining the essential data produced and consumed across an entire organization.

  • Integrated means that all of the data and rules in an organization are depicted once, and fit together seamlessly. The concepts in the model fit together as the CEO sees the enterprise, not reflecting separate and limited functional or departmental views. There is only one version of the Customer entity, one Order entity, etc. Every data attribute also has a single name and definition. The data model may additionally identify common synonyms and important distinctions between different sub-types of the same common business entity.
  • Subject-oriented means the model is divided into commonly recognized subject areas that span across multiple business processes and application systems. Subject areas focus on the most essential business entities.
  • Essential means the data critical to the effective operation and decision-making of the organization. Few, if any, enterprise data models define all the data within an enterprise. Essential data requirements may or may not be common to multiple applications and projects. Multiple systems may share some data defined in the enterprise data models, but other data may be critically important, yet created and used within a single system. Over time, the enterprise data model should define all data of importance to the enterprise. The definition of essential data will change over time as the business changes; the EDM must stay up-to-date with those changes.

Data modeling is an important technique used in Data Architecture Management and Data Development. Data Development implements data architecture, extending and adapting enterprise data models to meet specific business application needs and project requirements.

4.2.2.2.1 The Enterprise Data Model

The enterprise data model is an integrated set of closely related deliverables. Most of these deliverables are generated using a data modeling tool, but no data modeling tool can create all of the potential component deliverables of a complete enterprise data model. The central repository of the enterprise data model is either a data model file or a data model repository, either of which is created and maintained by the data modeling tool. This model artifact is included in meta-data and is discussed in depth in Chapter 11 on Meta-data Management. Few organizations create all the component artifacts of a comprehensive enterprise data model.

An enterprise data model is a significant investment in defining and documenting an organization’s vocabulary, business rules, and business knowledge. Creating, maintaining, and enriching it require continuing investments of time and effort, even if starting with a purchased industry data model. Enterprise data modeling is the development and refinement of a common, consistent view, and an understanding of data entities, data attributes, and their relationships across the enterprise.

Organizations can purchase an enterprise data model, or build it from scratch. There are several vendors with industry standard logical data models. Most large database vendors include them as additional products. However, no purchased logical data model will be perfect out-of-the-box. Some customization is always involved.

Enterprise data models differ widely in terms of level of detail. When an organization first recognizes the need for an enterprise data model, it must make decisions regarding the time and effort that can be devoted to building it. Over time, as the needs of the enterprise demand, the scope and level of detail captured within an enterprise data model typically expand. Most successful enterprise data models are built incrementally and iteratively.

Build an enterprise data model in layers, as shown in Figure 4.3, focusing initially on the most critical business subject areas. The higher layers are the most fundamental, with lower layers dependent on the higher layers. In this respect, the enterprise data model is built top-down, although the contents of the model often benefit from bottom-up input. Such input is the result of analyzing and synthesizing the perspectives and details of existing logical and physical data models. Integrate such input into the enterprise perspective; the influence of existing models must not compromise the development of a common, shared enterprise viewpoint.

4.2.2.2.2 The Subject Area Model

The highest layer in an enterprise data model is a subject area model (SAM). The subject area model is a list of major subject areas that collectively express the essential scope of the enterprise. This list is one form of the “scope” view of data (Row 1, Column 1) in the Zachman Framework. At a more detailed level, business entities and object classes can also be depicted as lists.

There are two main ways to communicate a subject area model:

  • An outline, which organizes smaller subject areas within larger subject areas.
  • A diagram that presents and organizes the subject areas visually for easy reference.
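The outline form can be sketched as a nested structure in which smaller subject areas sit inside larger ones. The subject area names below are hypothetical examples, and the `outline` function is an assumed helper for this illustration only; a real subject area list comes from agreement among enterprise stakeholders.

```python
# Hypothetical subject areas; each key maps to its nested, smaller subject areas.
SUBJECT_AREAS = {
    "Party": {"Customer": {}, "Supplier": {}},
    "Product": {},
    "Agreement": {"Order": {}, "Contract": {}},
}


def outline(areas, depth=0):
    """Return the subject areas as indented outline lines, one per subject area."""
    lines = []
    for name, children in areas.items():
        lines.append("  " * depth + name)
        lines.extend(outline(children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(SUBJECT_AREAS)))
```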

Figure 4.3 Enterprise Data Model Layers

The selection and naming of the enterprise’s essential subject areas is critically important to the success of the entire enterprise data model. The list of enterprise subject areas becomes one of the most significant enterprise taxonomies. Organize other layers within the enterprise data model by subject area. Subject area-oriented iterations will organize the scope and priority of further incremental model development. The subject area model is “right” when it is both acceptable across all enterprise stakeholders and constituents, and useful in a practical sense as the organizing construct for data governance, data stewardship, and further enterprise data modeling.

Subject areas typically share the same name as a central business entity. Some subject areas align closely with very high-level business functions that focus on managing the information about the core business entity. Other subject areas revolve around a super-type business entity and its family of sub-types. Each subject area should have a short, one or two word name and a brief definition.

Subject areas are also important tools for data stewardship and governance. They define the scope of responsibilities for subject area-oriented data stewardship teams.

4.2.2.2.3 The Conceptual Data Model

The next lower level of the enterprise data model is the set of conceptual data model diagrams for each subject area. A conceptual data model defines business entities and the relationships between these business entities.

Business entities are the primary organizational structures in a conceptual data model. Business entities are the concepts and classes of things, people, and places that are familiar and of interest to the enterprise. The business needs data about these entities. Business entities are not named in IT language; they are named using business terms. A single example of a business entity is an instance. Keep data about instances of business entities, and make them easily recognizable.

Many business entities will appear within the scope of several subject areas. The scope boundaries of subject areas normally overlap, with some business entities included in more than one subject area. For data governance and stewardship purposes, every business entity should have one primary subject area which ‘owns’ the master version of that entity.
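The ownership rule above lends itself to a simple consistency check. The sketch below assumes a hypothetical representation in which each entity lists its subject areas with a flag marking the primary one; the entity names, the data layout, and the function `ownership_problems` are all illustrative assumptions.

```python
# Hypothetical assignments: entity -> list of (subject_area, is_primary) pairs.
ASSIGNMENTS = {
    "Customer": [("Party", True), ("Agreement", False)],
    "Order": [("Agreement", True), ("Product", False)],
    "Product": [("Product", True), ("Agreement", True)],  # two primaries
    "Supplier": [("Party", False)],                       # no primary
}


def ownership_problems(assignments):
    """Return entities that do not have exactly one primary subject area.

    The result maps each offending entity to the primary areas it claims,
    so an empty list means no owner and a longer list means competing owners.
    """
    problems = {}
    for entity, areas in assignments.items():
        primaries = [area for area, is_primary in areas if is_primary]
        if len(primaries) != 1:
            problems[entity] = primaries
    return problems


if __name__ == "__main__":
    for entity, primaries in ownership_problems(ASSIGNMENTS).items():
        print(entity, "->", primaries or "no primary subject area")
```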

Conceptual data model diagrams do not depict the data attributes of business entities. Conceptual data models may include many-to-many business relationships between entities. Since there are no attributes shown, conceptual data models do not attempt to normalize data.

The enterprise conceptual data model must include a glossary containing the business definitions and other meta-data associated with all business entities and their relationships. Other meta-data might include entity synonyms, instance examples, and security classifications.

A conceptual data model can foster improved business understanding and semantic reconciliation. It can serve as the framework for developing integrated information systems to support both transactional processing and business intelligence. It depicts how the enterprise sees information. See Chapter 5 for more about conceptual data modeling.

4.2.2.2.4 Enterprise Logical Data Models

Some enterprise data models also include logical data model diagrams for each subject area, adding a level of detail below the conceptual data model by depicting the essential data attributes for each entity. The enterprise logical data model identifies the data needed about each instance of a business entity. The essential data attributes included in such an enterprise data model represent common data requirements and standardized definitions for widely shared data attributes. Essential data attributes are those data attributes without which the enterprise cannot function. Determining which data attributes to include in the enterprise data model is a very subjective decision.

The enterprise logical data model diagrams continue to reflect an enterprise perspective. They are neutral and independent from any particular need, usage, and application context. Other more traditional “solution” logical data models reflect specific usage and application requirements.

Enterprise logical data models are only partially attributed. No enterprise logical data model can identify all possible data entities and data attributes. Enterprise logical data models may be normalized to some extent, but need not be as normalized as “solution” logical data models.

Enterprise logical data models should include a glossary of all business definitions and other associated meta-data about business entities and their data attributes, including data attribute domains. See Chapter 5 on Data Development for more about logical data modeling.
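
A minimal sketch of a partially attributed enterprise view, assuming a hypothetical "Customer" entity, is shown below. Each essential data attribute carries a definition and, optionally, a domain of allowed values; the attribute names, the domain values, and the `domain_violations` helper are illustrative assumptions rather than part of any standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EssentialAttribute:
    """An essential data attribute with its glossary definition and domain."""
    name: str
    definition: str
    domain: Optional[frozenset] = None   # None means no value set is defined here


# Hypothetical, deliberately incomplete attribute list for "Customer".
CUSTOMER_ATTRIBUTES = (
    EssentialAttribute("Customer Name", "Legal or preferred name of the customer."),
    EssentialAttribute("Customer Status", "Current lifecycle state of the customer.",
                       frozenset({"Prospect", "Active", "Inactive"})),
)


def domain_violations(attributes, instance):
    """List attribute names whose value is missing or outside its domain."""
    problems = []
    for attr in attributes:
        value = instance.get(attr.name)
        if value is None:
            problems.append(attr.name)
        elif attr.domain is not None and value not in attr.domain:
            problems.append(attr.name)
    return problems


print(domain_violations(CUSTOMER_ATTRIBUTES,
                        {"Customer Name": "Acme", "Customer Status": "Closed"}))
```

A solution-level logical model would typically add many more attributes and constraints; the enterprise view only standardizes the widely shared ones.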

4.2.2.2.5 Other Enterprise Data Model Components

Some enterprise data models also include other components. These optional components might include:

  • Individual data steward responsibility assignments for subject areas, entities, attributes, and / or reference data value sets. Chapter 3 on Data Governance covers this topic in more depth.
  • Valid reference data values: controlled value sets for codes and / or labels and their business meaning. These enterprise-wide value sets are sometimes cross-referenced with departmental, divisional, or regional equivalents. Chapter 8 on Reference and Master Data Management covers this topic in more depth.
  • Additional data quality specifications and rules for essential data attributes, such as accuracy / precision requirements, currency (timeliness), integrity rules, nullability, formatting, match / merge rules, and / or audit requirements. Chapter 12 on Data Quality Management covers this topic in more depth.
  • Entity life cycles are state transition diagrams depicting the different lifecycle states of the most important entities and the trigger events that change an entity from one state to another. Entity life cycles are very useful in determining a rational set of status values (codes and / or labels) for a business entity. Section 4.2.2.5 expands on this topic.
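
The entity life cycle idea in the last item can be sketched as a small state transition table. The "Order" states and trigger events below are hypothetical; the point is only that the set of status values falls out of the transitions once they are written down.

```python
# Hypothetical life cycle for an "Order" entity:
# (current state, trigger event) -> next state.
TRANSITIONS = {
    ("Draft", "submit"): "Submitted",
    ("Submitted", "approve"): "Approved",
    ("Submitted", "reject"): "Draft",
    ("Approved", "ship"): "Shipped",
    ("Shipped", "close"): "Closed",
}


def next_state(state, event):
    """Apply a trigger event, rejecting transitions the life cycle does not define."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state!r}") from None


def status_values(transitions):
    """Derive the status value set implied by the life cycle."""
    states = {src for src, _ in transitions} | set(transitions.values())
    return sorted(states)


print(status_values(TRANSITIONS))
```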

4.2.2.3 Analyze and Align with Other Business Models

Information value-chain analysis maps the relationships between enterprise model elements and other business models. The term derives from the concept of the business value chain, introduced by Michael Porter in several books and articles on business strategy. The business value chain identifies the functions of an organization that contribute directly and indirectly to the organization’s ultimate purpose, such as commercial profit, education, etc., and arranges the directly contributing functions from left to right in a diagram based on their dependencies and event sequence. Indirect support functions appear below this arrangement. The diagram in Figure 4.4 depicts a business value chain for an insurance company.

Information value-chain matrices are composite models. While information value-chain analysis is an output of data architecture, each matrix also belongs to one of the business process, organization, or application architectures. In this regard, information value-chain analysis is the glue binding together the various forms of “primitive models” in enterprise architecture. Data architects, data stewards, and other enterprise architects and subject matter experts share responsibility for each matrix’s content.

4.2.2.4 Define and Maintain the Data Technology Architecture

Data technology architecture guides the selection and integration of data-related technology. It is part of both the enterprise’s overall technology architecture and its data architecture. Data technology architecture defines standard tool categories, preferred tools in each category, and technology standards and protocols for technology integration.

Figure 4.4 Example Insurance Business Value Chain

Technology categories in the data technology architecture include:

  • Database management systems (DBMS).
  • Database management utilities.
  • Data modeling and model management tools.
  • Business intelligence software for reporting and analysis.
  • Extract-transform-load (ETL), changed data capture (CDC), and other data integration tools.
  • Data quality analysis and data cleansing tools.
  • Meta-data management software, including meta-data repositories.

Technology architecture components are included in different categories:

  • Current: Products currently supported and used.
  • Deployment Period: Products deployed for use in the next 1-2 years.
  • Strategic Period: Products expected to be available for use in the next 2+ years.
  • Retirement: Products the organization has retired or intends to retire this year.
  • Preferred: Products preferred for use by most applications.
  • Containment: Products limited to use by certain applications.
  • Emerging: Products being researched and piloted for possible future deployment.
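
One way to keep such a classification queryable is a simple product catalog, sketched below. The product names are invented, and two things are assumptions of this sketch rather than statements from the text: that a product may carry more than one category at once, and that new projects are limited to products that are both Current and Preferred.

```python
from enum import Enum


class Category(Enum):
    CURRENT = "Current"
    DEPLOYMENT_PERIOD = "Deployment Period"
    STRATEGIC_PERIOD = "Strategic Period"
    RETIREMENT = "Retirement"
    PREFERRED = "Preferred"
    CONTAINMENT = "Containment"
    EMERGING = "Emerging"


# Hypothetical product names mapped to their categories.
CATALOG = {
    "ExampleDB 12": {Category.CURRENT, Category.PREFERRED},
    "LegacyDB 4": {Category.CURRENT, Category.CONTAINMENT},
    "OldETL 2": {Category.RETIREMENT},
    "NewGraphStore": {Category.EMERGING},
}


def products_in(catalog, category):
    """Return the products classified under one category."""
    return sorted(name for name, cats in catalog.items() if category in cats)


def allowed_for_new_projects(catalog):
    """Assumed policy: only products that are both Current and Preferred."""
    required = {Category.CURRENT, Category.PREFERRED}
    return sorted(name for name, cats in catalog.items() if required <= cats)


print(products_in(CATALOG, Category.CURRENT))
print(allowed_for_new_projects(CATALOG))
```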

See Chapter 6 for more about managing data technologies.

4.2.2.5 Define and Maintain the Data Integration Architecture

Data integration architecture defines how data flows through all systems from beginning to end. Data integration architecture is both data architecture and application architecture, because it includes both databases and the applications that control the data flow into the system, between databases, and back out of the system. Data lineage and data flows are also names for this concept.
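
Data flows of this kind can be recorded as a directed graph and queried for lineage. The sketch below uses hypothetical system names and a plain list of (source, target) pairs; `upstream` is an illustrative helper that walks the graph backwards from a target.

```python
# Hypothetical systems; each pair reads "data flows from source to target".
FLOWS = [
    ("Order Entry App", "Orders DB"),
    ("Orders DB", "Staging DB"),
    ("CRM App", "Customer DB"),
    ("Customer DB", "Staging DB"),
    ("Staging DB", "Data Warehouse"),
    ("Data Warehouse", "Sales Data Mart"),
]


def upstream(flows, target):
    """Return every system whose data can reach `target` (its lineage)."""
    parents = {}
    for src, dst in flows:
        parents.setdefault(dst, set()).add(src)
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:   # guard against revisiting shared sources or cycles
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)


print(upstream(FLOWS, "Data Warehouse"))
```

The same edge list could be traversed forwards to answer the impact question: which downstream stores are affected when a source changes.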

The relationships between elements in different models are every bit as important as the relationships between elements within a single model. A series of two-dimensional matrices can map and document relationships between different kinds of enterprise model elements. Matrices can define the relationships to other aspects of the enterprise architecture besides business processes, such as:

  • Data related to business roles, depicting which roles have responsibility for creating, updating, deleting, and using data about which business entities (CRUD).
  • Data related to specific business organizations with these responsibilities.
  • Data related to applications that may cross business functions.
  • Data related to locations where local differences occur.

Building such matrices is a long-standing practice in enterprise modeling. IBM, in its Business Systems Planning (BSP) method, first introduced this practice. James Martin later popularized it in his Information Systems Planning (ISP) method. The practice is still valid and useful today.
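
A two-dimensional entity / role matrix of the kind described above can be sketched as a dictionary keyed by (business entity, role), with the permitted operations stored as a subset of the letters "CRUD" (R standing for use, or read). The entities, roles, and helper functions are hypothetical.

```python
# Hypothetical entity / role matrix; values are subsets of "CRUD".
CRUD_MATRIX = {
    ("Customer", "Sales Representative"): "CRU",
    ("Customer", "Billing Clerk"): "R",
    ("Invoice", "Billing Clerk"): "CRUD",
    ("Invoice", "Sales Representative"): "R",
    ("Product", "Sales Representative"): "R",
}


def roles_that(matrix, entity, operation):
    """Return the roles allowed to perform one operation on an entity."""
    return sorted(role for (ent, role), ops in matrix.items()
                  if ent == entity and operation in ops)


def entities_without_creator(matrix):
    """Flag entities that no role creates, a gap the matrix makes visible."""
    entities = {ent for ent, _ in matrix}
    created = {ent for (ent, _), ops in matrix.items() if "C" in ops}
    return sorted(entities - created)


print(roles_that(CRUD_MATRIX, "Customer", "U"))
print(entities_without_creator(CRUD_MATRIX))
```

The same structure works for entity / function, entity / organization, or entity / application matrices by changing what the second key represents.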

The corporate information factory (CIF) concept is an example of data integration architecture. Data integration architecture generally divides data warehouses, staging databases, and data marts supporting business intelligence from the source databases, operational data stores (ODS), master data management, and reference data / code management systems supporting online transaction processing and operational reporting. Chapter 8 on Reference and Master Data Management covers data integration architecture for reference and master data.

Data / process relationship matrices can have different levels of detail. Subject areas, business entities, or even essential data attributes can all represent data at different levels. High-level functions, mid-level activities, or low-level tasks all represent business processes.

4.2.2.6 Define and Maintain the DW / BI Architecture

Data warehouse architecture focuses on how data changes and snapshots are stored in data warehouse systems for maximum usefulness and performance. Data integration architecture shows how data moves from source systems through staging databases into data warehouses and data marts. Business intelligence architecture defines how decision support makes data available, including the selection and use of business intelligence tools. This topic is discussed in more detail in Chapter 9 on Data Warehousing and Business Intelligence Management.

4.2.2.7 Define and Maintain Enterprise Taxonomies and Namespaces

Taxonomy is the hierarchical structure used for outlining topics. The best-known example of taxonomy is the classification system for all living things developed originally by the biologist Linnaeus. The Dewey Decimal System is an example of a taxonomy for organizing and finding books in a library. Formal taxonomies are class hierarchies, while informal taxonomies are topical outlines that may not imply inheritance of characteristics from super-types.
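
The distinction between a formal class hierarchy and a topical outline can be illustrated with a small sketch. The terms and characteristics below are hypothetical; in a formal taxonomy, a term inherits the characteristics of every super-type above it, which is what `inherited_characteristics` computes.

```python
# Hypothetical formal taxonomy: each term maps to its parent (None for the root).
PARENT = {
    "Animal": None,
    "Mammal": "Animal",
    "Bird": "Animal",
    "Dog": "Mammal",
}

# Characteristics declared directly on each class.
CHARACTERISTICS = {
    "Animal": {"is alive"},
    "Mammal": {"nurses young"},
    "Bird": {"has feathers"},
    "Dog": {"barks"},
}


def ancestors(term):
    """Walk from a term up to the root of the hierarchy."""
    path = []
    parent = PARENT[term]
    while parent is not None:
        path.append(parent)
        parent = PARENT[parent]
    return path


def inherited_characteristics(term):
    """Combine a term's own characteristics with those of all its super-types."""
    result = set(CHARACTERISTICS.get(term, ()))
    for ancestor in ancestors(term):
        result |= CHARACTERISTICS.get(ancestor, set())
    return result


print(ancestors("Dog"))
print(sorted(inherited_characteristics("Dog")))
```

An informal taxonomy could reuse the `PARENT` mapping for navigation alone, without treating the parent link as implying inheritance.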

Organizations develop their own taxonomies to organize collective thinking about topics. Taxonomies have proven particularly important in presenting and finding information on websites. Overall enterprise data architecture includes organizational taxonomies. The definition of terms used in such taxonomies should be consistent with the enterprise data model, as well as other models and ontologies.

4.2.2.8 Define and Maintain the Meta-data Architecture

Just as the data integration architecture defines how data flows across applications, the meta-data architecture defines the managed flow of meta-data. It defines how meta-data is created, integrated, controlled, and accessed. The meta-data repository is the core of any meta-data architecture. Meta-data architecture is the design for integration of meta-data across software tools, repositories, directories, glossaries, and data dictionaries. The focus of data integration architecture is to ensure the quality, integration, and effective use of reference, master, and business intelligence data. The focus of meta-data architecture is to ensure the quality, integration, and effective use of meta-data. Chapter 11 on Meta-data Management covers this topic in more detail.

4.3 Summary

Defining and maintaining data architecture is a collaborative effort requiring the active participation of data stewards and other subject matter experts, facilitated and supported by data architects and other data analysts. Data architects and analysts must work to optimize the highly valued time contributed by data stewards. The Data Management Executive must secure adequate commitment of time from the right people. Securing this commitment usually requires continual communication of the business case for data architecture and the effort required to define it.

Data architecture is a living thing that is never complete nor static. Business changes naturally drive changes to data architecture. Maintaining data architecture requires regular periodic review by data stewards. Reference to existing data architecture and relatively easy updates to data architecture can resolve many issues quickly. More significant issue resolution often requires new projects to be proposed, evaluated, approved, and performed. The outputs of these projects include updates to data architecture.

The value of data architecture is limited until data stewards participate, review, and refine data architecture, and management approves data architecture as a guiding force for systems implementation. The Data Governance Council is the ultimate sponsor and approving body for enterprise data architecture. Many organizations also form an Enterprise Architecture Council to coordinate data, process, business, system, and technology architecture.

Data architecture is just one part of overall enterprise architecture. Data architecture serves as a guide for integration. Refer to the data architecture when:

  • Defining and evaluating new information systems projects: The enterprise data architecture serves as a zoning plan for long-term integration of information systems. The enterprise data architecture affects the goals and objectives of projects, and influences the priority of the projects in the project portfolio. Enterprise data architecture also influences the scope boundaries of projects and system releases.
  • Defining project data requirements: The enterprise data architecture provides enterprise data requirements for individual projects, accelerating the identification and definition of these requirements.
  • Reviewing project data designs: Design reviews ensure that conceptual, logical, and physical data models are consistent with and contribute to the long-term implementation of the enterprise data architecture.

4.3.1 Guiding Principles

The implementation of the data architecture management function into an organization follows eight guiding principles:

  1. Data architecture is an integrated set of specification artifacts (master blueprints) used to define data requirements, guide data integration, control data assets, and align data investments with business strategy.
  2. Enterprise data architecture is part of the overall enterprise architecture, along with process architecture, business architecture, systems architecture, and technology architecture.
  3. Enterprise data architecture includes three major categories of specifications: the enterprise data model, information value chain analysis, and data delivery architecture.
  4. Enterprise data architecture is about more than just data. It helps establish the semantics of an enterprise, using a common business vocabulary.
  5. An enterprise data model is an integrated subject-oriented data model defining the essential data used across an entire organization. Build an enterprise data model in layers: a subject area overview, conceptual views of entities and relationships for each subject area, and more detailed, partially attributed views of these same subject areas.
  6. Information value-chain analysis defines the critical relationships between data, processes, roles and organizations, and other enterprise elements.
  7. Data delivery architecture defines the master blueprint for how data flows across databases and applications. This ensures data quality and integrity to support both transactional business processes and business intelligence reporting and analysis.
  8. Architectural frameworks like TOGAF and The Zachman Framework help organize collective thinking about architecture. This allows different people with different objectives and perspectives to work together to meet common interests.

4.3.2 Process Summary

The process summary for the data architecture management function is shown in Table 4.1. The deliverables, responsible roles, approving roles, and contributing roles are shown for each activity in the architecture management function. The Table is also shown in Appendix A9.

| Activities | Deliverables | Responsible Roles | Approving Roles | Contributing Roles |
|---|---|---|---|---|
| 2.1 Understand Enterprise Information Needs (P) | Lists of essential information requirements | Enterprise Data Architect, Business SMEs | Data Governance Council, Data Architecture Steering Committee, DM Executive, CIO | |
| 2.2 Develop and Maintain the Enterprise Data Model (P) | Enterprise Data Model: Subject Area Model; Conceptual Model; Logical Model; Glossary | Enterprise Data Architect | Data Governance Council, Data Architecture Steering Committee, DM Executive, CIO | Data Architects, Data Stewards / Teams |
| 2.3 Analyze and Align With Other Business Models (P) | Information Value Chain Analysis Matrices: Entity / Function; Entity / Org and Role; Entity / Application | Enterprise Data Architect | Data Governance Council, Data Architecture Steering Committee, DM Executive, CIO | Data Architects, Data Stewards / Teams, Enterprise Architects |
| 2.4 Define and Maintain the Data Technology Architecture (P) | Data Technology Architecture (Technology, Distribution, Usage) | Enterprise Data Architect | DM Executive, CIO, Data Architecture Steering Committee, Data Governance Council | Database Administrators, Other Data Management Professionals |
| 2.5 Define and Maintain the Data Integration Architecture (P) | Data Integration Architecture: Data Lineage / Flows; Entity Lifecycles | Enterprise Data Architect | DM Executive, CIO, Data Architecture Steering Committee, Data Governance Council | Database Administrators, Data Integration Specialists, Other Data Management Professionals |
| 2.6 Define and Maintain the Data Warehouse / BI Architecture (P) | Data Warehouse / Business Intelligence Architecture | Data Warehouse Architect | Enterprise Data Architect, DM Executive, CIO, Data Architecture Steering Committee, Data Governance Council | Business Intelligence Specialists, Data Integration Specialists, Database Administrators, Other Data Management Professionals |
| 2.7 Define and Maintain Enterprise Taxonomies and Namespaces | Enterprise Taxonomies, XML Namespaces, Content Management Standards | Enterprise Data Architect | DM Executive, CIO, Data Architecture Steering Committee, Data Governance Council | Other Data Architects, Other Data Management Professionals |
| 2.8 Define and Maintain the Meta-data Architecture (P) | Meta-data Architecture | Meta-data Architect | Enterprise Data Architect, DM Executive, CIO, Data Architecture Steering Committee, Data Governance Council | Meta-data Specialists, Other Data Management Professionals |

Table 4.1 Data Architecture Management Process Summary

4.3.3 Organizational and Cultural Issues

Q1: Are there any ramifications to implementing an enterprise data architecture?

A1: Implementation of enterprise data architecture can have many ramifications to an organization. First, everyone in the organization has to see the value of the overall data architecture. There will be some discovery of redundant systems and processes that may require changes to roles and responsibilities of some organization teams and individuals, so take care to discourage fear of workforce reduction. People who have been working on redundant systems become free to do interesting work on other systems. Second, everyone in the organization has to be committed to making sure that the data architecture remains current when the business needs or technology landscape change.

Implementation of an enterprise data architecture can have many ramifications to an organization’s culture. Application-centric IT shops will have to make changes to their culture to become more data-aware, and pay more attention to what is moving through their applications, rather than just to what the application does. Data awareness is a way of making IT more knowledgeable about business needs and practices, so IT then becomes more of a partner with the business, rather than just a service provider.

4.4 Recommended Reading

The references listed below provide additional reading that supports the material presented in Chapter 4. These recommended readings are also included in the Bibliography at the end of the Guide.

4.4.1 Books

Bernard, Scott A. An Introduction to Enterprise Architecture, 2nd Edition. Authorhouse, 2005. ISBN 1-420-88050-0. 351 pages.

Brackett, Michael. Data Sharing Using A Common Data Architecture. New York: John Wiley & Sons, 1994. ISBN 0-471-30993-1. 478 pages.

Carbone, Jane. IT Architecture Toolkit. Prentice Hall, 2004. ISBN 0-131-47379-4. 256 pages.

Cook, Melissa. Building Enterprise Information Architectures: Re-Engineering Information Systems. Prentice Hall, 1996. ISBN 0-134-40256-1. 224 pages.

Hagan, Paula J., ed. EABOK: Guide to the (Evolving) Enterprise Architecture Body of Knowledge. MITRE Corporation, 2004. 141 pages. A U.S. federally-funded guide to enterprise architecture in the context of legislative and strategic requirements. Available for free download at http://www.mitre.org/work/tech_papers/tech_papers_04/04_0104/04_0104.pdf

Inmon, W. H., John A. Zachman, and Jonathan G. Geiger. Data Stores, Data Warehousing and the Zachman Framework: Managing Enterprise Knowledge. McGraw-Hill, 1997. ISBN 0-070-31429-2. 358 pages.

Lankhorst, Marc. Enterprise Architecture at Work: Modeling, Communication and Analysis. Springer, 2005. ISBN 3-540-24371-2. 334 pages.

Martin, James and Joe Leben. Strategic Data Planning Methodologies, 2nd Edition. Prentice Hall, 1989. ISBN 0-13-850538-1. 328 pages.

Perks, Col and Tony Beveridge. Guide to Enterprise IT Architecture. Springer, 2002. ISBN 0-387-95132-6. 480 pages.

Ross, Jeanne W., Peter Weill, and David Robertson. Enterprise Architecture As Strategy: Creating a Foundation For Business Execution. Harvard Business School Press, 2006. ISBN 1-591-39839-8. 288 pages.

Schekkerman, Jaap. How to Survive in the Jungle of Enterprise Architecture Frameworks: Creating or Choosing an Enterprise Architecture Framework. Trafford, 2006. ISBN 1-412-01607-X. 224 pages.

Spewak, Steven and Steven C. Hill. Enterprise Architecture Planning. John Wiley & Sons - QED, 1993. ISBN 0-471-59985-9. 367 pages.

The Open Group. TOGAF: The Open Group Architecture Framework, Version 8.1 Enterprise Edition. The Open Group (www.opengroup.org). ISBN 1-93-16245-6. 491 pages.

Zachman, John A. The Zachman Framework: A Primer for Enterprise Engineering and Manufacturing. Metadata Systems Software Inc., Toronto, Canada. eBook available only in electronic form from www.ZachmanInternational.com.

4.4.2 Articles and Websites

Zachman, John. “A Concise Definition of the Enterprise Framework.” Zachman International, 2008. Article in electronic form available for free download at http://www.zachmaninternational.com/index.php/home-article/13#thezf.

Zachman, John A. “A Framework for Information Systems Architecture”, IBM Systems Journal, Vol. 26 No. 3 1987, pages 276 to 292. IBM Publication G321-5298. Also available in a special issue of the IBM Systems Journal, “Turning Points in Computing: 1962-1999”, IBM Publication G321-0135, pages 454 to 470, http://researchweb.watson.ibm.com/journal/sj/382/zachman.pdf.

Zachman, John A. and John F. Sowa. “Extending and Formalizing the Framework for Information Systems Architecture”, IBM Systems Journal, Vol. 31 No. 3 1992, pages 590 to 616. IBM Publication G321-5488.
