Table of Contents

Chapter 1.1: An Introduction to Data Architecture

Subdividing Data

Repetitive/Nonrepetitive Unstructured Data

The Great Divide of Data

Textual/Nontextual Data

The Different Forms of Data

Chapter 1.2: The Data Infrastructure

Two Types of Repetitive Data

Repetitive Structured Data

Repetitive Big Data

The Two Infrastructures

What's Being Optimized?

Comparing the Two Infrastructures

Chapter 1.3: The “Great Divide”

Classifying Corporate Data

The “Great Divide”

Repetitive Unstructured Data

Nonrepetitive Unstructured Data

Different Worlds

Chapter 1.4: Demographics of Corporate Data

Chapter 1.5: Corporate Data Analysis

Chapter 1.6: The Life Cycle of Data: Understanding Data Over Time

Chapter 1.7: A Brief History of Data

Paper Tape and Punch Cards

Data Base Management System (DBMS)

Coupled Processors

Online Transaction Processing

Parallel Data Management

The Great Divide

Chapter 2.1: The End-State Architecture—The “World Map”

Architectural Components

Different Kinds of Data in the End State Architecture

Shaping the Data Through Models

Where Is the Data Warehouse?

Where Different Types of Questions Are Answered Across the End State Architecture

Data in the Data Lake

Metadata in the End State Architecture

Networked Metadata

An Evolutionary Experience

The Data Lake Architecture

Chapter 3.1: Transformations in the End-State Architecture

Transformations

Customizing Data

Transforming Text

Transforming Application Data

Transforming Data Into a Customized State

Transforming Data Into Bulk Storage

Transforming Data Generated Automatically

Transforming Bulk Data

Transformation and Redundancy

Chapter 4.1: A Brief History of Big Data

An Analogy—Taking the High Ground

Taking the High Ground

Standardization With the 360

Online Transaction Processing

Enter Teradata and MPP Processing

Then Came Hadoop and Big Data

Holding the High Ground

Chapter 4.2: What Is Big Data?

Another Definition

Inexpensive Storage

The Roman Census Approach

Unstructured Data

Data in Big Data

Context in Repetitive Data

Nonrepetitive Data

Context in Nonrepetitive Data

Chapter 4.3: Parallel Processing

Chapter 4.4: Unstructured Data

Textual Information—Everywhere

Decisions Based on Structured Data

The Business Value Proposition

Repetitive and Nonrepetitive Unstructured Information

Ease of Analysis

Contextualization

Some Approaches to Contextualization

Manual Analysis

Chapter 4.5: Contextualizing Repetitive Unstructured Data

Parsing Repetitive Unstructured Data

Recasting the Output Data

Chapter 4.6: Textual Disambiguation

From Narrative Into an Analytical Data Base

Input Into Textual Disambiguation

Document Fracturing/Named Value Processing

Preprocessing a Document

E-mails—A Special Case

Report Decompilation

Chapter 4.7: Taxonomies

Data Models/Taxonomies

Applicability of Taxonomies

What Is a Taxonomy?

Taxonomies in Multiple Languages

Commercial or Private Taxonomies?

Dynamics of Taxonomies and Textual Disambiguation

Taxonomies and Textual Disambiguation—Separate Technologies

Different Types of Taxonomies

Taxonomies—Maintenance Over Time

Chapter 5.1: The Siloed Application Environment

The Challenge of Siloed Applications

Building Siloed Applications

What Does a Siloed Application Look Like?

Current Valued Data

Minimal Historical Data

High Availability

Overlap Between Siloed Applications

Frozen Business Requirements

Dismantling Siloed Applications

Chapter 6.1: Introduction to Data Vault 2.0

Data Vault Origins and Background

What Is Data Vault 2.0 Modeling?

How Is Data Vault 2.0 Methodology Defined?

Why Do We Need a Data Vault 2.0 Architecture?

Where Does Data Vault 2.0 Implementation Fit?

What Are the Business Benefits of Data Vault 2.0?

What Is Data Vault 1.0?

Chapter 6.2: Introduction to Data Vault Modeling

What Is a Data Vault Model Concept?

Data Vault Model Defined

Components of a Data Vault Model

What Makes Business Keys So Interesting?

What Does This Have to Do With Data Vault and Data Warehousing?

How Does This Translate to Data Vault Modeling?

Why Restructure the Data From the Staging Area?

What Are the Basic Rules of the Data Vault Model?

Why Do We Need Many to Many Link Structures?

Primary Key Options for Data Vault 2.0

Chapter 6.3: Introduction to Data Vault Architecture

What Is a Data Vault 2.0 Architecture?

How Does NoSQL Fit in to the Architecture?

What Are the Objectives of the Data Vault 2.0 Architecture?

What Is the Objective of the Data Vault 2.0 Model?

What Are Hard and Soft Business Rules?

How Does Managed Self Service BI Fit in the Architecture?

Chapter 6.4: Introduction to Data Vault Methodology

Data Vault 2.0 Methodology Overview

How Does CMMI Contribute to the Methodology?

If CMMI Is So Great, Why Should We Care About Agility Then?

Why Include PMP, SDLC If CMMI and Agile Should Be All That's Needed?

So Then, What Does Six Sigma Contribute to the Data Vault 2 Methodology?

Where Does TQM (Total Quality Management) Fit in to All of This?

Chapter 6.5: Introduction to Data Vault Implementation

Implementation Overview

What's So Important About Patterns?

Why Does Reengineering Happen Because of Big Data?

Why Do We Need to Virtualize Our Data Marts?

What Is Managed Self-Service BI?

Chapter 7.1: The Operational Environment: A Short History

Commercial Uses of the Computer

The First Applications

Ed Yourdon and the Structured Revolution

Disk Technology

Response Time and Availability

Corporate Computing Today

Chapter 7.2: The Standard Work Unit

Elements of Response Time

An Hourglass Analogy

The Racetrack Analogy

Your Vehicle Runs as Fast as the Vehicle in Front of It

The Standard Work Unit

Chapter 7.3: Data Modeling for the Structured Environment

The Purpose of the Roadmap

Granular Data Only

Physical Data Base Design

Relating the Different Levels of the Data Model

An Example of the Linkage

Generic Data Models

Operational Data Models/Data Warehouse Data Models

Chapter 8.1: A Brief History of Data Architecture

Chapter 8.2: Big Data/Existing System Interface

The Big Data/Existing Systems Interface

The Repetitive Raw Big Data/Existing Systems Interface

Exception Based Data

The Nonrepetitive Raw Big Data/Existing Systems Interface

Into the Existing Systems Environment

The “Context Enriched” Big Data Environment

Analyzing Structured Data/Unstructured Data Together

Chapter 8.3: The Data Warehouse/Operational Environment Interface

The Operational/Data Warehouse Interface

The Classical ETL Interface

The ODS and the ETL Interface

The Staging Area

Changed Data Capture

Inline Transformation

Chapter 8.4: Data Architecture: A High-Level Perspective

A High Level Perspective

The System of Record

Different Types of Questions

Different Communities

Chapter 9.1: Repetitive Analytics: Some Basics

Different Kinds of Analysis

Looking for Patterns

Heuristic Processing

The “Normal” Profile

Distillation, Filtering

Subsetting Data

Bias of the Sample

Repetitive Data and Context

Linking Repetitive Records

Log Tape Records

Analyzing Points of Data

Chapter 9.2: Analyzing Repetitive Data

Active/Passive Indexing of Data

Summary/Detailed Data

Metadata in Big Data

Chapter 9.3: Repetitive Analysis

Internal, External Data

Universal Identifiers

Filtering, Distillation

Archiving Results

Chapter 10.1: Nonrepetitive Data

Inline Contextualization

Taxonomy/Ontology Processing

Custom Variables

Homographic Resolution

Acronym Resolution

Negation Analysis

Numeric Tagging

Date Standardization

List Processing

Associative Word Processing

Stop Word Processing

Document Metadata

Document Classification

Proximity Analysis

Functional Sequencing Within Textual ETL

Internal Referential Integrity

Preprocessing, Postprocessing

Chapter 10.2: Mapping

Chapter 10.3: Analytics From Nonrepetitive Data

Call Center Information

Medical Records

Chapter 11.1: Operational Analytics: Response Time

Transaction Response Time

Chapter 12.1: Operational Analytics

Different Perspectives of Data

The Operational Data Store—ODS

Chapter 13.1: Personal Analytics

Chapter 14.1: Data Models Across the End-State Architecture

The Different Data Models

Functional Decomposition and Data Flow Diagrams

The Corporate Data Model

The Star Join/Dimensional Data Model

Taxonomies/Ontologies

The Selective Subdivision of Data

Proactive/Reactive Data Models

Chapter 15.1: The System of Record

The End User Cycle of Awareness

The System of Record

The System of Record in the End State Architecture

The Role of Age in the System of Record

A Simple Example

The Flow of Data in the System of Record

Other Data Than the System of Record

Is Data Updated in the System of Record?

Detailed and Summary Data in the System of Record

Auditing Data and the System of Record

Text and the System of Record

Chapter 16.1: Business Value and the End-State Architecture

The Evolution of the End State Architecture

What is Meant by “Business Value”

Tactical Business Value/Strategic Business Value

Volume of Data Versus Business Value

The “Million in One” Syndrome

Where Business Value Occurs

Data Relevancy Over Time

Where Tactical Decisions Are Made

Chapter 17.1: Managing Text

The Challenge of Text

The Challenge of Context

The Processing Components of Textual ETL

Secondary Analysis

Merging Text Based Data and Structured Data

Chapter 18.1: An Introduction to Data Visualizations

Introduction to Data Visualizations—Overview

Purpose and Context

Visualization—A Science and an Art

Visualization Framework

Step 4: Distribute

Data Visualization Tools and Software
