Index
Note: Page numbers followed by f indicate figures and t indicate tables.
A
Acronym resolution, nonrepetitive data
277,
277f
Aggregated data, life cycle of
34
transformations of
62,
63f
Associative word processing
281–282
Atomicity, consistency, isolation, and durability (ACID) compliance
158
B
existing system interface
211,
212f
into existing systems environment
215,
216f
structured data/unstructured data analysis
217–218
online transaction processing
69–70
Roman census approach
74–75
Teradata and MPP processing
70
unstructured data
75–76
Bulk data, transformations of
65,
66f
Bulk data warehouse
47,
50,
52
Business concept model, DV2
source system sequence-driven
153–155
Business requirement meetings, recording of
165
proposition, unstructured data
90–91,
90f
C
Capability Maturity Model Integration (CMMI)
163–167
Capture/edit process, data
33
Classical system development life cycle (SDLC) processing
234–235,
234f
Commercial taxonomies
115
Computer, commercial uses of
177,
178f
Conditional architecture
172
of repetitive unstructured data
99–100
Continental divide, data architecture
4f
diverse sources of
27,
28f
key performance indicators
31
nonrepetitive unstructured data
24,
24f
potentially business-relevant records
24
ratios of repetitive data
21,
22f
Corporate decision-making
331,
334
Corporate information infrastructure
34
Corporate information systems
33
Customer account number (CAN)
141
Customize data, transformations of
61,
61f
Custom variables resolution, nonrepetitive data
274–275
D
automated generation of
48
degradation of integrity of
37,
37f
in end-state data architecture
48–50,
49f
online transaction processing
42,
42f
paper tape and punch cards
39,
39f
parallel data management
43,
43f
standard/universal measurements of
262
into customized state
63,
64f
database management systems
203
high-level perspective
225
different communities
229
internal formatting of data
204,
205f
logical organization of data
202,
202f
nonrepetitive unstructured environment
206,
206f
online database environment
208,
208f
parent-child relationship and networked relationship
203,
203f
relational database management system
203,
204f
repetitive/nonrepetitive unstructured data
2–3
Data communications (DC)
69
different infrastructures
10
repetitive structured data
7–8
operational/data warehouse
198
for structured environment
191
operational/data warehouse data models
198
Data organization, visualizations
387
Data quality, visualizations
387–388
Data sources, visualizations
386–387
Data Vault 1.0 modeling
134
hard and soft business rules
160–161
multipart source business keys
155–156
source system sequence-driven business keys
153–155
many-to-many link structures
147–148
Data warehouse
43,
43f,
47,
50–52,
51f,
208–209,
209f,
321,
343,
344f,
356,
363
operational environment interface
219,
219f
Data warehouse data models
198
Disciplined agile delivery (DAD)
166
E
networked metadata
55,
56f
shaping through models
50,
51f
data into customized state
63,
64f
Enterprise data warehousing (EDW)
139
Enterprise information integration (EII)
161,
175
Entity relationship diagram (ERD)
192–193
F
False-positive correlation
233,
233f
Federated query engines
161
Formal analysis, corporate data analysis
27
Frozen business, requirements of
130,
130f
Functional sequencing, within textual ETL
286,
286f
G
Gartner Group, big data definition
73
corporate data classification
13
nonrepetitive unstructured data
16–18
repetitive unstructured data
14–15
Greenwich mean time (GMT)
262
H
Hadoop distributed file system (HDFS)
158
Hard and soft business rules
160–161
High-level perspective, data architecture
225
different communities
229
Homographic resolution, nonrepetitive data
275–276
I
Inexpensive storage, big data
74
Informal analysis, corporate data analysis
27
Information management system (IMS)
69
Inline contextualization, nonrepetitive data
272–273
Internal referential integrity, nonrepetitive data
287,
287f
International Business Machines (IBM)
37
J
K
corporate data analysis
31
L
Least squares approach
246
List processing, nonrepetitive data
280–281
M
Manual analysis, unstructured data
96–97,
97f
Many-to-many link structures
147–148
repetitive and nonrepetitive records of data
291
variables name selection
292
Massively parallel processing (MPP) approach
70,
83–84,
83f
in end-state data architecture
54,
55f
Metrics, repetitive analysis
267–268
“Million in one” syndrome
366
Multipart source business keys
155–156
Multiple processors
41–42
N
Narrative data, classification of
112–113
Narrative information
304
Natural language processing (NLP)
17,
95,
374
Negation analysis, nonrepetitive data
277–278
Networked metadata
55,
56f
associative word processing
281–282
functional sequencing within textual ETL
286,
286f
internal referential integrity
287,
287f
preprocessing and postprocessing
287–289
taxonomy/ontology processing
273–274
Nonrepetitive nontextual data
Nonrepetitive records of data
291
business relevancy of
24,
24f
NoSQL platform, Data Vault 2.0 architecture
158–159
Numeric tagging, nonrepetitive data
278–279
O
Online database environment
208,
208f
Online real-time system
42,
43f
processing, nonrepetitive data
273–274
Open-ended continuous analysis
231,
232f
Operational analytics
319
Operational data models
198
Operational environment
177
commercial uses of computer
177,
178f
Ed Yourdon and structured revolution
178–179
interface, data warehouse
219,
219f
Optical character recognition (OCR) software
27–28
P
Parallel data management
43,
43f
in Roman census approach
81,
82f,
84
big data handling
81,
82f
of repetitive data
85,
85f
Parent-child relationship
203,
203f
of repetitive unstructured data
99,
100f
Performance, operational environment
309,
310f
Personal decision-making
331,
334
Postprocessing, nonrepetitive data
287–289
Potentially business-relevant records
24
Preprocessing, nonrepetitive data
287–289
Probabilistic linkages
260
Project management professional (PMP)
167–168
Proximity analysis, nonrepetitive data
285–286
Q
Queue time, transaction response time
313,
313f
R
high-level perspective, data architecture
225–226
transformations and
66,
66f
Reengineering, DV2 implementation
172–174
Relational database management system
203,
204f
Relational model, operational analytics
322,
322f
filtering and distillation processing
265–266
internal, external data
261
universal identifiers
262
repetitive data and context
243–244
active/passive indexing of data
255–256
application-specific nature of
87,
87f
contextual data on
86,
86f
Repetitive structured data
7–8
Repetitive unstructured information
91,
91f
Roman census approach
74–75
S
Scale-free network design
142
Service-level agreement (SLA)
188–189
frozen business requirements
130,
130f
Source system sequence-driven business keys
153–155
Standard data warehouse
52
Standard structured DBMS
11f,
12
Standard work unit (SWU)
185,
188
response time, elements of
185,
186f
Structured approach, data architecture
204,
205f
merging text-based data and
378–379
repetitive structured data
7–8
System development life cycle (SDLC)
179,
179f
detailed and summary data
359,
359f
end-state architecture
355
T
and textual disambiguation
Taxonomy processing, nonrepetitive data
273–274
merging text-based and structured data
378–379
Textual disambiguation
16–18,
17–18f,
47,
79,
85–86,
101,
216,
217f,
262,
270,
271f,
374–375,
376f
document fracturing/named value processing
105–106
flow of processing in
271
from narrative to analytical data base
101,
102f
associative word processing
281–282
functional sequencing within
286,
286f
internal referential integrity
287,
287f
preprocessing and postprocessing
287–289
Textual information
89,
89f
Total cost of ownership (TCO)
144
Total quality management (TQM)
169–170
Transaction response time
310,
311f
U
Uniprocessor architecture
41
Universal identifiers
262
Unstructured approach, data architecture
204,
205f
repetitive and nonrepetitive unstructured information
91,
91f
textual information
89,
89f
V
W
Word stemming, nonrepetitive data
283
Y