Index
Symbols
3NF (third normal form) models ch1¶21
4-step dimensional design process ch2¶11, ch3¶5
A
abnormal scenario indicators ch8¶97
abstract generic dimensions ch2¶249
accessibility goals ch1¶303
accidents (insurance case study), factless fact tables ch16¶77
accounting case study ch7¶3
accumulating grain fact tables ch1¶40
accumulating snapshots ch2¶53, ch4¶30, ch6¶89
activity-based costing measures ch6¶60
add mini-dimension and type 1 outrigger (SCD type 5) ch2¶150
add mini-dimension (SCD type 4) ch2¶147
add new attribute (SCD type 3) ch2¶144, ch5¶51, ch5¶62
add new row (SCD type 2) ch2¶140, ch5¶36
addresses
add type 1 attributes to type 2 dimension (SCD type 6) ch2¶153
admissions events (education case study) ch13¶17
aggregate builder, ETL system ch19¶157
aggregated facts
aggregate fact tables ch2¶59
aggregate OLAP cubes ch1¶27, ch2¶59
aggregate tables, ETL system development ch20¶114
agile development ch1¶132
airline case study ch12¶2
aliasing ch6¶14
allocated facts ch2¶188
allocating ch6¶59
allocations, profit and loss fact tables ch2¶191
ALTER TABLE command ch1¶59
analytics
analytic solutions, packaged ch9¶28
AND queries, skill keywords bridge ch9¶59
architecture
artificial keys ch3¶116
ASCII (American Standard Code for Information Interchange) ch8¶26
atomic grain data ch1¶59, ch3¶25
attributes
audit columns, CDC (change data capture) ch19¶43
audit dimensions ch2¶252, ch6¶85
automation, ETL system development
B
backflow, big data and ch21¶35
backups ch19¶305
backup system, ETL systems ch19¶171
banking case study ch10¶4
behavior
behavior tags
BI application design/development (Lifecycle) ch17¶98, ch17¶211
BI applications ch1¶80
BI (business intelligence) delivery interfaces ch19¶27
big data
blobs ch21¶10
boundary crashes, big data and ch21¶38
bridge tables
bubble chart, dimension modeling and ch18¶31
budget fact table ch7¶39
budgeting process ch7¶36
bus architecture ch4¶53
business analyst ch17¶201
Business Dimensional Lifecycle ch17¶5
business-driven governance ch4¶91
business driver ch17¶202
business initiatives ch3¶7
business lead ch17¶203
business motivation, Lifecycle planning ch17¶19
business processes
business representatives, dimensional modeling ch18¶12
business requirements
business rule screens ch19¶69
business sponsor ch17¶18
business users ch17¶204
bus matrix
C
calculation lag ch6¶97
calendar date dimensions ch2¶93
calendars, country-specific as outriggers ch12¶37
cannibalization ch3¶101
cargo shipper schema ch12¶27
case studies
causal dimension ch3¶85, ch10¶200
CDC (change data capture)
centipede fact tables ch2¶176, ch3¶144
change reasons ch9¶15
change tracking ch5¶19, ch5¶23
chart of accounts (G/L) ch7¶9
checkpoints, data quality ch20¶95
CIF (Corporate Information Factory) ch1¶107, ch1¶110
CIO (chief information officer) ch16¶8
claim transactions (insurance case study) ch16¶60
class of service flown dimension (airline case study) ch12¶30
cleaning and conforming, ETL systems ch19¶201
clickstream data ch15¶1
collaborative design workshops ch2¶8
column screens ch19¶67
comments, survey questionnaire (HR) ch9¶74
common dimensions ch4¶76
compliance, ETL system ch19¶11
compliance manager, ETL system ch19¶26
composite keys ch1¶42
computer resources, big data and ch21¶44
conformed dimensions ch2¶109, ch4¶76, ch11¶33
conformed facts ch2¶44, ch4¶14
conforming system, ETL system ch19¶84
consistency
consolidated fact tables ch2¶160
contacts, bridge tables ch8¶68
contribution amount (P&L statement) ch6¶201
correctly weighted reports ch10¶22
cost, activity-based costing measures ch6¶60
COUNT DISTINCT ch8¶49
country-specific calendars as outriggers ch12¶37
course registrations (education case study) ch13¶18
CRM (customer relationship management) ch8¶1
currency, multiple
current date attributes, dimension tables ch3¶56
customer contacts, bridge tables ch8¶68
customer dimension ch6¶23, ch6¶24, ch6¶29, ch6¶34
customer matching ch8¶14
customer relationship management case study. See CRM ch8¶5
D
data architect/modeler ch17¶206
data bags ch21¶9
database administrator ch17¶227
data cleansing system, ETL system ch19¶62
data compression, ETL system ch19¶55
data governance ch4¶89
data handlers, late arriving ch19¶148
data highway planning ch21¶27
data integration
data latency, ETL system ch19¶21
data mart, independent data mart architecture ch1¶103, ch1¶104, ch1¶105, ch1¶106
data mining
data modeling, big data best practices
data models, packaged ch9¶28
data profiling
data propagation, ETL system ch19¶162
data quality
data steward ch17¶210
data structure, analysis time ch21¶54
data value, big data and ch21¶34
data virtualization, big data and ch21¶58
data warehousing versus operational processing ch1¶8
date dimension ch3¶45, ch11¶25
dates
date/time
date/time dimensions ch19¶115
date/time stamp dimensions ch10¶201
deal dimensions ch6¶36
decision-making goals ch1¶201
decodes, dimensions ch11¶30
decoding production codes ch20¶32
deduplication system ch19¶80
degenerate dimension ch2¶78, ch10¶202, ch11¶27
demand planning ch5¶6
demographics dimension ch10¶35
denormalized flattened dimensions ch2¶81
dependency analysis ch19¶315
dependency, ETL ch19¶195
deployment
derived facts ch3¶38
descriptions, dimensions ch11¶30
descriptive context, dimensions for ch2¶21
destination airport dimension (airline case study) ch12¶33
detailed implementation bus matrix ch2¶127, ch16¶57
detailed table design documentation ch18¶44
diagnosis dimension (healthcare case study) ch14¶24
diff compare, CDC (change data capture) ch19¶45
dimensional modeling ch1¶18
dimensional thinking, big data and ch21¶47
dimension manager system ch19¶154
dimensions
dimension surrogate keys ch2¶69
dimension tables ch1¶44
dimension terminology ch1¶54
dimension-to-dimension table joins ch2¶213
documentation
draft design
dual date/time stamps ch8¶92
dual type 1 and type 2 dimensions (SCD type 7) ch2¶156
duplication, deduplication system ch19¶80
durable keys ch2¶72
DW/BI ch1¶1
dynamic value bands ch2¶231, ch10¶39
E
ecosystems, big data and ch21¶32
education
effective date, SCD type 2 ch5¶36
EHR (electronic health record) ch14¶6
electronic commerce case study ch15¶1
embedded manager's key (HR) ch9¶41
embedding attribute meaning ch3¶67
employee hierarchies, recursive ch9¶37
employee profiles ch9¶3
EMRs (electronic medical records), healthcare case study ch14¶6, ch14¶36
enterprise data warehouse bus architecture ch1¶77, ch2¶121, ch4¶52
enterprise data warehouse bus matrix ch2¶124, ch4¶59
ERDs (entity-relationship diagrams) ch1¶22
error event schema, ETL system ch19¶71
error event schemas ch2¶265
ETL (extract, transformation, and load) system ch1¶66, ch19¶1
event dimension, clickstream data ch15¶26
expiration date, type 2 SCD ch5¶44
extended allowance amount (P&L statement) ch6¶205
extended discount amount (P&L statement) ch6¶216
extended distribution cost (P&L statement) ch6¶207
extended fixed manufacturing cost (P&L statement) ch6¶208
extended gross amount (P&L statement) ch6¶209
extended net amount (P&L statement) ch6¶210
extended storage cost (P&L statement) ch6¶211
extended variable manufacturing cost (P&L statement) ch6¶212
extensibility in dimensional modeling ch1¶58
extracting, ETL systems ch19¶215
extraction ch1¶67
extract system, ETL system ch19¶48
F
fact extractors ch21¶11
factless fact tables ch2¶56, ch3¶112, ch6¶34
fact provider system
facts ch1¶31, ch1¶39, ch3¶15, ch3¶43
fact-to-fact joins, avoiding with multipass SQL ch2¶203
feasibility in Lifecycle planning ch17¶20
financial services case study ch10¶2
financial statements (G/L) ch7¶34
fiscal calendar, G/L (general ledger) ch7¶31
fixed depth position hierarchies ch2¶160, ch7¶55
fixed time series buckets, date dimensions and ch11¶26
FK (foreign keys). See foreign keys (FK) ch1¶41
flags
flattened dimensions, denormalized ch2¶81
flexible access to information ch17¶23
foreign keys (FK)
forum, Lifecycle business requirements ch17¶35
frequent shopper program, retail sales schema ch3¶108
FROM clause ch1¶62
G
GA (Google Analytics) ch15¶57
general ledger. See G/L (general ledger) ch7¶7
generic dimensions, abstract ch2¶249
geographic location dimension ch11¶52
G/L (general ledger) ch7¶7
GMT (Greenwich Mean Time) ch12¶40
goals of DW/BI ch1¶9
Google Analytics (GA) ch15¶57
governance
grain ch2¶18
granularity ch11¶15
GROUP BY clause ch1¶62
growth
H
Hadoop, MapReduce/Hadoop ch21¶14
HCPCS (Healthcare Common Procedure Coding System) ch14¶9
HDFS (Hadoop Distributed File System) ch21¶12
headcount periodic snapshot ch9¶23
header/line fact tables ch2¶185
header/line patterns ch6¶51, ch6¶65
healthcare case study ch14¶3
heterogeneous products ch10¶43
hierarchies
high performance backup ch19¶302
HIPAA (Health Insurance Portability and Accountability Act) ch14¶8
historic fact tables
historic load data, ETL development ch20¶25
holiday indicator ch3¶54
hot response cache ch8¶201
hot swappable dimensions ch2¶246, ch10¶52
household dimension ch10¶17
HR (human resources) case study ch9¶1
HTTP (Hypertext Transfer Protocol) ch15¶16
hub-and-spoke CIF architecture ch1¶107, ch1¶110
hub-and-spoke Kimball hybrid architecture ch1¶112
human resources management case study. See HR (human resources) ch9¶1
hybrid hub-and-spoke Kimball architecture ch1¶112
hybrid techniques, SCDs ch2¶150
hyperstructured data ch21¶9
I
ICD (International Classification of Diseases) ch14¶9
identical conformed dimensions ch4¶80
images, healthcare case study ch14¶44
impact reports ch10¶24
incremental processing, ETL system development ch20¶77
in-database analytics, big data and ch21¶45
independent data mart architecture ch1¶103, ch1¶104, ch1¶105, ch1¶106
indicators
Inmon, Bill ch1¶107
insurance case study ch16¶2
integer keys ch3¶116
integration
international names/addresses, customer dimension ch8¶26
interviews, Lifecycle business requirements ch17¶48
inventory case study ch4¶6
inventory, healthcare case study ch14¶48
invoice transaction fact table ch6¶67
J
job scheduler, ETL systems ch19¶166
job scheduling, ETL operation and automation ch20¶116
joins
journal entries (G/L) ch7¶24
junk dimensions ch2¶99, ch6¶42
justification for program/project planning ch17¶21
K
keys
keywords, skill keywords ch9¶54
Kimball Dimensional Modeling Techniques. See dimensional modeling
Kimball DW/BI architecture ch1¶64
Kimball Lifecycle ch17¶5
KPIs (key performance indicators) ch4¶101
L
lag calculations ch6¶97
lag/duration facts ch2¶182
late arriving data handler, ETL system ch19¶148
late arriving dimensions ch2¶255
late arriving facts ch2¶209
launch, Lifecycle business requirements ch17¶47
Law of Too ch17¶21
legacy environments, big data management ch21¶22
legacy licenses, ETL system ch19¶33
Lifecycle
lift, promotion ch3¶301
lights-out operations, backup ch19¶304
limited conformed dimensions ch4¶88
lineage analysis ch19¶306
lineage, ETL system ch19¶23, ch19¶195
loading fact tables, incremental ch20¶100
localization ch12¶43
location, geographic location dimension ch11¶52
log scraping, CDC (change data capture) ch19¶46
low cardinality dimensions, insurance case study ch16¶31
low latency data, CRM and ch8¶115
M
maintenance, Lifecycle ch17¶19
management
management best practices, big data ch21¶19, ch21¶22, ch21¶25
management hierarchies, drilling up/down ch9¶45
managers, publishing metaphor ch1¶12
many-to-one hierarchies ch3¶62
many-to-one relationships ch6¶29
many-to-one-to-many joins ch8¶111
MapReduce/Hadoop ch21¶14
market growth ch3¶202
master dimensions ch4¶76
MDM (master data management) ch4¶95, ch8¶102, ch19¶18
meaningless keys ch3¶116
measurement, multiple ch2¶197
measure type dimension ch2¶240
message queue monitoring, CDC (change data capture) ch19¶47
metadata coordinator ch17¶228
metadata repository, ETL system ch19¶215
migration, version migration system, ETL ch19¶187
milestones, accumulating snapshots ch4¶42
mini-dimension and type 1 outrigger (SCD type 5) ch5¶74
mini-dimensions ch10¶28
modeling
multipass SQL, avoiding fact-to-fact table joins ch2¶203
multiple customer dimension, partial conformity ch8¶105
multiple units of measure ch2¶197, ch6¶99
multivalued bridge tables ch2¶219
multivalued dimensions
myths about dimensional modeling ch1¶115
N
names
name-value pairs ch21¶56
naming conventions ch18¶22
natural keys ch2¶72, ch3¶116, ch3¶122, ch5¶83
NCOA (national change of address) ch8¶103
nodes (hierarchies) ch7¶58
non-additive facts ch2¶38, ch3¶39
non-natural keys ch3¶116
normalization ch1¶109, ch11¶20
normalized 3NF structures ch1¶23
null attributes ch2¶90
null fact values ch20¶57
null values
number attributes, insurance case study ch16¶29
numeric facts ch1¶35
numeric values
O
off-invoice allowance (P&L statement) ch6¶205
OLAP (online analytical processing) cube ch1¶27, ch2¶28
one-to-one relationships ch6¶29
operational processing versus data warehousing ch1¶8
operational product master, product dimensions ch6¶22
operational source systems ch1¶65
operational system users ch1¶6
opportunity/stakeholder matrix ch2¶130, ch4¶69
order management case study ch6¶1
order number, degenerate dimensions ch6¶38
order management case study, role playing ch6¶19
origin dimension (airline case study) ch12¶33
OR, skill keywords bridge ch9¶59
outrigger dimensions ch2¶105, ch3¶84, ch3¶140
overwrite (type 1 SCD) ch2¶137, ch5¶26
P
packaged analytic solutions ch9¶28
packaged data models ch9¶28
page dimension, clickstream data ch15¶21
page event fact table, clickstream data ch15¶41
parallelizing/pipelining system ch19¶202
parallel processing, fact tables ch20¶107
parallel structures, fact tables ch20¶109
parent/child schemas ch2¶185
parent/child tree structure hierarchy ch7¶60
partitioning
passenger dimension, airline case study ch12¶14
pathstring, ragged/variable depth hierarchies ch2¶169
pay-in-advance facts, insurance case study ch16¶44
payment method, retail sales ch3¶99
performance measurement, fact tables ch1¶31, ch1¶39
period close (G/L) ch7¶15
periodic snapshots ch2¶50, ch4¶6, ch4¶37
perspectives of business users ch10¶44
physical design, Lifecycle data track ch17¶87
pipelining system ch19¶202
planning, demand planning ch5¶6
P&L (profit and loss) statement
policy transactions (insurance case study) ch16¶16
PO (purchase orders) ch5¶6
POS (point-of-sale) system ch3¶18
presentation area ch1¶73
prioritization, Lifecycle business requirements ch17¶60
privacy, data governance and ch21¶62
problem escalation system ch19¶197
procurement case study ch5¶7, ch5¶18
product dimension ch3¶61
production codes, decoding ch20¶32
products
profit and loss facts ch6¶78, ch15¶65
program/project planning (Lifecycle) ch17¶8
project manager ch17¶218
promotion dimension ch3¶85
promotion lift ch3¶301
prototypes
publishing metaphor for DW/BI managers ch1¶12
Q
quality events, responses ch19¶70
quality screens, ETL systems ch19¶65
questionnaire, HR (human resources) ch9¶67
R
ragged hierarchies
rapidly changing monster dimension ch2¶147
RDBMS (relational database management system) ch2¶28
real-time fact tables ch2¶262
real-time processing ch20¶121
rearview mirror metrics ch6¶105
recovery and restart system, ETL system ch19¶179
recursive hierarchies, employees ch9¶37
reference dimensions ch4¶76
referential integrity ch1¶41
referral dimension, clickstream data ch15¶29
relationships
relative date attributes ch3¶56
remodeling existing data structures ch11¶47
reports
requirements for dimensional modeling ch18¶16
restaurant metaphor for Kimball architecture ch1¶82
retail sales case study ch3¶16, ch3¶98
retain original (SCD type 0) ch2¶134, ch5¶24
retrieval ch19¶174
retroactive changes, healthcare case study ch14¶49
reviewing dimensional model ch18¶50
RFI measures ch8¶38
RFP (request for proposal) ch17¶81
role playing, dimensions ch2¶96, ch3¶84, ch6¶18, ch10¶245
S
sales channel dimension, airline case study ch12¶18
sales reps, factless fact tables ch6¶34
sales transactions, web profitability and ch15¶65
sandbox results, big data management ch21¶23
sandbox source system, ETL development ch20¶23
satisfaction indicators in fact tables ch8¶93
scalability, dimensional modeling myths ch1¶119
SCDs (slowly changing dimensions) ch2¶133, ch5¶21, ch19¶91
scheduling jobs, ETL operation and automation ch20¶116
scoping for program/project planning ch17¶21
scoring, CRM and customer dimension ch8¶37
screening
security ch19¶230
segmentation, CRM and customer dimension ch8¶36
segments, airline bus matrix granularity ch12¶9
SELECT statement ch1¶62
semi-additive facts ch2¶38, ch4¶13
sequential behavior, step dimension ch2¶243, ch8¶78
sequential integers, surrogate keys ch3¶125
service level performance ch6¶74
session dimension, clickstream data ch15¶27
session fact table, clickstream data ch15¶30
session IDs, clickstream data ch15¶15
set difference ch3¶114
shared dimensions ch4¶76
shipment invoice fact table ch6¶73
shrunken dimensions ch2¶112
simple administration backup ch19¶303
simple data transformation, dimensions ch20¶29
single customer dimension, data integration and ch8¶101, ch8¶104
single granularity, facts and ch11¶17
single version of the truth ch17¶23
skill keywords ch9¶54
skills, ETL system ch19¶30
SKUs (stock keeping units) ch3¶17
slightly ragged/variable depth hierarchies ch2¶163
slowly changing dimensions. See SCDs ch5¶21
smart keys
snapshots
snowflaking ch1¶53, ch2¶102, ch3¶133, ch19¶113
social media, CRM (customer relationship management) and ch8¶8
sorting
source systems, operational ch1¶65
special dimensions manager, ETL systems ch19¶114
specification document, ETL development ch20¶19
SQL multipass to avoid fact-to-fact table joins ch2¶203
staffing for program/project planning ch17¶26
star joins ch1¶55
static dimensions
statistics, historic fact table audit ch20¶53
status dimensions ch10¶203
step dimension ch2¶243
stewardship ch4¶89
storage, Lifecycle data ch17¶93
store dimension ch3¶76
strategic business initiatives ch3¶7
streaming data, big data and ch21¶37
strings, skill keywords ch9¶61
structure screens ch19¶68
student dimension (education case study) ch13¶20
study groups, behavior ch2¶225
subsets, shrunken subset dimensions ch19¶120
subtypes ch10¶45
summary data, dimensional modeling and ch1¶116
sunsetting, big data management ch21¶25
supernatural keys ch2¶72, ch3¶122
supertypes
surrogate keys ch2¶173, ch3¶119, ch11¶29
survey questionnaire (HR) ch9¶67
synthetic keys ch3¶116
T
tags, behavior, in time series ch2¶222
team building, Lifecycle business requirements ch17¶39
technical application design/development (Lifecycle) ch17¶12
technical architect ch17¶217
technical architecture (Lifecycle) ch17¶10, ch17¶64
telecommunications case study ch11¶4
term dimension (education case study) ch13¶19
text comments
text strings, skill keywords ch9¶61
text, survey questionnaire (HR) comments ch9¶74
textual attributes, dimension tables ch3¶55
textual facts ch1¶38
The Data Warehouse Toolkit (Kimball) ch1¶8, ch3¶46
third normal form (3NF) models ch1¶21
time
timed extracts, CDC (change data capture) ch19¶44
time dimension ch3¶46
timeliness goals ch1¶210
time-of-day
time series
time shifting ch3¶211
timespan fact tables ch8¶85
timespan tracking in fact tables ch2¶206
time varying multivalued bridge tables ch2¶219
time zones
tools
transactions ch2¶47, ch4¶34, ch6¶43
transportation ch12¶1
travel services flight schema ch12¶28
trees (hierarchies) ch7¶58
type 1 (overwrite) SCD ch2¶137
type 2 (add new row) SCD ch2¶140, ch5¶36, ch5¶44, ch5¶50
type 3 (add new attribute) SCD ch2¶144, ch5¶51, ch5¶61
type 4 (add mini-dimension) SCD ch2¶147
type 5 (add mini-dimension and type 1 outrigger) SCD ch2¶150
type 5 (add mini-dimension and type 1 outrigger) SCD ch5¶74
type 6 (add type 1 attributes to type 2 dimension) SCD ch2¶153, ch5¶76
type 7 (dual type 1 and type 2 dimension) SCD ch2¶156, ch5¶82, ch5¶88
U
Unicode ch8¶27
uniform chart of accounts ch7¶13
units of measure, multiple ch6¶99
updates, accumulating snapshots ch4¶44
user-maintained dimensions, ETL systems ch19¶123
UTC (Coordinated Universal Time) ch12¶40
V
validating dimension model ch18¶50
validation, relationships ch20¶33
value band reporting ch10¶39
value chain ch2¶118
variable depth hierarchies
version control ch19¶308
version migration system, ETL system ch19¶187
visitor identification, web sites ch15¶18
W
weekday indicator ch3¶55
WHERE clause ch1¶62
workflow monitor, ETL system ch19¶189
workshops, dimensional modeling ch2¶8
X-Y-Z
YTD (year-to-date) facts ch2¶200