
Note: Page numbers followed by “f” and “t” refer to figures and tables, respectively.


Active data warehousing, 95
Admission and acceptance, analytics solution for, 52
Adoption roadmap, of analytics, 113
data warehousing success, learning from, 113–117
data management, 115–117
efficient data acquisition, 115
evangelize, 114–115
holistic view, 115
quick results, 114
simplification, 113–114
ADP, 130
Aggregate data, 72–73
Aggregate variables, 182
Algorithm versus analytics model, 8–9
Analysis, categories of, 153–158
audit and control, 158
problem statement and goal, 153
profiling and data, 153–156
semantic profiling, 156
syntactic profiling, 154–156
model and decision strategy, 156–157
operational integration, 157–158
Analytical and reporting systems
requirements gathering process for, 132–133
Analytical applications, 33–35
challenges, 35
implementation, 34
Analytical controls, 108
Analytics datamart, 71–72, 161, 176–183
Analytics models, 35–36, 102, 143, 164, 172–173
versus algorithm model, 8–9
challenges, 36
implementation, 36
Analytics variables, 159–161
Application, of analytics
in consumer risk, 48–49
in customer relationship management, 44–46
in energy and utilities, 54–57
in fraud detection, 57–58
in healthcare, 42–44
in higher education, 51–52
in human resource, 46–48
in insurance, 49–51
on manpower and skill, 58–60
in manufacturing, 52–54
problems pattern, 58–60
in telecommunication, 51
Atomic data, 72–73
Audit and control, 101
analysis of, 158
audit datamart, 104–106
control definition, 106–108
analytical controls, 108
best-practice controls, 107
expert controls, 107–108
design of, 163
organization and process, 103–104
framework, 103–109
model execution, 183
reporting and action, 108–109
Automated decisions, 101–103, 163
See also Decision automation
analytics and, 101–103
monitoring layer, 102–103
risk of, 102


Base analytics data, 177–180
Base variables, 159–160
Benefits fraud
analytics solution for, 57–58
Best-practice controls, 107
Big Data
analytical problems, 192
definition of, 185
implementation challenge, 187–189
controlling size, 188
Information Continuum, applying the, 188
variety in, 187
velocity of, 186–187
volume of, 187
The Black Swan, 86
Borrower default, analytics solution for, 49
Business analytics, xvii–xviii, 106
Business intelligence (BI), 3–4
Business intelligence competency center (BICC), 170–171
analytics, 172–174
analyst, 175
architect, 175
decision strategy, 174
modeling, 172–173
specialist, 175
technology, 173–174
business analysts, 172
data architecture, 171–172
ETL (extract, transform, load), 170–171
information delivery, 174–175
skills summary, 175–176
organization chart, 168–170
roles and responsibilities, 170–175
Business need, 25
Business process innovation, 98–99
Business process integration requirements, 144–145
Business rules
in business operations, 87–88
and decision automation, 88
of experts, 87–88
quantitative, 88
Business value perspective, 5–6, 6f


Centralized approach, of analytics, 148
Champion–challenger strategies, 82–83, 98–99
business process innovation, 98–99
Characteristics, definition of, 64t
Classification methods, 15–16
Cloud computing, 193–194
analytics in, 194
disintegration in, 193–194
Clustering, 11–13, 44, 74f, 92–94
Columns, 64t
Commoditization, xvii, 196
Computing machines, 129–130
Confusion matrix, 80, 126t
Consumer risk, 48–49
analytics solution, for borrower default, 49
Continuous variables, 69–72
Counts and lists, of Information Continuum, 27–28
challenges, 28
implementation, 28
Credit card fraud
analytics solution for, 58
Customer relationship management, 44–46
analytic solution
customer segmentation, 44–45
propensity to buy, 45–46


Dashboards, 32–33, 32f
Data, xv
Data discovery, 41, 137
Data entity, 134
Data fields, 64t, 66t, 154
Data integration, 24, 34, 115, 122–123, 171–172, 193
Data lineage, 171
Data management, 24, 115–117, 123–124, 171–172
Data mining/machine learning, 15–17
Data modeling, 122
Data profiling, 153–156
semantic profiling, 156
syntactic profiling, 154–156
Data requirements, 139–142
Data scientists, xvii–xviii
Data silo, creating, 140–141
Data sources, 189–190
Data visualization, 6–7, 33
Data warehousing, 3–4, 24, 29
and analytics, 116–117
building on, 149–151
business problem for analytics project, 117–118
data and value, 125–127
data modeling, 122
database management and query tuning, 124
ETL design, development, and execution, 122–123
existing skills, 121–125
existing technology, 120–121
job scheduling and error handling, 124
management attention and champion, 118–119
metadata management and data governance, 123–124
problem statement, 125
project, 119–125
reporting and analysis, 124
results, 125–127
roadshow, 125–127
source system analysis, 122
wider adoption, 125–127
capabilities, 150f
confusion matrix, 126t
data management, 115–117
efficient data acquisition, 115
evangelize, 114–115
extension, 158–159
holistic view, 115
industry, 3–4
in IT departments, 113–117
quick results, 114
simplification, 113–114
Database management and query tuning, 124
Datamart-based approach, 114
Decentralized approach, of analytics, 149
Decision automation
and intelligent systems, 94–97
ETL to rescue, 96–97
learning versus applying, 94–96
strategy integration methods, 96–97
Decision optimization, 18–20
Decision strategy, 36–38, 85–94, 133, 157, 161–162
business rules in business operations, 87–88
expert business rules, 87–88
quantitative business rules, 88
challenges, 38
cutoffs, 91
decision automation and business rules, 88
in descriptive models, 92–94
evaluation, 97–98
implementation, 38
insurance claims, 91–92
joint business and analytics sessions, 89
requirements for, 143
analytics and, 133
retail bank, 89–91
variables, 91
Decision trees, 16, 63
Decision variables, 161
Decision-making, xv
Defining analytics, 3
challenge of, 4–7
business value perspective, 5–6
technical implementation perspective, 6–7
hype, 3–4
techniques, 7–20
algorithm versus analytics model, 8–9
decision optimization, 18–20
descriptive analytics, 11–13
forecasting, 9–11
predictive analytics, 13–18
Democratization, xvii–xviii, 196–197
Deployment of analytics, 165
Descriptive analytics, 11–13
clustering, 11–13
Descriptive modeling, 78
decision strategy in, 92–94
Design of analytics solution, 158–164
analytics datamart, 161
analytics variables, 159–161
base variables, 159–160
decision variables, 161
model characteristics, 160–161
performance variables, 160
audit and control, 163
data warehouse extension, 158–159
decision strategy, 161–162
operational integration, 162–163
input data, 163
output decision, 163
strategy firing event, 162–163
Directed analytics, 13
Discrete variables, 70–72
Distinct values count, 155


Energy and utilities, 54–57
new power management challenge, 55–57
analytics solution for, 56–57
Enigma machine, 129–130
ETL (extract, transform, and load), 96, 115, 164, 170–171
design, development, and execution, 122–123
implementation, 164
process, 71–72
to rescue, 96–97
Execution and monitoring, 165
Experian, 38, 164
Expert business rules, 87–88
Expert controls, 107–108


FICO Score, 37–38, 164
“Flash Crash,” 107
Fooled By Randomness, 86
Forecasting, 9–11
versus prediction, 14
Fraud detection, 57–58
analytics solution
for benefits fraud, 57–58
for credit card fraud, 58
Frequency, 155


Geographic information systems (GISs), 6–7
Geo-spatial analysis, 33, 34f
Gini Coefficient, 80
Governance. See Monitoring and tuning
Grain, 69


Hadoop, 189–193
as analytical engine, 193
as ETL engine, 192–193
solution architecture, 190–193
technology stack, 189–190
data access, 190
data processing, 190
data sources, 189–190
data store, 190
user applications, 190
Hadoop file system (HDFS), 189–190
Healthcare, 42–44
analytic solution
for emergency room visit, 42–43
for patients, with same disease, 43–44
Higher education, 51–52
analytics solution, for admission and acceptance, 52
Historical (snapshot) reporting, 30–31
Human resource, 46–48
analytic solution
for new employee resignation, 46–47
for resumé matching, 47–48
skilled, 24
Hype, for analytics, 3–4


Information Continuum, xix, 21, 41
applying for Big Data, 188
building blocks of, 22–25
innovation and need, 25
skilled human resources, 24
theoretical foundation, in data sciences, 23–24
tools, techniques, and technology, 24
levels in, 25–40, 26f
analytical applications, 33–35
analytics models, 35–36
counts and lists, 27–28
decision strategies, 36–38
metrics, KPIs, and thresholds, 31–33
monitoring and tuning, 38–40
operational reporting, 28–29
search and lookup, 26–27
snapshot reporting, 30–31
summary reporting, 29–30
Inmon, Bill, 113–114
Innovation, xviii, 197
and need, 25
Input variables, 63, 64f, 64t
Insurance, 49–51
analytics solution, for probability of claim, 49–51
Insurance claims, decision strategy in, 91–92
Intelligent systems and decision automation, 94–97
IT department
analytics projects, 117
data warehousing in, 113–117
data management, 115–117
efficient data acquisition, 115
evangelize, 114–115
holistic view, 115
quick results, 114
simplification, 113–114


Job scheduling and error handling, 124
Joint business and analytics sessions, 89


K-means, 11, 12f
Kolmogorov–Smirnov Test (KS-Test), 80


Learning versus applying, 94–96


Manpower and skill, analytics’ impact on, 60
Manufacturing, 52–54
analytics solution
for analyzing warranty claims, 54
for predicting warranty claims, 53–54
MapReduce command interface, 190
Maximum length, 155
Maximum possible value, 154
Mean, 154
Median, 154
Metadata management and data governance, 123–124
Methodology, 151–165
analysis, 153–158
audit and control, 158
data profiling, 153–156
model and decision strategy, 156–157
operational integration, 157–158
problem statement and goal, 153
semantic profiling, 156
syntactic profiling, 154–156
deployment, 165
design, 158–164
analytics datamart, 161
analytics variables, 159–161
audit and control, 163
base variables, 159–160
data warehouse extension, 158–159
decision strategy, 161–162
decision variables, 161
model characteristics, 160–161
operational integration, 162–163
performance variables, 160
execution and monitoring, 165
implementation, 164
requirements, 152
Metrics, KPIs, and thresholds, 31–33
challenges, 33
implementation, 32–33
Minimum possible value, 154
Mode, 155
Model and decision strategy, 142–143, 156–157
Model development, 75–82
descriptive modeling
model and characteristics in, 78
predictive modeling
model and characteristics in, 75–78
validation and tuning, 78–82
predictive model validation, 79–82
Model execution, audit and control, 183
Model training, 8
Model validation and tuning, 78–82
Monitoring and tuning, 38–40, 101
analytics and automated decisions, 101–103
monitoring layer, 102–103
risk of, 102
audit and control framework, 103–109
audit datamart, 104–106
control definition, 106–108
organization and process, 103–104
reporting and action, 108–109
challenges, 40
implementation, 40
Monitoring layer, 102–103


New employee resignation, analytic solution for, 46–47
Nominal values, 72
Null, 155, 157


Operational integration, 144, 162–163
analysis of, 157–158
design of, 162–163
input data, 163
output decision, 163
strategy firing event, 162–163
Operational reporting, 28–29
challenges, 29
implementation, 29
Ordinal values, 72
Organizational structure, for analytics, 167–176
business intelligence competency center (BICC), 168–170
roles and responsibilities, 170–175
analytics, 172–174
business analysts, 172
data architecture, 171–172
ETL (extract, transform, load), 170–171
information delivery, 174–175
technical architecture analytics solutions, 176–183
analytics datamart, 176–183
base analytics data, 177–180
model and characteristics, 182–183
model execution, audit and control, 183
performance variables, 180–182
Outliers, 13


Performance variables, 63–75, 64t, 160, 180–182
aggregate variables, 182
benefits of, 68–69
creating, 69–70
grain, 69
range, 69–70
reasons for, 65–67
spread, 70
designing, 70–73
atomic versus aggregate, 72–73
discrete versus continuous, 70–72
nominal versus ordinal, 72
examples for, 68t
reporting variables, 181
third-party variables, 181–182
working example, 73–75
Power management challenge, 55–57
analytics solution for, 56–57
Predictive analytics, 13–18
data mining/machine learning, 15–17
methods, 14–18
prediction versus forecasting, 14
regression, 15
text mining, 17–18
Predictive modeling, 75–78
decision strategy in, 93f
validation, 79–82
parallel run, 80–81
retrospective processing, 81–82
Problem statement and goal analysis, 153
Problems, patterns of, 58–60
performance/derived variables, 59
size of data, 59
Process automation, 132
Profiling and data. See Data profiling
Propensity model, 45–46


Quantitative business rules, 88


Regression, 15
Reporting variables, 181
Reprocessing, 97–98
Requirements, 129, 152
historical perspective, 129–134
analytical and reporting systems, 132–133
analytics and decision strategy, 133
calculations, 130–132
process automation, 132
purpose of, 129
Requirements gathering/extraction, 134–145
business process integration requirements, 144–145
data requirements, 139–142
model and decision strategy requirements, 142–143
problem statement and goal, 135–139
Resumé matching, 47–48
analytic solution for, 47–48
Retail bank, decision strategy in, 89–91
Retrospective processing method, 97


Scorecard, 76
Search and lookup, 26–27
challenges, 27
implementation, 27
Semantic profiling, 156
Sequence analysis, 9–10
Simplification, xvii, 195–196
analytics techniques, demystifying, 195–196
implementation details, 196
simplified definition, 195
Skilled human resources, 24
Snapshot reporting, 30–31
challenges, 31
implementation, 31
Source system analysis, 122
Standard deviation, 155
Strategy evaluation, 97–98
reprocessing, 97–98
retrospective processing, 97
Strategy integration methods, 96–97
Summary reporting, 29–30
challenges, 30
implementation, 30
Supply chain, 52–53
Syntactic profiling, 154–156


Taleb, Nassim Nicholas, 86
Technical implementation perspective, 6–7
Telecommunication, 51
analytics solution, for usage patterns, 51
Text mining, 17–18, 19f, 47–48
Theoretical foundation, in data sciences, 24
Third-party variables, 181–182
Thresholds, designing, 106
Time series analysis, 9–11
Time series field, 155


Ultra machine, 129–130
Undirected analytics, 11


Value range, 70
Variables, 159–161
base variables, 159–160
decision variables, 161
model characteristics, 160–161
performance variables, 160
terminology for, 64t
Variety, in Big Data, 187
Velocity, of Big Data, 186–187
Volume, of Big Data, 187


Warranty claims
analyzing, 54
predicting, 53–54
Weather forecasting, 10–11, 56


Zeros, data profiling, 155, 157
