4GL. See fourth generation language
acronym resolution, 147
alternate spelling, 83, 147
alternate storage, 147
analog, 147
analog data, 22–24
analog data pond, 34, 147
    value of, 119
application, 147
application data, 24–26
application data pond, 34, 147
    value of, 121
archival data pond, 3, 31, 36, 37, 93, 127, 143, 144, 145
archival data pool, 148
archival database, 147
archival processing, 147
Big Data, 6, 7, 42, 102, 148
bit map indexing, 55
business process, 148
clustering, 52, 54
compression, 51
concept search, 113, 138
conditioning, 148
conditioning process, 32, 34, 37, 39, 44, 48, 61, 75, 91, 94
constraint, 148
context, 15–16
contextualization, 148
conversion, 50, 70, 104
corporate information factory, 128, 129, 131, 132, 135, 148
customer sentiment, 123
data lake, 148
    “one way”, 1, 7–10, 102
    difficulty getting value from, 1
    explanation of, 6
    potential of, 13
data model, 41, 67, 68, 69, 85, 95, 149
data pond, 2, 148
data reduction, 2, 34, 50, 52, 53, 54, 55, 80, 103
data scientist, 6, 17, 149
database, 148
database management system, 5, 25, 66, 149, 150
date standardization, 83
DBMS. See database management system
deduplication, 50
descriptor, 64
DNA, 63, 64, 74
document, 149
encoding, 51
ETL. See extract/transform/load
excision, 51, 53
extract/transform/load, 131
fourth generation language, 5, 147
garbage dump, 1, 8, 9, 17, 19, 108
general usability, 18
great divide, 28, 29, 149
Hadoop, 5, 149
homograph, 149
homographic resolution, 83, 149
independent index, 25
information gold mine, 13, 16, 17, 19
inline contextualization, 82, 149
integration, 67, 69, 99
integration mapping, 19
interpolation, 51
key performance indicator, 101
KPI. See key performance indicator
log tape, 24, 149
machine learning, 113, 138
magnetic tape, 5
measurement, 22, 23, 45, 47, 56, 58, 69
metadata, 5, 9, 19, 68, 149
metaprocess, 16–17, 19, 44
miscellaneous data, 99, 103, 135
mobile computing, 5
natural language processing, 107
NLP. See natural language processing
non-repetitive data, 28, 150
ontology, 85, 86
organization criteria, 41
outlier, 58
parsing, 150
pattern analysis, 150
pond descriptor, 39, 40
pond target, 39, 41, 42
pond transformation criteria, 39
proximity, 82
proximity analysis, 150
punch card, 5
raw data, 9
raw data pond, 31, 32, 33, 35, 36, 98, 99, 100, 103
repetitive data, 28
rounding, 51
sampling, 51
schema on read, 42, 43
search and qualify, 138, 139
sentiment analysis, 132
sentiment taxonomy, 88
silo, 14
smoothing, 51
statistical analysis, 140, 150
stop word, 150
storage, 5
structured data, 150
summarization criteria, 41
taxonomy, 85, 87, 88, 150
taxonomy resolution, 83
textual data, 26–27
textual data pond, 34, 77, 150
    value of, 122
textual disambiguation, 27, 78, 80, 139, 150
thresholding, 52, 53
tokenization, 51
unintegrated, 14, 15, 34, 69, 106, 114
unstructured data, 26
visualization, 88, 90, 95, 113, 116, 137