Index

Note: Page numbers followed by “f” and “t” indicate figures and tables respectively.

A
Access analytics
argparse module, 109
csv module, 109–110
datetime module, 110
haversine distance, 116–117
“Haversine Python,” 117
Linux/Unix systems, 110
math module, 110
MaxMind GeoIP API, 116
MaxMind’s GeoIP module, 121
parse_args() function, 112
parser.add_argument method, 112
pseudocode, 116
Python, 100, 103
Codecademy, 103–104
resources, 103
Web site, 104
re module, 109
remote access Python analytics program flow, 111, 111f
result analysis
connection types, 121
haversine distance, 118–119
malicious remote connections identification, 121
User8 access behavior, 119, 119f
User90 access behavior, 119, 120f
User91 access behavior, 120, 120f
vpn.csv file, output, 117, 118f
scripting language, 102
third-party remote access, 100
unauthorized access, 100
unauthorized remote access identification
anomalous user connections, 105–107
credit card transaction statements, 105
data collection, 105, 106f
data processing, 108–109
Haversine distances, 107–108
VPN
add-on two-factor authentication mechanisms, 101
CONNECT variable, 115
Event class, 114–115
logs, 112–113
monitoring, 101–102
normalize() function, 113–114
public network, 101
“RawMessage” column, 114
“ReceiveTime” column, 114
tunneling protocols, 100
unsecured/untrusted network, 100
Aggregate function, 136
Amazon’s Elastic MapReduce environment, 29
Analytical software and tools
Arena, See Arena
big data, 15–16
GUI, 13
Python, 19–20
R language, See R language
statistical programming, 14–15
Analytics
access analytics, See Access analytics
authentication, 9
big data, 5–6
computer systems and networks, 4–5
expert system program, 10
free-form text data, 8
incident response, 7
intrusion detection, 7
knowledge engineering, 4, 10
Known Unknowns, 8
log files, 5
logical access controls, 9
machine learning, 2
multiple large data centers, 5
security breaches and attacks, 1
security processes, 8–9
simulation-based decisions, 9
simulations, 4, 8–9
statistical techniques, 2
supervised learning, 2–3
text mining, 4
unauthorized access attempts, 10
Unknown Unknowns, 8
unsupervised learning, 3–4
virus/malware infection, 9
VPN access, 10
vulnerability management, 11–12
ApacheLogData files, 27
Apache Mahout, 14
Arena
adding data and parameters, 21, 69
conceptual model creation, 21, 68
flowchart modules, 21
IT service desk ticket queue, 68, 68f
Microsoft Visio, 68
Model window flowchart view, 20, 67
Model window spreadsheet view, 20, 68
Project bar, 20, 67
Rockwell Automation, 20, 67
running the simulation, 21, 69
simulation analysis, 22, 69
three-process scenario, 68
argparse module, 109
Artificial intelligence, 6, 14
as.Date function, 134
B
Bash shell command line, 27
Behavioral analysis, 5
Big data, 15–16, 149–150
artificial intelligence applications, 6
behavioral analysis, 5
CentOS desktop, 15–16
Cloudera QuickStart VM, 15
conducting analysis, 25
Hadoop technologies, 6, 15
Linux operating system, 15
MapReduce technologies, 6, 15
predictive analysis, 5
sudo command, 16
tools and analysis methods, 64
Unix commands, 15–16
C
CentOS desktop, 15–16
Classification techniques, 3
Cloudera Hadoop installation, 30
Cloudera QuickStart VM, 15
Cluster analysis
dendrogram, 143, 144f
dist function, 143
dtmWithClust data frame object, 145
hclust function, 143
hierarchical clustering, 142
kmClust object, 145
k-means clustering, 142–143
kmeans function, 144–145
plot function, 143
print function, 144
randomForest function, 146
Clustering, 3
Comma separated values (CSV) module, 109–110
Comprehensive R Archive Network (CRAN), 16, 124
CONCAT() function, 45
Conduct data analysis, 154
Correlation analysis
access attacks, 137–138
assignment operator, 137
corData variable, 137
cor function, 137
correlation plot, 138, 139f
corrplot function, 138
png function, 138
rownames function, 137
SQL injection, 138–139
corrplot function, 138
CREATE module
external e-mail entities, 74, 74f
insertion, 73, 73f
properties updation, 74, 75f
D
DateOccurred column, 126
datetime module, 110
DECIDE module, 86, 87f
properties updation, 88, 88f
DECISION module, 92
Denial of service attack (DoS), 37
Descriptive statistics, 14
DISPOSE module, 78, 88
dist function, 143
Document-term matrix, 129–130
DocumentTermMatrix function, 140
E
Explanatory analysis, 153
F
findFreqTerms function, 131–132
G
Graphical user interface (GUI), 13
H
Hadoop File System (HDFS), 40
Hadoop technologies, 6, 15, 23
“Haversine Python,” 117
hclust function, 143
Hierarchical clustering, 142
Hive software stack, 23
I
IncidentDescription column, 126
Incident response, 7
big data tools and analysis methods, 64
commercial tools, 24
data breach, 23
data loading
ad hoc query, 41
Amazon’s AWS environment, 27
Amazon’s Elastic MapReduce environment, 29
ApacheLogData files, 27
Apache log-file format, 28
Bash shell command line, 27
bot activity, 43–45
Cloudera Hadoop installation, 30
command injection, 36–37
cross-site request forgery, 35
deserializer, 27
directory traversal and file inclusion, 32–34
failed access attempts, 42
“failedaccess” variable, 58
failed requests percentage, 41
failed requests per day/per month, 47–48
failed to successful requests ratio, time series, See Time series
“404 file not found,” 42–43
HDFS, 40
Hive code, 57
logistic regression coefficients, 59, 59f
Mahout command, 58
monthly time series, failed requests, 48–49
MySQL charset switch and MS-SQL DoS attack, 37–39
S3 bucket, 29
specific attack vectors, 30
spreadsheet program, 59
SQL injection attack, See SQL injection attack
“statusgroupings” view, 56–57
SUBSTR() function, 39–40
tallying and tracking failed request statuses, 39
time aggregations, 45–47
e-mail messages, 64
Hadoop software, 23
Hive software stack, 23
in intrusions and incident identification
big data tools, conducting analysis, 25
network and server traffic, 25
real-time intrusion detection and prevention, 24
unknown-unknowns, 24
log files, See Log files
MapReduce software, 23
open-source tools, 23–24
SQL-like syntax, 23
text mining techniques, 64
unstacked status codes, 59–63
inspect function, 128
Intrusion detection, 7
J
jitter function, 141
JOIN statement, 49–50
K
k-means clustering, 142–143
kmeans function, 144–145
Knowledge engineering, 4, 10
L
LIKE operator, 31
Linear regression, 3
Linux operating system, 15
Linux/Unix systems, 110
list function, 136, 140
lm function, 141
Log files, 5
access_log_7 file, 27
combined log file fields, 26
common log file fields, 26
methods, 26
open-source server software, 25–26
parsing, 64
server logs, 25–26
SQL-like analysis, 27
Logical access controls, 9
LOWER() function, 31
M
Machine learning, 2
Mahout command, 58
MapReduce technologies, 6, 15, 23
math module, 110
MaxMind GeoIP API, 116
MaxMind’s GeoIP module, 121
Metadata, 128
MS-SQL DoS attack, 37–39
myMethod method, 19
MySQL charset switch, 37–39
P
parse_args() function, 112
parser.add_argument method, 112
plot function, 141
png function, 141
Predictive analysis, 5, 153
Principal components analysis, 4
PROCESS module, 75, 76f
ACTION, 76
“Delay,” 76
properties updation, 76, 77f
resource property updation, 77–78, 78f
resources dialog box, 77
standard deviation, 85, 86f
Python, 19–20, 100, 103
Codecademy, 103–104
resources, 103
Web site, 104
R
randomForest function, 146
RECORD modules, 89, 89f
properties updation, 89, 90f
re module, 109
removeNumbers function, 128
removePunctuation function, 128
removeWords function, 128
Risk management, 157–158
R language, 14
aggregate function, 136
arithmetic operators, 18
arrow operator, 18–19
as.Date function, 134
assignment operators, 18
cluster analysis
dendrogram, 143, 144f
dist function, 143
dtmWithClust data frame object, 145
hclust function, 143
hierarchical clustering, 142
kmClust object, 145
k-means clustering, 142–143
kmeans function, 144–145
plot function, 143
print function, 144
randomForest function, 146
column headings, 126
CRAN, 16
cross site scripting reports, 136
data.frame function, 135
data profiling with summary statistics, 130–131
data types, 17
DateOccurred column, 126
document-term matrix, 129–130
findFreqTerms function, 131–132
functions, 17, 19
IncidentDescription column, 126
inspect function, 128
linear model function, 19
list function, 136
logical operators, 18–19
Massive Open Online Courses, 17
metadata, 128
myMethod method, 19
package libraries and data import, 127
by parameter, 136
R command line, 19
removeNumbers function, 128
removePunctuation function, 128
removeWords function, 128
removing sparse terms, 130
statistical calculations, 16
stemDocument function, 127–128
stopwords function, 128
stripWhitespace function, 127
term matrix transpose, 133–134
terms dictionary
dictionary parameter, 140
DocumentTermMatrix function, 140
jitter function, 141
list function, 140
lm function, 141
plot function, 141
png function, 141
scatterplot graph, 140–141
Web and site, 141, 142f
time series trends, correlation analysis
access attacks, 137–138
assignment operator, 137
corData variable, 137
cor function, 137
correlation plot, 138, 139f
corrplot function, 138
png function, 138
rownames function, 137
SQL injection, 138–139
tm_map function, 127
tolower function, 128
Web Application Security Consortium, 125
WHID, 125
word associations, 132–133
Rockwell Automation, 67
rownames function, 137
S
Scripting language, 102
Security analytics process, 12, 12f, 151, 152f
Security intelligence
business extension, 154
data normalization, 158–159
decision-making, 151–152
equipment and personnel integration, 159–160
explanatory analysis, 153
false positives, 160
insider threat, 155–156
internal security gaps, 153
open-source technologies, 160–161
options, 152
predictive analysis, 153
raw data, 151–152
resource justification, 156–157
“right” data, 158–159
risk management, 157–158
security analytics process, 151, 152f
security breaches, 154
smoking gun, 152
warning sensors, 153
Security policy templates, 154
SELECT statement, 30
Simulation, 4, 8–9
additional report information, 91, 91f
Arena, See Arena
average processing times, 94t
batch run, 81, 83f
components, 73
conditional elements, 87
Connect button, 77–78, 78, 79f–80f
constant delay type, 85, 95t
CREATE module
external e-mail entities, 74, 74f
insertion, 73, 73f
properties updation, 74, 75f
data used, 95t–98t
DECIDE module, 86, 87f
properties updation, 88, 88f
DECISION module, 92
DISPOSE module, 78, 88
efficacy, 92, 93f
e-mail gateway device, 69
final report view, 93, 94f
final results, 95t
normal delay type, 85
normal distribution, 95t
parameters, 79, 81f
PROCESS dialog’s standard deviation, 85, 86f
PROCESS module, 75, 76f
ACTION, 76
“Delay,” 76
properties updation, 76, 77f
resource property updation, 77–78, 78f
resources dialog box, 77
Project Parameter tab, 79–80
Project set up, 80, 82f
RECORD modules, 89, 89f
properties updation, 89, 90f
report view, 89, 90f
running simulation, 81, 82f
standard deviation, 83, 84f
“True Clean” decision box, 91, 92f
vendor choice, 86, 87t
vendor processing time, 81–83, 84t
vendor scenario data, 69, 70t–72t
vendor scenario probability, 92, 92t
vendor scenario statistics, 84–85, 85t
vendor simulation average processing time, 72, 72t
Simulation-based decisions, 9
SQL injection attack
advantage, 32
LIKE operator, 31
LOWER() function, 31
output, 31
SELECT statement, 30
stemDocument function, 127–128
stopwords function, 128
stripWhitespace function, 127
SUBSTR() function, 39–40, 45
Supervised learning, 2–3
T
Term document matrix, 124
Text mining
CRAN repository, 124
e-mails, 123
open source software tools, 123–124
security breaches, 147
semistructured data, 123
text mining techniques, See Text mining techniques
unstructured data, 123
Text mining techniques, 4, 64
big data, 149–150
common data transformations, 125
document-term matrix, 124
in R, See R language
security scenarios, 148–149
term document matrix, 124
Time series
autocorrelation effects, 55, 57f
code snippet, 54
control plot, 55, 56f
delimiters, 54
Hive output, 56
INSERT OVERWRITE LOCAL DIRECTORY command, 53–54
JOIN statement, 49–50
query, 52
server logs, 51
“yearmonthday” field, 50
tm_map function, 127
tolower function, 128
U
Unauthorized remote access identification
anomalous user connections, 105–107
credit card transaction statements, 105
data collection, 105, 106f
data processing, 108–109
Haversine distances, 107–108
Unsupervised learning, 3–4
V
Virtual private network (VPN), 10
add-on two-factor authentication mechanisms, 101
CONNECT variable, 115
Event class, 114–115
logs, 112–113
monitoring, 101–102
normalize() function, 113–114
public network, 101
“RawMessage” column, 114
“ReceiveTime” column, 114
tunneling protocols, 100
unsecured/untrusted network, 100
Vulnerability management, 11–12
W
Web Application Security Consortium, 125
Web Hacking Incident Database (WHID), 125