To begin our analysis, we will examine the summary statistics and correlations of our data. These will give us an overview of the data and inform our subsequent analyses:
Generate a summary of the fire attack dataset using summary(object):
> #generate a summary of the fire subset
> summaryFire <- summary(subsetFire)
> #display the summary
> summaryFire
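As a rough illustration of what summary(object) reports, the sketch below summarizes a small, made-up data frame. The name toyBattles, its columns, and its values are hypothetical and are not taken from the fire attack data. Numeric columns are described by their minimum, quartiles, median, mean, and maximum, while factor columns are described by a count per level.

```r
# Hypothetical toy data; not the book's fire attack dataset
toyBattles <- data.frame(
  Result = factor(c("Victory", "Defeat", "Victory", "Victory")),
  DurationInDays = c(4, 10, 6, 8)
)

# summary() on a data frame returns a table with one column per variable
summaryToy <- summary(toyBattles)
summaryToy

# summary() on a single numeric column returns named statistics
durationSummary <- summary(toyBattles$DurationInDays)
durationSummary[["Mean"]]
```

Storing the summary in a variable, as the text does with summaryFire, makes it possible to display it again later without recomputing it.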
For a discussion on converting nonnumeric data, refer to the Quantifying Categorical Variables section of Chapter 4.
Recode the text values in the Method column using as.numeric(data):
> #represent categorical data numerically using as.numeric(data)
> #recode the Method column into Fire = 1
> numericMethodFire <- as.numeric(subsetFire$Method) - 1
Recode the text values in the SuccessfullyExecuted column using as.numeric(data):
> #recode the SuccessfullyExecuted column into N = 0 and Y = 1
> numericExecutionFire <- as.numeric(subsetFire$SuccessfullyExecuted) - 1
Recode the text values in the Result column using as.numeric(data):
> #recode the Result column into Defeat = 0 and Victory = 1
> numericResultFire <- as.numeric(subsetFire$Result) - 1
> #save the data in the numeric Method, SuccessfullyExecuted, and Result columns back into the fire attack dataset
> subsetFire$Method <- numericMethodFire
> subsetFire$SuccessfullyExecuted <- numericExecutionFire
> subsetFire$Result <- numericResultFire
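The recoding steps above depend on the columns being stored as factors. A minimal sketch with made-up values (toyResult, toyMethod, and toyText are hypothetical names) shows why: as.numeric() on a factor returns each value's level position, starting at 1, and levels are sorted alphabetically by default, so subtracting 1 produces codes that start at 0. The Ambush level in the sketch is an assumption used only to show how a level that is kept but unused in a subset could lead to Fire being coded as 1.

```r
# Hypothetical values for illustration only
toyResult <- factor(c("Victory", "Defeat", "Victory"))

# Levels are sorted alphabetically: Defeat is level 1, Victory is level 2
levels(toyResult)

# Level positions minus 1 gives Defeat = 0 and Victory = 1
numericToyResult <- as.numeric(toyResult) - 1
numericToyResult

# Assumption: a subset can keep levels that no longer occur in it.
# With a hypothetical extra level "Ambush", Fire sits at position 2,
# so the same subtraction yields Fire = 1
toyMethod <- factor(c("Fire", "Fire"), levels = c("Ambush", "Fire"))
numericToyMethod <- as.numeric(toyMethod) - 1
numericToyMethod

# If a column is stored as character text instead of a factor,
# as.numeric() returns NA with a warning; convert with factor() first
toyText <- c("Victory", "Defeat", "Victory")
numericToyText <- as.numeric(factor(toyText)) - 1
numericToyText
```

Checking a column with class() or levels() before recoding is a quick way to confirm which case applies.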
Having replaced the text values in our Method, SuccessfullyExecuted, and Result columns with numeric data, we can now calculate all of the correlations in the dataset using the cor(data) function:
> #use cor(data) to calculate all of the correlations in the fire attack dataset
> cor(subsetFire)
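Note that the warning message and NA values in our correlation output result from the fact that our Method column contains only a single value. A column that never varies has a standard deviation of zero, so its correlations with the other columns are undefined. This is irrelevant to our analysis and can be ignored.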
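To see where the NA values come from, consider a small, made-up data frame (toyData is hypothetical) in which one column holds a single repeated value. The sketch also shows one optional way to obtain a cleaner matrix by dropping constant columns before calling cor(); this is not a step the text itself takes.

```r
# Hypothetical toy data; Method holds a single repeated value
toyData <- data.frame(
  Method = c(1, 1, 1, 1),
  SuccessfullyExecuted = c(0, 1, 1, 1),
  Result = c(0, 1, 1, 0)
)

# Method's correlations with the other columns come back as NA,
# along with a warning that the standard deviation is zero
# (suppressWarnings() is used here only to keep the output quiet)
toyCor <- suppressWarnings(cor(toyData))
toyCor

# Optional: keep only the columns whose values vary, then correlate
variesAcrossRows <- sapply(toyData, function(column) sd(column) > 0)
cleanCor <- cor(toyData[, variesAcrossRows, drop = FALSE])
cleanCor
```

Either approach leaves the correlations among the remaining columns unchanged, which is why the NA entries can safely be ignored.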
Initially, we calculated summary statistics for our fire attack dataset using the summary(object) function. From this information, we can derive the following useful insights about our past battles:
Next, we recoded the text values in our dataset's Method, SuccessfullyExecuted, and Result columns into numeric form. After adding the data from these variables back into our original dataset, we were able to calculate all of its correlations. This allowed us to learn even more about our past battle data:
The insights gleaned from our summary statistics and correlations put us in a prime position to begin developing our regression model.
a. Calculation functions can be executed on the recoded variable.
b. Calculation functions can be executed on the other variables in the dataset.
c. Calculation functions can be executed on the entire dataset.
d. There is no benefit.