Although reading your data into R allows you to visualize it in the console and use it to make hand-typed calculations (as we did in Chapter 3), you generally need a more organized and flexible method for manipulating your data. R variables are well suited to accomplish this aim. Instead of only reading our resource data into R, let us this time read and store it in a variable named hanzhongResources:
> #read the data from hanzhongResources.csv into a variable named hanzhongResources
> hanzhongResources <- read.csv("hanzhongResources.csv")
> #display the contents of the hanzhongResources variable
> #Shu resources located in Hanzhong, China
> hanzhongResources
You may have noticed that calling your hanzhongResources variable yields the exact same output as reading the original CSV file into R. However, the variable is much more efficient, because we do not have to type the entire read.csv(file) code each time we want to display its data. Instead, we may simply refer to it by name.
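As a minimal sketch of this idea, the example below writes a small made-up CSV file to a temporary folder, stores it in a variable, and then checks that the variable holds the same data as a fresh read.csv call. The file contents, the column names (resource, quantity), and the csvPath and sameData names are hypothetical stand-ins and are not the book's actual hanzhongResources.csv data.

```r
# Hypothetical stand-in data; the real hanzhongResources.csv is not reproduced here
csvPath <- file.path(tempdir(), "hanzhongResources.csv")
writeLines(c("resource,quantity", "rice,1000", "horses,250"), csvPath)

# Read the file once and store the result in a variable
hanzhongResources <- read.csv(csvPath)

# Referring to the variable by name displays the stored data
hanzhongResources

# The variable holds the same data that a fresh read.csv() call returns
sameData <- identical(hanzhongResources, read.csv(csvPath))
sameData
```

Once the data is stored, every later line can use the short variable name rather than repeating the full read.csv call.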
You have created and called your first variable in R. Variables are essential for storing and manipulating data. Each time you create a variable in R, you will follow a similar process to the one we just exercised. The four steps in the variable creation process are as follows:
1. Name the variable. In our preceding example, hanzhongResources was the name of our variable. A name should be the first thing that appears on a new console line when creating an R variable.
2. Add the assignment symbol. After the variable name, the less-than and minus symbols, or <-, should be added. The <- symbol can be thought of as meaning "is set equal to the contents of." These characters have the effect of assigning the information on their right to the variable name on their left. For example, the line > A <- B can be read as "the variable named A is set equal to the contents of B." Therefore, in our previous example, we set the variable named hanzhongResources equal to the contents of the file hanzhongResources.csv.
3. Provide a data source. In our example, the data source was the hanzhongResources.csv file, which we read in with read.csv. A data source should be the last thing that appears on the console line when creating an R variable. Data sources typically take the form of datasets that are read into R, numeric values, or previously created variables.
4. Verify the variable's contents. When executing a line of R code does not yield visible output, as is the case when creating a new variable, it is wise to verify the results of our actions. To display the contents of a variable, type its name in the R console and press the Return key. In our case, entering hanzhongResources will yield a console output containing the Shu army's resources located in Hanzhong, China.
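The four steps can be sketched with each of the three kinds of data source mentioned above. This is only an illustration: the variable names (soldierCount, soldierCountCopy, exampleResources), the numeric value, and the temporary CSV file are all hypothetical and do not come from the book's data files.

```r
# Data source: a numeric value (made-up figure for illustration)
soldierCount <- 5000

# Data source: a previously created variable
soldierCountCopy <- soldierCount

# Data source: a dataset read into R (a made-up temporary CSV file)
csvPath <- file.path(tempdir(), "exampleResources.csv")
writeLines(c("resource,quantity", "rice,1000"), csvPath)
exampleResources <- read.csv(csvPath)

# Verify each result by entering the variable's name
soldierCount
soldierCountCopy
exampleResources
```

In every case the pattern is the same: the name comes first, then <-, then the data source, followed by a separate line that displays the variable to confirm its contents.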
read.csv(file) function as a variable?
a. The variable name is more efficient to type.
b. The variable name is easier to remember.
c. The variable's data is preserved even if the original CSV file is moved or deleted.
d. The variable explicitly states its data source.
> myVariable <- myData
a. The variable myVariable is set equal to the contents of myData.
b. The variable myData is set equal to the contents of myVariable.
c. The variable myVariable is less than negative myData.
d. The variable myVariable is greater than zero and less than negative myData.
You are now familiar with the process behind creating a new data variable in R. The soldiersByCity.csv file contains the total number of soldiers located in each major city within Shu and Wei territory. Copy this file into your R working directory. Then use the four-step process to create a new variable called soldiersByCity and verify its contents. This variable should contain all of the data in the soldiersByCity.csv file.