Control statements allow us to control the flow of our program and cause different things to happen depending on the values of tests. Tests result in a logical, TRUE or FALSE, which is used in if-like statements. The main control statements are if, else, ifelse and switch.
The most common test is the if command. It essentially says: if something is TRUE, then perform some action; otherwise, do not perform that action. The thing we are testing goes inside parentheses following the if command. The most basic checks are equal to (==), less than (<), less than or equal to (<=), greater than (>), greater than or equal to (>=) and not equal (!=).
If these tests pass they result in TRUE, and if they fail they result in FALSE. As noted in Section 4.3.4, TRUE is numerically equivalent to 1 and FALSE is equivalent to 0.
> as.numeric(TRUE)
[1] 1
> as.numeric(FALSE)
[1] 0
These tests do not need to be used inside if statements. The following are some simple examples.
> 1 == 1 # TRUE
[1] TRUE
> 1 < 1 # FALSE
[1] FALSE
> 1 <= 1 # TRUE
[1] TRUE
> 1 > 1 # FALSE
[1] FALSE
> 1 >= 1 # TRUE
[1] TRUE
> 1 != 1 # FALSE
[1] FALSE
We can now show that using this test inside an if statement controls actions that follow.
> # set up a variable to hold 1
> toCheck <- 1
>
> # if toCheck is equal to 1, print hello
> if (toCheck == 1)
+ {
+     print("hello")
+ }
[1] "hello"
>
> # now if toCheck is equal to 0, print hello
> if (toCheck == 0)
+ {
+     print("hello")
+ }
> # notice nothing was printed
Notice that if statements are similar to functions in that all statements (there can be one or multiple) go inside curly braces.
Life is not always so simple that we want an action only if some relationship is TRUE. We often want a different action if that relationship is FALSE. In the following example we put an if statement followed by an else statement inside a function, so that it can be used repeatedly.
> # first create the function
> check.bool <- function(x)
+ {
+     if (x == 1)
+     {
+         # if the input is equal to 1, print hello
+         print("hello")
+     } else
+     {
+         # otherwise print goodbye
+         print("goodbye")
+     }
+ }
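Notice that else is on the same line as its preceding closing curly brace (}). This is important: at the top level of the console, R treats the if statement as complete once the brace closes, so an else that starts the next line causes an error.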
Now let’s use that function and see if it works.
> check.bool(1)
[1] "hello"
> check.bool(0)
[1] "goodbye"
> check.bool("k")
[1] "goodbye"
> check.bool(TRUE)
[1] "hello"
Anything other than 1 caused the function to print “goodbye.” That is exactly what we wanted. Passing TRUE printed “hello” because TRUE is numerically the same as 1.
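The last two results follow from how == converts its operands to a common type before comparing them. A brief sketch of that conversion, using only base R:

```r
# a logical is promoted to a number when compared with a number
TRUE == 1     # TRUE
FALSE == 0    # TRUE

# when a character is compared with a number, the number becomes a string,
# so "k" is compared with "1"
"k" == 1      # FALSE

# the same promotion lets us count TRUE values with sum
sum(c(TRUE, FALSE, TRUE))    # 2
```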
Perhaps we want to successively test a few cases. That is where we can use else if. We first test a single statement, then make another test, and then perhaps fall through to a catch-all. We will modify check.bool to test for one condition and then another.
> check.bool <- function(x)
+ {
+     if (x == 1)
+     {
+         # if the input is equal to 1, print hello
+         print("hello")
+     } else if (x == 0)
+     {
+         # if the input is equal to 0, print goodbye
+         print("goodbye")
+     } else
+     {
+         # otherwise print confused
+         print("confused")
+     }
+ }
>
> check.bool(1)
[1] "hello"
> check.bool(0)
[1] "goodbye"
> check.bool(2)
[1] "confused"
> check.bool("k")
[1] "confused"
If we have multiple cases to check, writing else if repeatedly can be cumbersome and inefficient. This is where switch is most useful. The first argument is the value we are testing. Subsequent arguments pair a particular value with the result that should be returned for it. The last argument, if not given a name, is the default result.
To illustrate, we build a function that takes in a value and returns a corresponding result.
> use.switch <- function(x)
+ {
+     switch(x,
+            "a"="first",
+            "b"="second",
+            "z"="last",
+            "c"="third",
+            "other")
+ }
>
> use.switch("a")
[1] "first"
> use.switch("b")
[1] "second"
> use.switch("c")
[1] "third"
> use.switch("d")
[1] "other"
> use.switch("e")
[1] "other"
> use.switch("z")
[1] "last"
If the first argument is numeric, it is matched positionally to the following arguments, regardless of the names of the subsequent arguments. If the numeric argument is greater than the number of subsequent arguments, NULL is returned.
> use.switch(1)
[1] "first"
> use.switch(2)
[1] "second"
> use.switch(3)
[1] "last"
> use.switch(4)
[1] "third"
> use.switch(5)
[1] "other"
> use.switch(6) # nothing is returned
> is.null(use.switch(6))
[1] TRUE
Here we introduced a new function, is.null, which, as the name implies, tests if an object is NULL.
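If a NULL result is undesirable, one option is to check for it with is.null and substitute a fallback. The sketch below is only an illustration: use.switch.safe is a hypothetical name, not a function defined in this chapter or provided by R.

```r
# hypothetical wrapper: falls back to "other" when switch returns NULL
use.switch.safe <- function(x)
{
    result <- switch(x,
                     "a"="first",
                     "b"="second",
                     "z"="last",
                     "c"="third",
                     "other")

    if (is.null(result))
    {
        # a numeric input beyond the number of choices lands here
        "other"
    } else
    {
        result
    }
}

use.switch.safe("a")    # "first"
use.switch.safe(2)      # "second"
use.switch.safe(6)      # "other" rather than NULL
```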
While if is like the if statement in traditional languages, ifelse is more like the if function in Excel. The first argument is the condition to be tested (much like in a traditional if statement), the second argument is the return value if the test is TRUE and the third argument is the return value if the test is FALSE. The beauty here, unlike with the traditional if, is that this works with vectorized arguments. As is often the case in R, using vectorization avoids for loops and speeds up our code. The nuances of ifelse can be tricky, so we show numerous examples.
We start with a very simple example, testing if 1 is equal to 1, and printing “Yes” if that is TRUE and “No” if it is FALSE.
> # see if 1 == 1
> ifelse(1 == 1, "Yes", "No")
[1] "Yes"
> # see if 1 == 0
> ifelse(1 == 0, "Yes", "No")
[1] "No"
This clearly gives us the results we want. ifelse uses all the regular equality tests seen in Section 9.1 and any other logical test. It is worth noting, however, that if testing just a single element (a vector of length 1 or a simple is.na) it is more efficient to use if than ifelse. This can result in a nontrivial speedup of our code.
Next we will illustrate a vectorized first argument.
> toTest <- c(1, 1, 0, 1, 0, 1)
> ifelse(toTest == 1, "Yes", "No")
[1] "Yes" "Yes" "No" "Yes" "No" "Yes"
This returned “Yes” for each element of toTest that equaled 1 and “No” for each element of toTest that did not equal 1.
The TRUE and FALSE arguments can even refer to the testing element.
> ifelse(toTest == 1, toTest * 3, toTest)
[1] 3 3 0 3 0 3
> # the FALSE argument is repeated as needed
> ifelse(toTest == 1, toTest * 3, "Zero")
[1] "3" "3" "Zero" "3" "Zero" "3"
Now let’s say that toTest has NA elements. In that case the corresponding result from ifelse is NA.
> toTest[2] <- NA
> ifelse(toTest == 1, "Yes", "No")
[1] "Yes" NA "No" "Yes" "No" "Yes"
This would be the same if the TRUE and FALSE arguments are vectors.
> ifelse(toTest == 1, toTest * 3, toTest)
[1] 3 NA 0 3 0 3
> ifelse(toTest == 1, toTest * 3, "Zero")
[1] "3" NA "Zero" "3" "Zero" "3"
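If NA should map to a label of its own rather than carry through, one common approach is to test is.na first and nest a second ifelse for the remaining cases. A minimal sketch follows. The label "Missing" and the name labeled are arbitrary choices for illustration, and toTest is redefined so the block runs on its own.

```r
toTest <- c(1, NA, 0, 1, 0, 1)

# check for NA first; only the non-missing elements take their
# value from the inner test
labeled <- ifelse(is.na(toTest), "Missing",
                  ifelse(toTest == 1, "Yes", "No"))
labeled
# "Yes" "Missing" "No" "Yes" "No" "Yes"
```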
The statement being tested with if and ifelse can be any expression that results in a logical TRUE or FALSE. This can be an equality check or even the result of is.numeric or is.na. Sometimes we want to test more than one relationship at a time. This is done using logical and and or operators. These are & and && for and and | and || for or. The differences are subtle but can impact our code’s speed.
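The double form (&& or ||) is best used in if and the single form (& or |) is necessary for ifelse. The double form works on a single element from each side, while the single form compares each element of each side. In versions of R before 4.3.0 the double form used only the first element of a longer vector; from R 4.3.0 onward, giving it a vector longer than one element is an error.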
> a <- c(1, 1, 0, 1)
> b <- c(2, 1, 0, 1)
>
> # this checks each element of a and each element of b
> ifelse(a == 1 & b == 1, "Yes", "No")
[1] "No" "Yes" "No" "Yes"
>
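> # in versions of R before 4.3.0 this checks only the first element of a
> # and the first element of b, returning only one result; from R 4.3.0
> # onward, && with a vector longer than one element is an error
> ifelse(a == 1 && b == 1, "Yes", "No")
[1] "No"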
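When a whole vector needs to be reduced to the single TRUE or FALSE that if expects, the base R functions all and any are one way to do it. This is a sketch of that alternative, not something the example above requires. The name both is made up for illustration, and a and b are redefined so the block runs on its own.

```r
a <- c(1, 1, 0, 1)
b <- c(2, 1, 0, 1)

# element-wise comparison gives one logical per position
both <- a == 1 & b == 1    # FALSE TRUE FALSE TRUE

# all() is TRUE only if every element is TRUE; any() if at least one is
all(both)    # FALSE
any(both)    # TRUE

# the single value from any() is safe to use in if
if (any(both))
{
    print("at least one match")
}
```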
Another difference between the double and single forms is how they are processed. When using the single form, both sides of the operator are always checked. With the double form, sometimes only the left side needs to be checked. For instance, if testing 1 == 0 && 2 == 2, the left side fails, so there is no reason to check the right side. Similarly, when testing 3 == 3 || 0 == 0, the left side passes, so there is no need to check the right side. This can be particularly helpful when the right side would throw an error if the left side had failed.
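To make the short-circuit behavior visible, the sketch below puts a call to stop on the right side, so the right side raises an error whenever it is evaluated. The variable names are made up for illustration.

```r
# with &&, a FALSE left side means the right side is never evaluated,
# so the call to stop() is never reached
shortAnd <- FALSE && stop("right side was evaluated")
shortAnd    # FALSE

# with ||, a TRUE left side likewise skips the right side
shortOr <- TRUE || stop("right side was evaluated")
shortOr     # TRUE

# the single form evaluates both sides, so the error is raised;
# tryCatch captures it so the script keeps running
singleAnd <- tryCatch(FALSE & stop("right side was evaluated"),
                      error=function(e) "error")
singleAnd   # "error"
```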
There can be more than just two conditions tested. Many conditions can be strung together using multiple and or or operators. The different clauses can be grouped by parentheses just like mathematical operations. Without parentheses, the order of operations is similar to PEMDAS, seen in Section 4.1, where and is equivalent to multiplication and or is equivalent to addition, so and takes precedence over or.
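A small sketch of that precedence rule, with and without parentheses. The variable names are made up for illustration.

```r
# and binds more tightly than or, so this is TRUE || (FALSE && FALSE)
noParens <- TRUE || FALSE && FALSE
noParens      # TRUE

# parentheses force the or to be evaluated first
withParens <- (TRUE || FALSE) && FALSE
withParens    # FALSE

# the single forms follow the same precedence, element by element
vec <- c(TRUE, FALSE) | c(FALSE, FALSE) & c(FALSE, TRUE)
vec           # TRUE FALSE
```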
Controlling the flow of our program, both at the command line and in functions, plays an important role when processing and analyzing our data. if statements, along with else, are the most common and efficient choice for testing single-element objects, although ifelse is far more common in R programming because of its vectorized nature. switch statements are often forgotten but can come in very handy. The and (& and &&) and or (| and ||) operators allow us to combine multiple tests into one.