Chapter 9. Control Statements

Control statements allow us to control the flow of our program and cause different things to happen depending on the results of tests. A test results in a logical, TRUE or FALSE, which is used in if-like statements. The main control statements are if, else, ifelse and switch.

9.1. if and else

The most common test is the if command. It essentially says: If something is TRUE, then perform some action; otherwise, do not perform that action. The thing we are testing goes inside parentheses following the if command. The most basic checks are equal to (==), less than (<), less than or equal to (<=), greater than (>), greater than or equal to (>=) and not equal (!=).

If these tests pass they result in TRUE, and if they fail they result in FALSE. As noted in Section 4.3.4, TRUE is numerically equivalent to 1 and FALSE is equivalent to 0.

> as.numeric(TRUE)

[1] 1

> as.numeric(FALSE)

[1] 0
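
Because of this equivalence, arithmetic functions can summarize the results of tests. The following is a small sketch; the vector passed and its values are made up for illustration.

```r
# a hypothetical vector of test results
passed <- c(TRUE, FALSE, TRUE, TRUE)

# each TRUE counts as 1 and each FALSE as 0
numPassed <- sum(passed)     # number of TRUE values
propPassed <- mean(passed)   # proportion of TRUE values

print(numPassed)
print(propPassed)
```

Here sum counts the TRUE values and mean gives the share of values that are TRUE.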

These tests do not need to be used inside if statements. The following are some simple examples.

> 1 == 1 # TRUE

[1] TRUE

> 1 < 1 # FALSE

[1] FALSE

> 1 <= 1 # TRUE

[1] TRUE

> 1 > 1 # FALSE

[1] FALSE

> 1 >= 1 # TRUE

[1] TRUE

> 1 != 1 # FALSE

[1] FALSE

We can now show that using this test inside an if statement controls actions that follow.

> # set up a variable to hold 1
> toCheck <- 1
>
> # if toCheck is equal to 1, print hello
> if (toCheck == 1)
+ {
+     print("hello")
+ }

[1] "hello"

>
> # now if toCheck is equal to 0, print hello
> if (toCheck == 0)
+ {
+     print("hello")
+ }
> # notice nothing was printed

Notice that if statements are similar to functions in that all statements (there can be one or multiple) go inside curly braces.

Life is not always so simple that we want an action only if some relationship is TRUE. We often want a different action if that relationship is FALSE. In the following example we put an if statement followed by an else statement inside a function, so that it can be used repeatedly.

> # first create the function
> check.bool <- function(x)
+ {
+     if (x == 1)
+     {
+         # if the input is equal to 1, print hello
+         print("hello")
+     } else
+     {
+         # otherwise print goodbye
+         print("goodbye")
+     }
+ }

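Notice that else is on the same line as its preceding closing curly brace (}). This is important because R treats an if statement as complete once it reaches the end of a line on which the statement could end. Inside a function body the enclosing braces let R keep looking for an else, but at the top level, such as the command line, an else that starts its own line causes a syntax error.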
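
One way to see the effect of else placement without crashing a script is to hand the code to parse as text and catch any error. This is only a sketch; the helper parses and the strings goodCode and badCode are names invented for this illustration.

```r
# code with else on the same line as the closing brace
goodCode <- "if (x == 1) {\n    print('hello')\n} else {\n    print('goodbye')\n}"

# the same code with else moved to the start of its own line
badCode <- "if (x == 1) {\n    print('hello')\n}\nelse {\n    print('goodbye')\n}"

# hypothetical helper: TRUE if R can parse the text as top-level code
parses <- function(code)
{
    tryCatch({
        parse(text=code)
        TRUE
    }, error=function(e) FALSE)
}

# else on the same line as } parses at the top level
print(parses(goodCode))

# else on its own line does not parse at the top level
print(parses(badCode))

# wrapped in an outer pair of braces, as in a function body, it parses
print(parses(paste0("{\n", badCode, "\n}")))
```

Keeping else on the same line as the closing brace works in every context, which is why it is the safer habit.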

Now let’s use that function and see if it works.

> check.bool(1)

[1] "hello"

> check.bool(0)

[1] "goodbye"

> check.bool("k")

[1] "goodbye"

> check.bool(TRUE)

[1] "hello"

Anything other than 1 caused the function to print “goodbye.” That is exactly what we wanted. Passing TRUE printed “hello” because TRUE is numerically the same as 1.
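
The results for "k" and TRUE come from the way == converts its two sides to a common type before comparing them. A brief sketch of that conversion follows; the variable names are chosen only for this illustration.

```r
# comparing a character to a number converts the number to character,
# so this is the same as "k" == "1"
charVsNum <- "k" == 1

# for the same reason the character "1" is considered equal to 1
charOne <- "1" == 1

# comparing a logical to a number converts the logical to numeric
logicalVsNum <- TRUE == 1

# identical does no conversion, so it tells TRUE and 1 apart
strictCheck <- identical(TRUE, 1)

print(c(charVsNum, charOne, logicalVsNum, strictCheck))
```

If a function should accept only the number 1, a stricter check such as identical is one option.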

Perhaps we want to successively test a few cases. That is where we can use else if. We first test a single statement, then make another test, and then perhaps fall through to a catch-all. We will modify check.bool to test for one condition and then another.

> check.bool <- function(x)
+ {
+     if (x == 1)
+     {
+         # if the input is equal to 1, print hello
+         print("hello")
+     } else if (x == 0)
+     {
+         # if the input is equal to 0, print goodbye
+         print("goodbye")
+     } else
+     {
+         # otherwise print confused
+         print("confused")
+     }
+ }
>
> check.bool(1)

[1] "hello"

> check.bool(0)

[1] "goodbye"

> check.bool(2)

[1] "confused"

> check.bool("k")

[1] "confused"

9.2. switch

If we have multiple cases to check, writing else if repeatedly can be cumbersome and inefficient. This is where switch is most useful. The first argument is the value we are testing. Subsequent arguments are a particular value and what should be the result. The last argument, if not given a name, is the default result.

To illustrate, we build a function that takes in a value and returns a corresponding result.

> use.switch <- function(x)
+ {
+     switch(x,
+         "a"="first",
+         "b"="second",
+         "z"="last",
+         "c"="third",
+         "other")
+ }
>
> use.switch("a")

[1] "first"

> use.switch("b")

[1] "second"

> use.switch("c")

[1] "third"

> use.switch("d")

[1] "other"

> use.switch("e")

[1] "other"

> use.switch("z")

[1] "last"
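
When several values should give the same result, a named argument can be left without a value, in which case switch falls through to the next argument that has one. The function use.switch2 below is a hypothetical variation written only to sketch this behavior.

```r
# use.switch2 is a hypothetical variation on use.switch
use.switch2 <- function(x)
{
    switch(x,
        "a"=,                     # no value, so fall through to "b"
        "b"="first or second",
        "z"="last",
        "other")                  # unnamed, so this is the default
}

print(use.switch2("a"))
print(use.switch2("b"))
print(use.switch2("z"))
print(use.switch2("q"))
```

Both "a" and "b" share one result here, which avoids repeating the same value on two lines.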

If the first argument is numeric, it is matched positionally to the following arguments, regardless of the names of the subsequent arguments. If the numeric argument is greater than the number of subsequent arguments, NULL is returned.

> use.switch(1)

[1] "first"

> use.switch(2)

[1] "second"

> use.switch(3)

[1] "last"

> use.switch(4)

[1] "third"

> use.switch(5)

[1] "other"

> use.switch(6) # NULL is returned invisibly, so nothing is printed
> is.null(use.switch(6))

[1] TRUE

Here we introduced a new function, is.null, which, as the name implies, tests if an object is NULL.
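
Since a numeric first argument never reaches a default, is.null can supply one after the fact. This is a minimal sketch; safe.switch and its three alternatives are invented for the illustration.

```r
# safe.switch is a hypothetical wrapper that adds a default
# for numeric input
safe.switch <- function(x)
{
    result <- switch(x, "first", "second", "third")

    # switch returned NULL because x was out of range
    if (is.null(result))
    {
        result <- "other"
    }

    result
}

print(safe.switch(2))
print(safe.switch(6))
```

With this wrapper an out-of-range number gives "other" rather than NULL.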

9.3. ifelse

While if is like the if statement in traditional languages, ifelse is more like the if function in Excel. The first argument is the condition to be tested (much like in a traditional if statement), the second argument is the return value if the test is TRUE and the third argument is the return value if the test is FALSE. The beauty here, unlike with the traditional if, is that this works with vectorized arguments. As is often the case in R, using vectorization avoids for loops and speeds up our code. The nuances of ifelse can be tricky, so we show numerous examples.

We start with a very simple example, testing if 1 is equal to 1, and printing “Yes” if that is TRUE and “No” if it is FALSE.

> # see if 1 == 1
> ifelse(1 == 1, "Yes", "No")

[1] "Yes"

> # see if 1 == 0
> ifelse(1 == 0, "Yes", "No")

[1] "No"

This clearly gives us the results we want. ifelse uses all the regular equality tests seen in Section 9.1 and any other logical test. It is worth noting, however, that if testing just a single element (a vector of length 1 or a simple is.na) it is more efficient to use if than ifelse. This can result in a nontrivial speedup of our code.
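
A rough way to check the speed claim on a given machine is to time both forms over many repetitions with system.time. This is only a sketch: the repetition count n is an arbitrary choice and the timings will differ from one run and one computer to the next.

```r
x <- 1
n <- 100000   # arbitrary number of repetitions

# time the plain if on a single element
timeIf <- system.time(
    for (i in seq_len(n))
    {
        if (x == 1) "Yes" else "No"
    }
)

# time ifelse on the same single element
timeIfelse <- system.time(
    for (i in seq_len(n))
    {
        ifelse(x == 1, "Yes", "No")
    }
)

print(timeIf["elapsed"])
print(timeIfelse["elapsed"])
```

Both forms give the same answer for a single element, so the choice between them in that case is purely about speed and readability.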

Next we will illustrate a vectorized first argument.

> toTest <- c(1, 1, 0, 1, 0, 1)
> ifelse(toTest == 1, "Yes", "No")

[1] "Yes" "Yes" "No"  "Yes" "No"  "Yes"

This returned “Yes” for each element of toTest that equaled 1 and “No” for each element of toTest that did not equal 1.

The TRUE and FALSE arguments can even refer to the testing element.

> ifelse(toTest == 1, toTest * 3, toTest)

[1] 3 3 0 3 0 3

> # the FALSE argument is repeated as needed
> ifelse(toTest == 1, toTest * 3, "Zero")

[1] "3"    "3"    "Zero" "3"    "Zero" "3"

Now let’s say that toTest has NA elements. In that case the corresponding result from ifelse is NA.

> toTest[2] <- NA
> ifelse(toTest == 1, "Yes", "No")

[1] "Yes" NA    "No"  "Yes" "No"  "Yes"

This would be the same if the TRUE and FALSE arguments are vectors.

> ifelse(toTest == 1, toTest * 3, toTest)

[1]  3 NA  0  3  0  3

> ifelse(toTest == 1, toTest * 3, "Zero")

[1] "3"    NA     "Zero" "3"    "Zero" "3"
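
If NA should map to a value of its own instead of passing through, one option is to test for it first with is.na and nest a second ifelse for the remaining elements. The label "Missing" below is an arbitrary choice made for this sketch.

```r
toTest <- c(1, NA, 0, 1, 0, 1)

# handle NA first, then apply the original test to everything else
result <- ifelse(is.na(toTest), "Missing",
                 ifelse(toTest == 1, "Yes", "No"))

print(result)
```

The inner ifelse still produces NA for the second element, but the outer one replaces it because is.na is TRUE there.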

9.4. Compound Tests

The statement being tested with if and ifelse can be any expression that results in a logical TRUE or FALSE. This can be an equality check or even the result of is.numeric or is.na. Sometimes we want to test more than one relationship at a time. This is done using logical and and or operators. These are & and && for and, and | and || for or. The differences are subtle but can impact our code’s speed.

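The double form (&& or ||) is best used in if and the single form (& or |) is necessary for ifelse. The double form works on a single element from each side, while the single form compares each element of each side. Versions of R before 4.3.0 let the double form accept longer vectors and silently used only the first element of each; since R 4.3.0 that is an error.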

> a <- c(1, 1, 0, 1)
> b <- c(2, 1, 0, 1)
>
> # this checks each element of a and each element of b
> ifelse(a == 1 & b == 1, "Yes", "No")

[1] "No"  "Yes" "No"  "Yes"

>
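> # the double form works on single elements, so we pick out the first
> # element of a and the first element of b, returning only one result
> # (since R 4.3.0, passing the whole vectors to && is an error)
> ifelse(a[1] == 1 && b[1] == 1, "Yes", "No")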

[1] "No"

Another difference between the double and single forms is how they are processed. When using the single form, both sides of the operator are always checked. With the double form, sometimes only the left side needs to be checked. For instance, if testing 1 == 0 && 2 == 2, the left side fails, so there is no reason to check the right side. Similarly, when testing 3 == 3 || 0 == 0, the left side passes, so there is no need to check the right side. This can be particularly helpful when the right side would throw an error if the left side had failed.
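
The difference in processing can be sketched by putting a call to stop on the right side, so that evaluating it raises an error. The variable names here are chosen only for the illustration.

```r
x <- NULL

# !is.null(x) is FALSE, so x > 0 is never evaluated
safeCheck <- !is.null(x) && x > 0

# the left side fails, so stop is never called
andResult <- FALSE && stop("this is never evaluated")

# the left side passes, so stop is never called
orResult <- TRUE || stop("this is never evaluated")

# the single form evaluates both sides, so stop is called;
# tryCatch turns the error into the string "error"
singleResult <- tryCatch(FALSE & stop("this is evaluated"),
                         error=function(e) "error")

print(c(safeCheck, andResult, orResult))
print(singleResult)
```

The first test shows the practical use: guarding a comparison that only makes sense when the object is not NULL.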

There can be more than just two conditions tested. Many conditions can be strung together using multiple and or or operators. The different clauses can be grouped by parentheses just like mathematical operations. Without parentheses, the order of operations is similar to PEMDAS, seen in Section 4.1, where and is equivalent to multiplication and or is equivalent to addition, so and takes precedence over or.
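
A short sketch shows how the grouping changes the answer; the variable names are invented for this illustration.

```r
# without parentheses, and is evaluated before or
noParens <- TRUE | FALSE & FALSE

# this grouping matches the default order
andFirst <- TRUE | (FALSE & FALSE)

# forcing or to be evaluated first changes the result
orFirst <- (TRUE | FALSE) & FALSE

print(c(noParens, andFirst, orFirst))
```

When a compound test mixes and with or, writing the parentheses explicitly makes the intent clear even where they are not strictly needed.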

9.5. Conclusion

Controlling the flow of our program, both at the command line and in functions, plays an important role when processing and analyzing our data. if statements, along with else, are the most efficient way to test single-element objects, although ifelse is used more often in R programming because of its vectorized nature. switch statements are often forgotten but can come in very handy. The and (& and &&) and or (| and ||) operators allow us to combine multiple tests into one.
