Chapter 8. Flow Control and Loops

In R, as with other languages, there are many instances where we might want to conditionally execute code, or to repeatedly execute similar code.

The if and switch functions of R should be familiar if you have programmed in other languages, though the fact that they are functions may be new to you. Vectorized conditional execution via the ifelse function is also an R speciality.

We’ll look at all of these in this chapter, as well as the three simplest loops (for, while, and repeat), which again should be reasonably familiar from other languages. Due to the vectorized nature of R, and some more aesthetic alternatives, these loops are less commonly used in R than you may expect.

Chapter Goals

After reading this chapter, you should:

  • Be able to branch the flow of execution
  • Be able to repeatedly execute code with loops

Flow Control

There are many occasions where you don’t just want to execute one statement after another: you need to control the flow of execution. Typically this means that you only want to execute some code if a condition is fulfilled.

if and else

The simplest form of flow control is conditional execution using if. if takes a logical value (more precisely, a logical vector of length one) and executes the next statement only if that value is TRUE:

if(TRUE) message("It was true!")
## It was true!
if(FALSE) message("It wasn't true!")

Missing values aren’t allowed to be passed to if; doing so throws an error:

if(NA) message("Who knows if it was true?")
## Error: missing value where TRUE/FALSE needed

Where you may have a missing value, you should test for it using is.na:

if(is.na(NA)) message("The value is missing!")
## The value is missing!
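
As a minimal sketch of that advice, the following checks for a missing value before making any comparison. The variable names reading and status are hypothetical, chosen only for illustration:

```r
# Hypothetical input: a measurement that may not have been recorded
reading <- NA_real_

# Test for missingness first, so that later comparisons never see NA
if(is.na(reading))
{
  status <- "missing"
} else if(reading > 10)
{
  status <- "high"
} else
{
  status <- "normal"
}
status

# isTRUE is another guard: it returns FALSE for NA rather than failing
isTRUE(reading > 10)
```

Checking is.na first keeps the later branches simple, since they can assume a usable value.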

Of course, most of the time, you won’t be passing the actual values TRUE or FALSE. Instead you’ll be passing a variable or expression—if you knew that the statement was going to be executed in advance, you wouldn’t need the if clause. In this next example, runif(1) generates one uniformly distributed random number between 0 and 1. If that value is more than 0.5, then the message is displayed:

if(runif(1) > 0.5) message("This message appears with a 50% chance.")

If you want to conditionally execute several statements, you can wrap them in curly braces:

x <- 3
if(x > 2)
{
  y <- 2 * x
  z <- 3 * y
}

For clarity of code, some style guides recommend always using curly braces, even if you only want to conditionally execute one statement.

The next step up in complexity from if is to include an else statement. Code that follows an else statement is executed if the if condition was FALSE:

if(FALSE)
{
  message("This won't execute...")
} else
{
  message("but this will.")
}
## but this will.

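One important thing to remember is that the else statement must occur on the same line as the closing curly brace from the if clause. If you move it to the next line at the top level of a script or at the console, R treats the if statement as complete at the closing brace, and you’ll get an error when it reaches the stray else: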

if(FALSE)
{
  message("This won't execute...")
}
else
{
  message("and you'll get an error before you reach this.")
}

Multiple conditions can be defined by combining if and else repeatedly. Notice that if and else remain two separate words—there is an ifelse function but it means something slightly different, as we’ll see in a moment:

(r <- round(rnorm(2), 1))
## [1] -0.1 -0.4
(x <- r[1] / r[2])
## [1] 0.25
if(is.nan(x))
{
  message("x is not a number")
} else if(is.infinite(x))
{
  message("x is infinite")
} else if(x > 0)
{
  message("x is positive")
} else if(x < 0)
{
  message("x is negative")
} else
{
  message("x is zero")
}
## x is positive

R, unlike many languages, has a nifty trick that lets you reorder the code and do conditional assignment. In the next example, Im returns the imaginary component of a complex number (Re returns the real component):

x <- sqrt(-1 + 0i)
(reality <- if(Im(x) == 0) "real" else "imaginary")
## [1] "imaginary"
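
The trick works because if returns the value of whichever branch it evaluates. A small sketch, using the hypothetical variables n, parity, and nothing, shows this, along with what happens when there is no else branch:

```r
n <- 7  # hypothetical value, for illustration only

# The value of the chosen branch is what gets assigned
parity <- if(n %% 2 == 0) "even" else "odd"
parity

# With no else branch, a FALSE condition yields NULL (invisibly)
nothing <- if(n > 100) "big"
is.null(nothing)
```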

Vectorized if

The standard if statement takes a single logical value. If you pass a logical vector with a length of more than one (don’t do this!), the result depends on your version of R. Before R 4.2.0, R warns you that you’ve given multiple options and uses only the first one, as in the output below; from R 4.2.0 onward, it throws an error instead:

if(c(TRUE, FALSE)) message("two choices")
## Warning: the condition has length > 1 and only the first element will be
## used
## two choices
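
If you do have a logical vector and want a single decision from it, one common approach is to collapse it with any or all before passing it to if. This is a sketch with a hypothetical vector named temps:

```r
temps <- c(18.5, 21.0, 30.2)  # hypothetical readings

# any() and all() each reduce a logical vector to a single value
if(any(temps > 30)) message("At least one reading is above 30")
if(all(temps > 0)) message("Every reading is above 0")
```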

Since much of R is vectorized, you may not be surprised to learn that it also has vectorized flow control, in the form of the ifelse function. ifelse takes three arguments. The first is a logical vector of conditions. The second contains values that are returned when the first vector is TRUE. The third contains values that are returned when the first vector is FALSE. In the following example, rbinom generates random numbers from a binomial distribution to simulate a coin flip:

ifelse(rbinom(10, 1, 0.5), "Head", "Tail")
##  [1] "Head" "Head" "Head" "Tail" "Tail" "Head" "Head" "Head" "Tail" "Head"

ifelse can also accept vectors in the second and third arguments. These should be the same size as the first vector (if the vectors aren’t the same size, then elements in the second and third arguments are recycled or ignored to make them the same size as the first):

(yn <- rep.int(c(TRUE, FALSE), 6))
##  [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE
## [12] FALSE
ifelse(yn, 1:3, -1:-12)
##  [1]   1  -2   3  -4   2  -6   1  -8   3 -10   2 -12

If there are missing values in the condition argument, then the corresponding values in the result will be missing:

yn[c(3, 6, 9, 12)] <- NA
ifelse(yn, 1:3, -1:-12)
##  [1]   1  -2  NA  -4   2  NA   1  -8  NA -10   2  NA
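
If you would rather give missing conditions an explicit value than let NA propagate, one option is to test for them first with is.na and nest a second ifelse inside. The labels used here are arbitrary, and the sketch rebuilds yn so that it runs on its own:

```r
yn <- rep.int(c(TRUE, FALSE), 6)
yn[c(3, 6, 9, 12)] <- NA

# The outer ifelse handles the missing conditions;
# the inner one handles TRUE and FALSE as usual
answers <- ifelse(is.na(yn), "unknown", ifelse(yn, "yes", "no"))
answers
```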

Multiple Selection

Code with many else statements can quickly become cumbersome to read. In such circumstances, prettier code can sometimes be achieved with a call to the switch function. The most common usage takes for its first argument an expression that returns a string, followed by several named arguments that provide results when the name matches the first argument. The names must match the first argument exactly (since R 2.11.0), and you can execute multiple expressions by enclosing them in curly braces:

(greek <- switch(
  "gamma",
  alpha = 1,
  beta  = sqrt(4),
  gamma =
  {
    a <- sin(pi / 3)
    4 * a ^ 2
  }
))
## [1] 3

If no names match, then switch (invisibly) returns NULL:

(greek <- switch(
  "delta",
  alpha = 1,
  beta  = sqrt(4),
  gamma =
  {
    a <- sin(pi / 3)
    4 * a ^ 2
  }
))
## NULL

For these circumstances, you can provide an unnamed argument that matches when nothing else does:

(greek <- switch(
  "delta",
  alpha = 1,
  beta  = sqrt(4),
  gamma =
  {
    a <- sin(pi / 3)
    4 * a ^ 2
  },
  4
))
## [1] 4
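
A related feature is that a named argument can be left empty, in which case switch falls through to the next argument that has a value. The sketch below uses a hypothetical helper function, day_type, to show several names sharing one result, with an unnamed default at the end:

```r
# day_type is a hypothetical helper, not part of R
day_type <- function(day)
{
  switch(
    day,
    Saturday = ,          # empty, so falls through to the next value
    Sunday   = "weekend",
    "weekday"             # unnamed default
  )
}
day_type("Saturday")
day_type("Sunday")
day_type("Tuesday")
```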

switch can also take a first argument that returns an integer. In this case the remaining arguments do not need names—the next argument is executed if the first argument resolves to 1, the argument after that is executed if the first argument resolves to 2, and so on:

switch(
  3,
  "first",
  "second",
  "third",
  "fourth"
)
## [1] "third"

As you may have noticed, no default argument is possible in this case. It’s also rather cumbersome if you want to test for large integers, since you’ll need to provide many arguments. Under those circumstances it is best to convert the first argument to a string and use the first syntax:

switch(
  as.character(2147483647),
  "2147483647" = "a big number",
  "another number"
)
## [1] "a big number"
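
If you want to keep the integer form, a default can be emulated by checking for the NULL that switch returns when the integer is out of range. This is only a sketch; pick is a hypothetical wrapper function:

```r
# pick is a hypothetical wrapper, not part of R
pick <- function(i)
{
  result <- switch(i, "first", "second", "third")
  # An out-of-range integer makes switch return NULL
  if(is.null(result)) "no match" else result
}
pick(2)
pick(5)
```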

Loops

There are three kinds of loops in R: repeat, while, and for. Although vectorization means that you don’t need them as much in R as in other languages, they can still come in handy for repeatedly executing code.

repeat Loops

The easiest loop to master in R is repeat. All it does is execute the same code over and over until you tell it to stop. In other languages, it often goes by the name do while, or something similar. The following example[23] will execute until you press Escape, quit R, or the universe ends, whichever happens soonest:

repeat
{
  message("Happy Groundhog Day!")
}

In general, we want our code to complete before the end of the universe, so it is possible to break out of the infinite loop by including a break statement. In the next example, sample returns one action in each iteration of the loop:

repeat
{
  message("Happy Groundhog Day!")
  action <- sample(
    c(
      "Learn French",
      "Make an ice statue",
      "Rob a bank",
      "Win heart of Andie MacDowell"
    ),
    1
  )
  message("action = ", action)
  if(action == "Win heart of Andie MacDowell") break
}
## Happy Groundhog Day!
## action = Rob a bank
## Happy Groundhog Day!
## action = Rob a bank
## Happy Groundhog Day!
## action = Rob a bank
## Happy Groundhog Day!
## action = Win heart of Andie MacDowell

Sometimes, rather than breaking out of the loop we just want to skip the rest of the current iteration and start the next iteration:

repeat
{
  message("Happy Groundhog Day!")
  action <- sample(
    c(
      "Learn French",
      "Make an ice statue",
      "Rob a bank",
      "Win heart of Andie MacDowell"
    ),
    1
  )
  if(action == "Rob a bank")
  {
    message("Quietly skipping to the next iteration")
    next
  }
  message("action = ", action)
  if(action == "Win heart of Andie MacDowell") break
}
## Happy Groundhog Day!
## action = Learn French
## Happy Groundhog Day!
## Quietly skipping to the next iteration
## Happy Groundhog Day!
## Quietly skipping to the next iteration
## Happy Groundhog Day!
## action = Make an ice statue
## Happy Groundhog Day!
## action = Make an ice statue
## Happy Groundhog Day!
## Quietly skipping to the next iteration
## Happy Groundhog Day!
## action = Win heart of Andie MacDowell
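
When the stopping condition depends on chance, it can be prudent to add a counter as a second way out of the loop. The following sketch assumes a hypothetical limit, max_tries, and simulates rolling a die until a six appears:

```r
max_tries <- 10  # hypothetical safety limit
tries <- 0
repeat
{
  tries <- tries + 1
  roll <- sample(6, 1)
  if(roll == 6)
  {
    message("Rolled a six after ", tries, " tries")
    break
  }
  # The counter guarantees that the loop ends eventually
  if(tries >= max_tries)
  {
    message("Giving up after ", tries, " tries")
    break
  }
}
```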

while Loops

while loops are like backward repeat loops. Rather than executing some code and then checking to see if the loop should end, they check first and then (maybe) execute. Since the check happens at the beginning, it is possible that the contents of the loop will never be executed (unlike in a repeat loop). The following example behaves similarly to the repeat example, except that if Andie MacDowell’s heart is won straightaway, then the Groundhog Day loop is completely avoided:

action <- sample(
  c(
    "Learn French",
    "Make an ice statue",
    "Rob a bank",
    "Win heart of Andie MacDowell"
  ),
  1
)
while(action != "Win heart of Andie MacDowell")
{
  message("Happy Groundhog Day!")
  action <- sample(
    c(
      "Learn French",
      "Make an ice statue",
      "Rob a bank",
      "Win heart of Andie MacDowell"
    ),
    1
  )
  message("action = ", action)
}
## Happy Groundhog Day!
## action = Make an ice statue
## Happy Groundhog Day!
## action = Learn French
## Happy Groundhog Day!
## action = Make an ice statue
## Happy Groundhog Day!
## action = Learn French
## Happy Groundhog Day!
## action = Make an ice statue
## Happy Groundhog Day!
## action = Win heart of Andie MacDowell

With some fiddling, it is always possible to convert a repeat loop to a while loop or a while loop to a repeat loop, but usually the syntax is much cleaner one way or the other. If you know that the contents must execute at least once, use repeat; otherwise, use while.
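
To illustrate the conversion, here is a sketch of the same calculation written both ways: counting how many times 100 can be halved before it drops below 1. The variable names are hypothetical. In the repeat version, the while condition is negated and moved into a break at the top of the loop:

```r
# while version: the check happens before each iteration
x <- 100
halvings_while <- 0
while(x >= 1)
{
  x <- x / 2
  halvings_while <- halvings_while + 1
}

# repeat version: the negated check becomes a break
x <- 100
halvings_repeat <- 0
repeat
{
  if(x < 1) break
  x <- x / 2
  halvings_repeat <- halvings_repeat + 1
}

halvings_while == halvings_repeat
```

Here the while form reads more naturally, because the test belongs at the start of each iteration.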

for Loops

The third type of loop is to be used when you know exactly how many times you want the code to repeat. The for loop accepts an iterator variable and a vector. It repeats the loop, giving the iterator each element from the vector in turn. In the simplest case, the vector contains integers:

for(i in 1:5) message("i = ", i)
## i = 1
## i = 2
## i = 3
## i = 4
## i = 5

If you wish to execute multiple expressions, as with other loops they must be surrounded by curly braces:

for(i in 1:5)
{
  j <- i ^ 2
  message("j = ", j)
}
## j = 1
## j = 4
## j = 9
## j = 16
## j = 25

R’s for loops are particularly flexible in that they are not limited to integers, or even numbers in the input. We can pass character vectors, logical vectors, or lists:

for(month in month.name)
{
  message("The month of ", month)
}
## The month of January
## The month of February
## The month of March
## The month of April
## The month of May
## The month of June
## The month of July
## The month of August
## The month of September
## The month of October
## The month of November
## The month of December
for(yn in c(TRUE, FALSE, NA))
{
  message("This statement is ", yn)
}
## This statement is TRUE
## This statement is FALSE
## This statement is NA
l <- list(
  pi,
  LETTERS[1:5],
  charToRaw("not as complicated as it looks"),
  list(
    TRUE
  )
)
for(i in l)
{
  print(i)
}
## [1] 3.142
## [1] "A" "B" "C" "D" "E"
##  [1] 6e 6f 74 20 61 73 20 63 6f 6d 70 6c 69 63 61 74 65 64 20 61 73 20 69
## [24] 74 20 6c 6f 6f 6b 73
## [[1]]
## [1] TRUE
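
When you need the position of each element as well as its value, a common idiom is to loop over the indices produced by seq_along. The sketch below, with hypothetical vectors words and empty, also shows why seq_along is usually safer than 1:length(x) when the vector might be empty:

```r
words <- c("alpha", "beta", "gamma")  # hypothetical data
for(i in seq_along(words))
{
  message(i, ": ", words[i])
}

# With an empty vector, seq_along gives nothing to loop over
empty <- character(0)
for(i in seq_along(empty)) message("This is never shown")

# 1:length(empty) is 1:0, which counts down and has two elements
length(1:length(empty))
```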

Since for loops operate on each element of a vector, they provide a sort of “pretend vectorization.” In fact, the vectorized operations in R will generally use some kind of for loop in internal C code. But be warned: R’s for loops will almost always run much slower than their vectorized equivalents, often by an order of magnitude or two. This means that you should try to use the vectorization capabilities wherever possible.[24]
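
As a small illustration of the two styles, the following sketch squares a vector with a for loop and then with a single vectorized expression. The variable names are hypothetical, and no timings are claimed here, since they vary by machine; you can wrap either version in system.time to measure it yourself:

```r
x <- 1:1000

# Loop version: preallocate the result, then fill it element by element
squares_loop <- numeric(length(x))
for(i in seq_along(x))
{
  squares_loop[i] <- x[i] ^ 2
}

# Vectorized version: one expression operates on the whole vector
squares_vec <- x ^ 2

# Both approaches give the same answer
all.equal(squares_loop, squares_vec)
```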

Summary

  • You can conditionally execute statements using if and else.
  • The ifelse function is a vectorized equivalent of these.
  • R has three kinds of loops: repeat, while, and for.

Test Your Knowledge: Quiz

Question 8-1
What happens if you pass NA as a condition to if?
Question 8-2
What happens if you pass NA as a condition to ifelse?
Question 8-3
What types of variables can be passed as the first argument to the switch function?
Question 8-4
How do you stop a repeat loop executing?
Question 8-5
How do you jump to the next iteration of a loop?

Test Your Knowledge: Exercises

Exercise 8-1

In the game of craps, the player (the “shooter”) throws two six-sided dice. If the total is 2, 3, or 12, then the shooter loses. If the total is 7 or 11, she wins. If the total is any other score, then that score becomes the new target, known as the “point.” Use this utility function to generate a craps score:

two_d6 <- function(n)
{
  random_numbers <- matrix(
    sample(6, 2 * n, replace = TRUE),
    nrow = 2
  )
  colSums(random_numbers)
}

Write code that generates a craps score and assigns the following values to the variables game_status and point:

score               game_status   point

2, 3, 12            FALSE         NA
7, 11               TRUE          NA
4, 5, 6, 8, 9, 10   NA            Same as score

[10]

Exercise 8-2

If the shooter doesn’t immediately win or lose, then she must keep rolling the dice until she scores the point value and wins, or scores a 7 and loses. Write code that checks to see if the game status is NA, and if so, repeatedly generates a craps score until either the point value is scored (set game_status to TRUE) or a 7 is scored (set game_status to FALSE). [15]

Exercise 8-3

This is the text for the famous “sea shells” tongue twister:

sea_shells <- c(
  "She", "sells", "sea", "shells", "by", "the", "seashore",
  "The", "shells", "she", "sells", "are", "surely", "seashells",
  "So", "if", "she", "sells", "shells", "on", "the", "seashore",
  "I'm", "sure", "she", "sells", "seashore", "shells"
)

Use the nchar function to calculate the number of letters in each word. Now loop over possible word lengths, displaying a message about which words have that length. For example, at length six, you should state that the words “shells” and “surely” have six letters. [10]



[23] If these examples make no sense, please watch the movie.

[24] There is widespread agreement that if you write R code that looks like Fortran, you lose the right to complain that R is too slow.
