Puzzler 7

Caught Up in Closures

Function values, especially anonymous functions, provide a convenient and concise way to create and pass around "portable" snippets of code. This is enhanced by allowing these snippets to reference values in scope when the function is defined, beyond just the immediate function parameters.

The following code creates "delayed accessors" for a set of values and invokes them later. What is the result of executing this code in the REPL?

  import collection.mutable.Buffer

  val accessors1 = Buffer.empty[() => Int]
  val accessors2 = Buffer.empty[() => Int]

  val data = Seq(100, 110, 120)
  var j = 0
  for (i <- 0 until data.length) {
    accessors1 += (() => data(i))
    accessors2 += (() => data(j))
    j += 1
  }

  accessors1.foreach(a1 => println(a1()))
  accessors2.foreach(a2 => println(a2()))

Possibilities

  1. The first statement prints:
      100
      110
      120
    

    and the second throws an IndexOutOfBoundsException.

  2. Both statements print:
      100
      110
      120
    
  3. Both statements fail to compile with the error message: "not found: value data."
  4. Both statements print:
      120
      120
      120
    

Explanation

Since data, i, and j are no longer in scope when the functions are invoked, you may wonder whether the code compiles at all. Or you may wonder whether the functions all see the last value of data(i) and data(j), and, as a result, both print:

  120
  120
  120

As it happens, the code does compile, and the first statement prints the expected values 100, 110, 120. The second statement never gets going, immediately throwing a runtime exception:

  scala> accessors1.foreach(a1 => println(a1()))
  100
  110
  120
  
  scala> accessors2.foreach(a2 => println(a2()))
  java.lang.IndexOutOfBoundsException: 3
    at scala.collection.LinearSeqOptimized$class.apply(
      LinearSeqOptimized.scala:51)
    at scala.collection.immutable.List.apply(List.scala:85)
    at $anonfun$1$$anonfun$apply$mcVI$sp$2.apply$mcI$sp(
      <console>:16)
    at $anonfun$1.apply(<console>:10)
    ...

The correct answer, therefore, is number 1.

Before examining which differences between i and j result in the observed behavior, it is helpful to look at how Scala enables the function body to access these variables at all.

Scala allows the body of a function to reference variables that are not explicit function parameters, but are in scope at the moment the function is constructed. To access these free variables when the function is invoked in a different scope, Scala "closes over" them to create a closure.

Closing over a free variable does not take a "snapshot" of the variable's value at the point where the function is defined. Instead, a field referencing the captured variable is added to the function object. Crucially for this case, while captured vals are simply represented by the value, capturing a var results in a reference to the var itself.
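The difference between capturing a val and capturing a var can be observed directly. The following is a minimal sketch, not part of the puzzler; the names counter, readCounter, frozen, and readFrozen are hypothetical and chosen only for illustration:

```scala
// Hypothetical names for illustration; not part of the puzzler code.
var counter = 1
val readCounter = () => counter   // closes over the var itself

val frozen = counter              // a val holding the value at this point (1)
val readFrozen = () => frozen     // closes over the val

counter = 42                      // mutate the var after both closures exist

println(readCounter())            // 42: the closure sees the updated var
println(readFrozen())             // 1: the val never changes
```

Both closures were created while counter was 1, yet only the one that captured the val still reports that value.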

As an illustration, consider the following fun method:

  def fun: () => Int = {
    val i = 1
    var j = 2
    () => i + j
  }

The function returned from the fun method is a closure that captures one val, i, and one var, j. You can examine how the compiler treats i and j differently by invoking scala with the -print option, which prints the code with all Scala-specific features removed:

  def fun(): Function0 = {
    val i: Int = 1;
    var j: runtime.IntRef = new runtime.IntRef(2);
    {
      (new anonymous class anonfunfun1(Illustration.this,
          i, j): Function0)
    }
  };

Do you see the difference? The val is stored as a regular Int; the var instead becomes a scala.runtime.IntRef, a reference to a mutable (Java) int.
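To make the effect of this translation concrete, here is a hand-written approximation of it. This is a sketch under stated assumptions: IntBox is a hypothetical stand-in for scala.runtime.IntRef, and desugaredFun returns the box alongside the function only so that the sketch can mutate it afterward, which the compiler-generated code does not do:

```scala
// Hypothetical stand-in for scala.runtime.IntRef: a heap object holding a mutable Int.
final class IntBox(var elem: Int)

// Hand-written approximation of the translated `fun` method.
def desugaredFun: (() => Int, IntBox) = {
  val i = 1                 // the val stays a plain Int and is copied into the closure
  val j = new IntBox(2)     // the var becomes a box shared by the method and the closure
  (() => i + j.elem, j)     // the box is returned too, only so the sketch can mutate it
}

val (sum, box) = desugaredFun
println(sum())              // 3
box.elem = 10               // roughly what an assignment `j = 10` turns into
println(sum())              // 11
```

Because the function object and the enclosing code hold the same box, any later write to the box is visible the next time the function runs.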

From here, the explanation for the observed behavior is straightforward: when each accessors1 function is created, it captures the current value of i, and so prints the expected results when invoked. The accessors2 functions, on the other hand, each capture a reference to a mutable IntRef object containing the value of j, which can change over time.

By the time the first accessors2 function is invoked, the value of j is already 3. Since data only has three elements, invoking data(j) triggers an IndexOutOfBoundsException.
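One way to check this explanation is to inspect j after the loop and to wrap each call in scala.util.Try so the exception is captured rather than thrown. The sketch below repeats the accessors2 part of the puzzler; the results value and the final reassignment of j are additions made purely for illustration:

```scala
import scala.collection.mutable.Buffer
import scala.util.Try

val data = Seq(100, 110, 120)
val accessors2 = Buffer.empty[() => Int]

var j = 0
for (_ <- 0 until data.length) {
  accessors2 += (() => data(j))   // every closure shares the same j
  j += 1
}

println(j)                        // 3: the loop has finished before any accessor runs

// Try captures the exception instead of letting it propagate.
val results = accessors2.map(a2 => Try(a2()))
println(results.forall(_.isFailure))   // true: each call evaluates data(3)

j = 0                             // rewinding the shared var affects all three closures
println(accessors2.map(a2 => a2()))    // all three now read data(0)
```

The last line shows that the three functions are not tied to the values j held during the loop: they all follow whatever j currently contains.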

Discussion

The most robust way to prevent this problem is to avoid vars, which is also better Scala style. If you can't avoid a var, but you still want a closure to capture its value at the time the closure is created, you can "freeze" the var by assigning its value to a temporary val. Here's an example:

  import collection.mutable.Buffer

  val accessors2 = Buffer.empty[() => Int]

  val data = Seq(100, 110, 120)
  var j = 0
  for (i <- 0 until data.length) {
    val currentJ = j
    accessors2 += (() => data(currentJ))
    j += 1
  }

  scala> accessors2.foreach(a2 => println(a2()))
  100
  110
  120
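For comparison, the first recommendation, avoiding vars altogether, might look like the following sketch. It is one possible formulation rather than the only one, and the name accessors is hypothetical:

```scala
val data = Seq(100, 110, 120)

// Mapping over the elements yields one closure per value,
// with no index and no var to capture.
val accessors = data.map(value => () => value)

accessors.foreach(a => println(a()))   // prints 100, 110, 120 on separate lines
```

Each function here closes over its own value parameter, which cannot be reassigned, so there is nothing that can change between creation and invocation.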

Avoid capturing free variables in your closures that refer to anything mutable, whether vars or mutable objects. If you need to close over anything mutable, extract a stable value and assign it to a val, then use that val in your function.
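The same caution applies when the captured variable is a val that refers to a mutable object. The following sketch uses hypothetical names (scores, liveTotal, snapshot, frozenTotal) to contrast closing over a mutable Buffer with closing over an immutable copy of it:

```scala
import scala.collection.mutable.Buffer

val scores = Buffer(1, 2, 3)          // a val, but it refers to a mutable object

val liveTotal = () => scores.sum      // reads the buffer each time it is invoked

val snapshot = scores.toList          // stable value: an immutable copy
val frozenTotal = () => snapshot.sum

scores += 4                           // mutate the buffer after both closures exist

println(liveTotal())                  // 10
println(frozenTotal())                // 6
```

Here no var is involved at all, yet liveTotal still changes its answer, because the object behind the val has changed.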