Function values, especially anonymous functions, provide a convenient and concise way to create and pass around "portable" snippets of code. This is enhanced by allowing these snippets to reference values in scope when the function is defined, beyond just the immediate function parameters.
The following code creates "delayed accessors" for a set of values and invokes them later. What is the result of executing this code in the REPL?
import collection.mutable.Buffer

val accessors1 = Buffer.empty[() => Int]
val accessors2 = Buffer.empty[() => Int]

val data = Seq(100, 110, 120)
var j = 0
for (i <- 0 until data.length) {
  accessors1 += (() => data(i))
  accessors2 += (() => data(j))
  j += 1
}

accessors1.foreach(a1 => println(a1()))
accessors2.foreach(a2 => println(a2()))
1. The first foreach statement prints:
100 110 120
and the second throws an IndexOutOfBoundsException.
2. Prints:
100 110 120
120 120 120
Since data, i, and j are no longer in scope when the functions are invoked, you may wonder whether the code compiles at all. Or you may wonder whether the functions all see the last value of data(i) and data(j), and, as a result, both print:
120 120 120
As it happens, the code does compile, and the first statement prints the expected values 100, 110, 120. The second statement never gets going, immediately throwing a runtime exception:
scala> accessors1.foreach(a1 => println(a1()))
100
110
120

scala> accessors2.foreach(a2 => println(a2()))
java.lang.IndexOutOfBoundsException: 3
  at scala.collection.LinearSeqOptimized$class.apply(LinearSeqOptimized.scala:51)
  at scala.collection.immutable.List.apply(List.scala:85)
  at $anonfun$1$$anonfun$apply$mcVI$sp$2.apply$mcI$sp(<console>:16)
  at $anonfun$1.apply(<console>:10)
  ...
The correct answer, therefore, is number 1.
Before examining which differences between i and j result in the observed behavior, it is helpful to look at how Scala enables the function body to access these variables at all.
Scala allows the body of a function to reference variables that are not explicit function parameters, but are in scope at the moment the function is constructed. To access these free variables when the function is invoked in a different scope, Scala "closes over" them to create a closure.
Closing over a free variable does not take a "snapshot" of the variable's value at the moment the closure is created. Instead, a field referencing the captured variable is added to the function object. Crucially for this case, a captured val is simply represented by its value, whereas capturing a var results in a reference to the var itself.
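The difference between capturing a val and capturing a var can be observed directly. The following is a minimal sketch; the names CaptureDemo, counter, and frozen are hypothetical and not part of the puzzler code.

```scala
object CaptureDemo {
  // Returns (value read through the captured var, value read through the captured val).
  def run(): (Int, Int) = {
    var counter = 1
    val frozen = counter          // copies the current value of counter, which is 1
    val readVar = () => counter   // closes over the var itself
    val readVal = () => frozen    // closes over the val, i.e. the value 1
    counter = 42                  // mutate the var after both closures exist
    (readVar(), readVal())        // the var closure sees 42, the val closure still sees 1
  }
}
```

Calling CaptureDemo.run() yields (42, 1): the closure over the var observes the later assignment, while the closure over the val does not.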
As an illustration, consider the following fun method:
def fun: () => Int = {
  val i = 1
  var j = 2
  () => i + j
}
The function returned from the fun method is a closure that captures one val, i, and one var, j. You can examine how the compiler treats i and j differently by invoking scala with the -print option, which prints the code with all Scala-specific features removed:
def fun(): Function0 = {
  val i: Int = 1;
  var j: runtime.IntRef = new runtime.IntRef(2);
  {
    (new anonymous class anonfun$fun$1(Illustration.this, i, j): Function0)
  }
};
Do you see the difference? The val is stored as a regular Int; the var instead becomes a scala.runtime.IntRef, a reference to a mutable (Java) int.
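To make the compiler's treatment more concrete, here is a hand-written approximation of the same idea. It is only a sketch: IntCell is a hypothetical stand-in for scala.runtime.IntRef, and FunClosure and ManualClosure are illustrative names, not what the compiler actually generates.

```scala
// Hypothetical stand-in for scala.runtime.IntRef: a heap object holding a mutable Int.
final class IntCell(var elem: Int)

// Hand-written analogue of the generated closure class: it stores the val's
// value directly, but only a reference to the cell that holds the var.
final class FunClosure(i: Int, j: IntCell) extends (() => Int) {
  def apply(): Int = i + j.elem
}

object ManualClosure {
  // Returns the closure's result before and after the "var" is updated.
  def run(): (Int, Int) = {
    val i = 1
    val j = new IntCell(2)
    val f = new FunClosure(i, j)
    val before = f()   // 1 + 2
    j.elem = 10        // corresponds to assigning a new value to the var
    val after = f()    // 1 + 10
    (before, after)
  }
}
```

Because the closure holds the cell rather than a copy of its contents, ManualClosure.run() returns (3, 11).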
From here, the explanation for the observed behavior is straightforward: when each accessors1 function is created, it captures the current value of i, and so prints the expected results when invoked. The accessors2 functions, on the other hand, each capture a reference to a mutable IntRef object containing the value of j, which can change over time.
By the time the first accessors2 function is invoked, the value of j is already 3. Since data only has three elements, invoking data(j) triggers an IndexOutOfBoundsException.
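This state can be checked without letting the exception escape. The sketch below rebuilds only the accessors2 part of the puzzler inside a hypothetical DelayedAccess object and wraps the first invocation in scala.util.Try.

```scala
import scala.collection.mutable.Buffer
import scala.util.Try

object DelayedAccess {
  // Returns (final value of j, whether the first accessor failed with an IndexOutOfBoundsException).
  def run(): (Int, Boolean) = {
    val data = Seq(100, 110, 120)
    val accessors = Buffer.empty[() => Int]
    var j = 0
    for (_ <- data.indices) {
      accessors += (() => data(j))   // every closure shares the same j
      j += 1
    }
    // The loop has finished, so j is one past the last valid index.
    val first = accessors.head
    val outcome = Try(first())
    val outOfBounds = outcome.isFailure &&
      outcome.failed.get.isInstanceOf[IndexOutOfBoundsException]
    (j, outOfBounds)
  }
}
```

DelayedAccess.run() returns (3, true): all three closures read the shared j, which already holds 3 when the first one is invoked.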
The most robust way to prevent this problem is to avoid vars, which is also better Scala style. If you can't avoid a var, but you still want a closure to capture its value at the time the closure is created, you can "freeze" the var by assigning its value to a temporary val. Here's an example:
import collection.mutable.Buffer

val accessors2 = Buffer.empty[() => Int]

val data = Seq(100, 110, 120)
var j = 0
for (i <- 0 until data.length) {
  val currentJ = j
  accessors2 += (() => data(currentJ))
  j += 1
}

scala> accessors2.foreach(a2 => println(a2()))
100
110
120
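When the var can be dropped entirely, one possible formulation builds the accessors directly from the indices, so that each closure captures only an immutable value. This is a sketch of that alternative; NoVarAccessors and results are hypothetical names.

```scala
object NoVarAccessors {
  val data = Seq(100, 110, 120)

  // Each closure captures its own immutable index i; no var is involved.
  val accessors: Seq[() => Int] = data.indices.map(i => () => data(i))

  // Invokes every accessor and collects the values it returns.
  def results: Seq[Int] = accessors.map(a => a())
}
```

Here NoVarAccessors.results evaluates to the elements 100, 110, 120, in that order.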
Avoid capturing free variables in your closures that refer to anything mutable, whether vars or mutable objects. If you need to close over anything mutable, extract a stable value and assign it to a val, then use that val in your function.
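The same advice applies to mutable objects held in vals. The following sketch, with the hypothetical names MutableCapture, scores, and stable, contrasts a closure over a mutable Buffer with a closure over an immutable copy taken when the closure is created.

```scala
import scala.collection.mutable.Buffer

object MutableCapture {
  // Returns (sum seen through the mutable buffer, sum seen through the immutable copy).
  def run(): (Int, Int) = {
    val scores = Buffer(1, 2, 3)         // a val, but the object it refers to is mutable
    val liveTotal = () => scores.sum     // closes over the mutable buffer
    val stable = scores.toList           // extract a stable, immutable value now
    val frozenTotal = () => stable.sum   // closes over the immutable copy
    scores += 4                          // later mutation of the buffer
    (liveTotal(), frozenTotal())
  }
}
```

MutableCapture.run() returns (10, 6): the closure over the buffer sees the appended element, while the closure over the extracted list is unaffected.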