Tuples

First-class tuple support in Scala simplifies use cases where multiple values need to be grouped together. With tuples, you can elegantly return multiple values using a concise syntax without defining a case class. The following section shows how the compiler translates Scala tuples.
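As a quick illustration of the grouping described above, here is a minimal sketch of a method that returns two values as a tuple. The names TupleReturnExample and minMax are hypothetical and exist only for this example.

```scala
object TupleReturnExample {
  // Hypothetical helper: returns the smallest and largest element as a pair.
  // Assumes a non-empty input; min and max throw on an empty sequence.
  def minMax(xs: Seq[Int]): (Int, Int) = (xs.min, xs.max)

  def main(args: Array[String]): Unit = {
    // Destructure the returned tuple into two named values.
    val (lo, hi) = minMax(Seq(3, 1, 4, 1, 5))
    println(s"min=$lo, max=$hi") // prints "min=1, max=5"
  }
}
```

The caller receives both values from a single call and names them at the use site, with no dedicated result type declared.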

Bytecode representation

To better understand how tuples are supported on the JVM, let's look at the bytecode that is generated when a tuple is created. To develop our intuition, consider creating a tuple with an arity of two, as follows:

def tuple2: (Int, Double) = (1, 2.0) 

The corresponding bytecode for this method is as follows:

  public scala.Tuple2<java.lang.Object, java.lang.Object> tuple2(); 
    Code: 
       0: new           #36  // class scala/Tuple2$mcID$sp 
       3: dup 
       4: iconst_1 
       5: ldc2_w        #37  // double 2.0d 
       8: invokespecial #41  // Method scala/Tuple2$mcID$sp."<init>":(ID)V 
      11: areturn 

This bytecode shows that the compiler desugared the parenthesized tuple syntax into the allocation of a Tuple2 instance; more precisely, of Tuple2$mcID$sp, the variant of Tuple2 that is specialized for an Int and a Double. A tuple class is defined for each supported arity (for example, Tuple5 supports five members) up to Tuple22. The instructions at offsets 4 and 5 also show that the primitive Int and Double values are passed directly to the constructor of this tuple instance, without boxing.
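The desugaring can also be observed from source code. The following is a minimal sketch, with the hypothetical object name TupleDesugarExample, that compares the tuple literal with an explicit Tuple2 construction and prints the runtime class of the literal. With a Scala 2 compiler, the printed class name would be expected to match the class allocated in the bytecode above, but treat that as an assumption to verify against your own compiler version.

```scala
object TupleDesugarExample {
  // Tuple literal syntax, as used in the text.
  val literal: (Int, Double) = (1, 2.0)

  // Explicit construction of the same value through the Tuple2 class.
  val explicit: Tuple2[Int, Double] = Tuple2(1, 2.0)

  def main(args: Array[String]): Unit = {
    // Both forms build equal values.
    println(literal == explicit) // prints "true"

    // Runtime class of the tuple; the exact name depends on whether the
    // compiler selected a specialized variant of Tuple2.
    println(literal.getClass.getName)
  }
}
```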

Performance considerations

In the preceding example, Tuple2 avoids boxing the primitives because it is specialized on both of its type parameters. Scala's expressive tupling syntax makes it convenient to group multiple values together. However, this convenience can lead to excessive memory allocation because tuples with an arity larger than two are not specialized. Here is an example to illustrate this concern:

def tuple3: (Int, Double, Int) = (1, 2.0, 3) 

This definition is analogous to the first tuple definition that we reviewed, except that there is now an arity of three. This definition produces the following bytecode:

  public scala.Tuple3<java.lang.Object, java.lang.Object, java.lang.Object> tuple3(); 
    Code: 
       0: new           #45  // class scala/Tuple3 
       3: dup 
       4: iconst_1 
       5: invokestatic  #24  // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer; 
       8: ldc2_w        #37  // double 2.0d 
      11: invokestatic  #49  // Method scala/runtime/BoxesRunTime.boxToDouble:(D)Ljava/lang/Double; 
      14: iconst_3 
      15: invokestatic  #24  // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer; 
      18: invokespecial #52  // Method scala/Tuple3."<init>":(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)V 
      21: areturn 

In this bytecode, the absence of specialization is evident from the calls that box the integers and the double. If you are working on a performance-sensitive region of your application and find occurrences of tuples with an arity of three or larger, you should consider defining a case class to avoid the boxing overhead. Because the fields of such a case class are declared with concrete primitive types rather than type parameters, they can be stored as primitives instead of as references to boxed objects allocated on the heap.
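A minimal sketch of this workaround for the three-element case might look as follows. The names Tuple3AlternativeExample and Reading, and the field names, are hypothetical. The bytecode for this sketch is not shown here; following the reasoning above, the constructor would be expected to accept primitive arguments, which you can confirm with javap.

```scala
object Tuple3AlternativeExample {
  // Hypothetical replacement for (Int, Double, Int). The fields use concrete
  // primitive types, so no type parameters are involved.
  final case class Reading(id: Int, value: Double, count: Int)

  def reading: Reading = Reading(1, 2.0, 3)

  def main(args: Array[String]): Unit = {
    val r = reading
    // Fields are accessed by name instead of by position (_1, _2, _3).
    println(r.value * r.count) // prints "6.0"
  }
}
```

Besides avoiding the boxing calls, named fields document the meaning of each member, which a positional tuple cannot do.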

Even when using Tuple2, it is still possible that you are incurring the cost of boxing. Consider the following snippet:

case class Bar(value: Int) extends AnyVal 
def tuple2Boxed: (Int, Bar) = (1, Bar(2)) 

Given what we know about the bytecode representation of Tuple2 and value classes, we might expect this method to construct a specialized tuple from two primitive integers. Unfortunately, the resulting bytecode is as follows:

  public scala.Tuple2<java.lang.Object, highperfscala.patternmatch.PatternMatching$Bar> tuple2Boxed(); 
    Code: 
       0: new           #18  // class scala/Tuple2 
       3: dup 
       4: iconst_1 
       5: invokestatic  #24  // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer; 
       8: new           #26  // class highperfscala.patternmatch/PatternMatching$Bar 
      11: dup 
      12: iconst_2 
      13: invokespecial #29  // Method highperfscala.patternmatch/PatternMatching$Bar."<init>":(I)V 
      16: invokespecial #32  // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V 
      19: areturn 

In the preceding bytecode, we see that the integer is boxed and an instance of Bar is instantiated. This example is analogous to the final specialization example that we investigated involving Container2. Looking back at that example, it should be evident that Container2 is a close analog to Tuple2. As before, because of how the compiler implements specialization, it is unable to avoid boxing in this scenario. In performance-sensitive code, the workaround remains defining a case class. The following example shows that defining a case class eliminates the undesired value class instantiation and primitive boxing:

 case class IntBar(i: Int, b: Bar) 
 def intBar: IntBar = IntBar(1, Bar(2)) 

This definition produces the following bytecode:

  public highperfscala.patternmatch.PatternMatching$IntBar intBar(); 
    Code: 
       0: new           #18  // class highperfscala.patternmatch/PatternMatching$IntBar 
       3: dup 
       4: iconst_1 
       5: iconst_2 
       6: invokespecial #21  // Method highperfscala.patternmatch/PatternMatching$IntBar."<init>":(II)V 
       9: areturn 

Note that IntBar is not defined as a value class because it has two parameters. In contrast to the tuple definition, there is neither boxing nor any reference to the Bar value class: the constructor takes two primitive integers. In this scenario, defining a case class is a clear win for performance-sensitive code.
