Chapter 3. Unleashing Scala Performance

In this chapter, we will look at Scala-specific constructs and language features, and examine how they can help or hurt performance. Equipped with our newly acquired performance measurement knowledge, we will analyze how to make better use of the rich language features that Scala provides. For each feature, we will introduce it, show you how it compiles to bytecode, and then identify caveats and other considerations when using it.

Throughout the chapter, we will show Scala source code alongside the bytecode that the Scala compiler emits for it. Inspecting these artifacts enriches your understanding of how Scala interacts with the JVM and helps you develop an intuition for the runtime performance of your software. We will inspect the bytecode by invoking the javap Java disassembler after compiling the code, as follows:

javap -c <PATH_TO_CLASS_FILE>

The -c switch prints the disassembled code. Another useful option is -private, which prints the bytecode of privately defined methods. For more information on javap, refer to the manual page. The examples that we will cover do not require in-depth JVM bytecode knowledge, but if you wish to learn more about bytecode operations, refer to Oracle's JVM specification at http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-3.html#jvms-3.4.

Periodically, we will also inspect a version of the Scala source code with Scala-specific features removed by running the following command:

scalac -print <PATH>

This is a useful way to see how the Scala compiler desugars convenient syntax into constructs that the JVM can execute. In this chapter, we will explore the following topics:

  • Value classes and tagged types
  • Specialization
  • Tuples
  • Pattern matching
  • Tail recursion
  • The Option data type
  • An alternative to Option

Value classes

In Chapter 2, Measuring Performance on the JVM, we introduced the domain model of the order book application. This domain model included two classes, Price and OrderId. We pointed out that we created domain classes for Price and OrderId to give contextual meaning to the wrapped BigDecimal and Long. While this practice gives us readable code and compile-time safety, it also increases the number of instances that our application creates. Allocating these instances creates more work for the garbage collector by increasing the frequency of collections and by potentially introducing additional long-lived objects. The garbage collector has to work harder to collect them, and this may severely impact our latency.

Luckily, as of Scala 2.10, the AnyVal abstract class is available for developers to define their own value classes to solve this problem. The AnyVal class is defined in the Scala doc (http://www.scala-lang.org/api/current/#scala.AnyVal) as, "the root class of all value types, which describe values not implemented as objects in the underlying host system." The AnyVal class can be used to define a value class, which receives special treatment from the compiler. Value classes are optimized at compile time to avoid the allocation of an instance, and instead they use the wrapped type.

Bytecode representation

As an example, to improve the performance of our order book, we can define Price and OrderId as value classes:

case class Price(value: BigDecimal) extends AnyVal 
case class OrderId(value: Long) extends AnyVal 

To illustrate the special treatment of value classes, we define a dummy method taking a Price value class and an OrderId value class as arguments:

def printInfo(p: Price, oId: OrderId): Unit = 
  println(s"Price: ${p.value}, ID: ${oId.value}") 

From this definition, the compiler produces the following method signature:

public void printInfo(scala.math.BigDecimal, long); 

We see that the generated signature takes a BigDecimal object and a long primitive, even though the Scala code lets us work with the types defined in our model. The type safety is enforced entirely at compile time: at the Scala source level, we still cannot pass a plain BigDecimal or Long when calling printInfo because the compiler will report a type error.
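The following minimal sketch illustrates this compile-time safety. The enclosing ValueClassSafety object and the describe helper are hypothetical names introduced only for this illustration; describe returns the formatted string instead of printing it so that the result is easy to check.

```scala
object ValueClassSafety {
  // Same value classes as in the text, nested in an object so the
  // snippet compiles on its own.
  case class Price(value: BigDecimal) extends AnyVal
  case class OrderId(value: Long) extends AnyVal

  // Hypothetical variant of printInfo that returns the string.
  def describe(p: Price, oId: OrderId): String =
    s"Price: ${p.value}, ID: ${oId.value}"

  // Accepted: both arguments carry the domain types.
  val ok: String = describe(Price(BigDecimal("1.23")), OrderId(42L))

  // Rejected by the compiler with a type mismatch, even though the
  // erased signature takes a BigDecimal and a long:
  // describe(BigDecimal("1.23"), 42L)
}
```

Uncommenting the last call is a quick way to see the compiler error for yourself.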

Note

An interesting thing to notice is that the second parameter of printInfo is not compiled as Long (an object), but as long (a primitive type, note the lowercase 'l'). Long and the other Scala types that correspond to JVM primitives, such as Int, Float, or Short, are handled specially by the compiler and are represented by their primitive type at runtime.

Value classes can also define methods. Let's enrich our Price class, as follows:

case class Price(value: BigDecimal) extends AnyVal { 
  def lowerThan(p: Price): Boolean = this.value < p.value 
} 
 
// Example usage 
val p1 = Price(BigDecimal(1.23)) 
val p2 = Price(BigDecimal(2.03)) 
p1.lowerThan(p2) // returns true 

Our new method allows us to compare two instances of Price. At compile time, the compiler adds a lowerThan$extension method to the Price companion object. This method takes two BigDecimal objects as parameters. When we call lowerThan on an instance of Price, the compiler rewrites the instance method call into a call to this extension method on the companion object, which plays the role of a static method:

public final boolean lowerThan$extension(scala.math.BigDecimal, scala.math.BigDecimal); 
    Code: 
       0: aload_1 
       1: aload_2 
       2: invokevirtual #56  // Method scala/math/BigDecimal.$less:(Lscala/math/BigDecimal;)Z 
       5: ireturn 

If we were to write the pseudo-code equivalent to the preceding Scala code, it would look something like the following:

val p1 = BigDecimal(1.23) 
val p2 = BigDecimal(2.03) 
Price.lowerThan(p1, p2)  // returns true 

Performance considerations

Value classes are a great addition to our developer toolbox. They help us reduce the number of instances and spare the garbage collector some work, while allowing us to rely on meaningful types that reflect our business abstractions.

However, extending AnyVal comes with a set of conditions that the class must fulfill. A value class may only have a primary constructor, which takes exactly one val parameter (Scala 2.10 requires this val to be public; later versions also accept a private val). Furthermore, this parameter cannot itself be a user-defined value class. A value class can define methods via def, but neither val nor var members are allowed inside it. Nested class or object definitions are not allowed either. Finally, a value class cannot extend anything other than a universal trait, that is, a trait that extends Any, only has defs as members, and performs no initialization. If any of these conditions is not fulfilled, the compiler generates an error.

In addition to these constraints, there are special cases in which a value class has to be instantiated at runtime. Such cases include performing a pattern match or runtime type test, or storing a value class in an array. An example of the latter looks like the following snippet:

import scala.util.Random

def newPriceArray(count: Int): Array[Price] = { 
  val a = new Array[Price](count) 
  for(i <- 0 until count){ 
    a(i) = Price(BigDecimal(Random.nextInt())) 
  } 
  a 
} 

The generated bytecode is as follows:

public highperfscala.anyval.ValueClasses$$anonfun$newPriceArray$1(highperfscala.anyval.ValueClasses$Price[]); 
    Code: 
       0: aload_0 
       1: aload_1 
       2: putfield      #29  // Field a$1:[Lhighperfscala/anyval/ValueClasses$Price; 
       5: aload_0 
       6: invokespecial #80  // Method scala/runtime/AbstractFunction1$mcVI$sp."<init>":()V 
       9: return 
 
public void apply$mcVI$sp(int); 
    Code: 
       0: aload_0 
       1: getfield      #29  // Field a$1:[Lhighperfscala/anyval/ValueClasses$Price; 
       4: iload_1 
       5: new           #31  // class highperfscala/anyval/ValueClasses$Price 
       // omitted for brevity 
      21: invokevirtual #55  // Method scala/math/BigDecimal$.apply:(I)Lscala/math/BigDecimal; 
      24: invokespecial #59  // Method highperfscala/anyval/ValueClasses$Price."<init>":(Lscala/math/BigDecimal;)V 
      27: aastore 
      28: return 

Notice how apply$mcVI$sp, the body of the for loop in newPriceArray, creates a new instance of ValueClasses$Price at instruction 5.

As turning a single-field case class into a value class is as trivial as extending the AnyVal class, we recommend that you use AnyVal wherever possible. The effort is low, and it yields significant benefits in terms of garbage collection performance. To learn more about value classes, their limitations, and use cases, you can find detailed descriptions at http://docs.scala-lang.org/overviews/core/value-classes.html.
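The constraints and allocation cases described in this section can be sketched as follows. This is an illustrative example rather than code from the order book application: the ValueClassRules object, the Printable trait, and the render and isPrice helpers are hypothetical names.

```scala
object ValueClassRules {
  // A universal trait: extends Any, declares only defs, no initialization.
  trait Printable extends Any {
    def show: String
  }

  // A value class may extend a universal trait and define methods via def.
  case class Price(value: BigDecimal) extends AnyVal with Printable {
    def show: String = s"Price($value)"
    def lowerThan(p: Price): Boolean = value < p.value
  }

  // Treating a Price as a Printable requires a real Price instance,
  // so the value is boxed at the call site.
  def render(p: Printable): String = p.show

  // A runtime type test also needs an instance to inspect.
  def isPrice(x: Any): Boolean = x match {
    case _: Price => true
    case _        => false
  }
}
```

Calling lowerThan directly on a Price stays allocation-free, whereas passing the same Price to render or isPrice forces the compiler to allocate a wrapper, which javap can confirm.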

Tagged types - an alternative to value classes

Value classes are an easy-to-use tool, and they can yield great improvements in terms of performance. However, they come with a constraining set of conditions, which can make them impossible to use in certain cases. We will conclude this section with a glance at an interesting alternative: the tagged type feature that is implemented by the Scalaz library (https://github.com/scalaz/scalaz).

Note

The Scalaz implementation of tagged types is inspired by another Scala library, named shapeless. The shapeless library provides tools to write type-safe, generic code with minimal boilerplate. While we will not explore shapeless, we encourage you to learn more about the project at https://github.com/milessabin/shapeless.

Tagged types are another way to enforce compile-time checking without incurring the cost of instantiation. They rely on the Tagged structural type and the @@ type alias that are defined in the Scalaz library, as follows:

type Tagged[U] = { type Tag = U } 
type @@[T, U] = T with Tagged[U] 

Let's rewrite part of our code to leverage tagged types with our Price object:

import scalaz.{@@, Tag}

object TaggedTypes { 
 
  sealed trait PriceTag 
  type Price = BigDecimal @@ PriceTag 
 
  object Price { 
    def newPrice(p: BigDecimal): Price = 
      Tag[BigDecimal, PriceTag](p) 
 
    def lowerThan(a: Price, b: Price): Boolean = 
      Tag.unwrap(a) < Tag.unwrap(b) 
  } 
} 

Let's perform a short walkthrough of the code snippet. We define a PriceTag sealed trait that we use to tag our instances, and a Price type alias that is defined as a BigDecimal object tagged with PriceTag. The Price object defines useful methods, including the newPrice factory function, which tags a given BigDecimal object and returns a Price object (that is, a tagged BigDecimal object). We also implement an equivalent to the lowerThan method. This function takes two Price objects (that is, two tagged BigDecimal objects), unwraps them to obtain the two underlying BigDecimal objects, and compares them.
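To experiment with the idea without adding Scalaz as a dependency, the two type aliases shown earlier can be declared by hand. The following is a minimal sketch under that assumption, targeting Scala 2; it is not the Scalaz implementation, and the TaggedSketch object and its tag helper are hypothetical names that stand in for Tag.apply.

```scala
object TaggedSketch {
  // Hand-declared copies of the aliases shown earlier in the text.
  type Tagged[U] = { type Tag = U }
  type @@[T, U] = T with Tagged[U]

  sealed trait PriceTag
  type Price = BigDecimal @@ PriceTag

  // Tagging is only a cast: the static type changes, but the value is
  // still the same BigDecimal instance and no wrapper is allocated.
  def tag(p: BigDecimal): Price = p.asInstanceOf[Price]

  // In this sketch a Price is a subtype of BigDecimal, so the
  // comparison operator is available without unwrapping.
  def lowerThan(a: Price, b: Price): Boolean = a < b
}
```

Because tag only casts, a tagged value can still be used wherever a BigDecimal is expected, while a plain BigDecimal is rejected where a Price is required.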

Using our new Price type, we rewrite the same newPriceArray method that we previously looked at (the code is omitted for brevity, but you can refer to it in the attached source code), and print the following generated bytecode:

public void apply$mcVI$sp(int); 
    Code: 
       0: aload_0 
       1: getfield      #29  // Field a$1:[Ljava/lang/Object; 
       4: iload_1 
       5: getstatic     #35  // Field highperfscala/anyval/TaggedTypes$Price$.MODULE$:Lhighperfscala/anyval/TaggedTypes$Price$; 
       8: getstatic     #40  // Field scala/package$.MODULE$:Lscala/package$; 
      11: invokevirtual #44  // Method scala/package$.BigDecimal:()Lscala/math/BigDecimal$; 
      14: getstatic     #49  // Field scala/util/Random$.MODULE$:Lscala/util/Random$; 
      17: invokevirtual #53  // Method scala/util/Random$.nextInt:()I 
      20: invokevirtual #58  // Method scala/math/BigDecimal$.apply:(I)Lscala/math/BigDecimal; 
      23: invokevirtual #62  // Method highperfscala/anyval/TaggedTypes$Price$.newPrice:(Lscala/math/BigDecimal;)Ljava/lang/Object; 
      26: aastore 
      27: return 

In this version, we no longer see an instantiation of Price, even though we are assigning it to an array. The tagged Price implementation involves a runtime cast, but we anticipate that the cost of this cast will be less than that of the instance allocations (and garbage collection) observed in the previous value class Price strategy. We will look at tagged types again later in this chapter, and use them to replace a well-known tool of the standard library: the Option.
