In this chapter, we will look at Scala-specific constructs and language features, and examine how they can help or hurt performance. Equipped with our newly-acquired performance measurement knowledge, we will analyze how to use the rich language features that are provided by the Scala programming language better. For each feature, we will introduce it, show you how it compiles to bytecode, and then identify caveats and other considerations when using this feature.
Throughout the chapter, we will show Scala source code alongside the bytecode that the Scala compiler emits for it. Inspecting these artifacts enriches your understanding of how Scala interacts with the JVM, so that you can develop an intuition for the runtime performance of your software. We will inspect the bytecode by invoking the javap Java disassembler on the compiled class files, as follows:
javap -c <PATH_TO_CLASS_FILE>
The -c switch prints the disassembled code. Another useful option is -private, which also includes private members in the output, so the bytecode of privately defined methods is printed as well. For more information on javap, refer to the manual page. The examples that we will cover do not require in-depth JVM bytecode knowledge, but if you wish to learn more about bytecode operations, refer to Oracle's JVM specification at http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-3.html#jvms-3.4.
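As a minimal sketch of this workflow, assume a hypothetical source file named Hello.scala (the file and object names are ours, chosen only for illustration). The comments list the commands that one could run to compile the file and disassemble the result; the class file names reflect what scalac typically generates for a top-level object.

```scala
// Hello.scala: a hypothetical file used only to illustrate the workflow.
object Hello {
  // A trivial method whose bytecode is easy to recognize in the javap output.
  def add(a: Int, b: Int): Int = a + b
}

// Compile the file, then disassemble the generated class files:
//   scalac Hello.scala
//   javap -c Hello.class       // typically holds a static forwarder for add
//   javap -c 'Hello$.class'    // the singleton class that holds the body of add
```

The quotes around Hello$.class keep the shell from interpreting the dollar sign.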
Periodically, we will also inspect a version of the Scala source code with Scala-specific features removed by running the following command:
scalac -print <PATH>
This is a useful way to see how the Scala compiler desugars convenient syntax into constructs that the JVM can execute. In this chapter, we will explore the following topics:
Value classes
Tagged types
The Option data type
In Chapter 2, Measuring Performance on the JVM, we introduced the domain model of the order book application. This domain model included two classes, Price and OrderId. We pointed out that we created domain classes for Price and OrderId to give contextual meaning to the wrapped BigDecimal and Long. While this practice gives us readable code and compile-time safety, it also increases the number of instances that our application creates. Allocating these instances creates more work for the garbage collector: collections become more frequent, and some of the additional objects may be long-lived. The garbage collector has to work harder to collect them, and this may severely impact our latency.
Luckily, as of Scala 2.10, the AnyVal abstract class is available for developers to define their own value classes to solve this problem. The AnyVal class is defined in the Scaladoc (http://www.scala-lang.org/api/current/#scala.AnyVal) as "the root class of all value types, which describe values not implemented as objects in the underlying host system." A class that extends AnyVal is a value class, which receives special treatment from the compiler: where possible, it is optimized at compile time to avoid allocating an instance, and the wrapped type is used instead.
As an example, to improve the performance of our order book, we can define Price
and OrderId
as value classes:
case class Price(value: BigDecimal) extends AnyVal
case class OrderId(value: Long) extends AnyVal
To illustrate the special treatment of value classes, we define a dummy method taking a Price
value class and an OrderId
value class as arguments:
def printInfo(p: Price, oId: OrderId): Unit = println(s"Price: ${p.value}, ID: ${oId.value}")
From this definition, the compiler produces the following method signature:
public void printInfo(scala.math.BigDecimal, long);
We see that the generated signature takes a BigDecimal object and a primitive long, even though the Scala code allows us to take advantage of the types defined in our model. Type safety is preserved at the source level: we still cannot pass a bare BigDecimal or Long when calling printInfo, because the compiler rejects the call with a type error.
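The following sketch shows this compile-time protection in a self-contained form. The enclosing OrderBookModel object and the formatInfo method are our own illustrative names (formatInfo is a variant of printInfo that returns the string instead of printing it). The call that passes unwrapped values is left commented out because the compiler rejects it.

```scala
object OrderBookModel {
  final case class Price(value: BigDecimal) extends AnyVal
  final case class OrderId(value: Long) extends AnyVal

  // Hypothetical variant of printInfo that returns the message.
  def formatInfo(p: Price, oId: OrderId): String =
    s"Price: ${p.value}, ID: ${oId.value}"

  // Accepted: both arguments carry their domain types.
  val info: String = formatInfo(Price(BigDecimal("1.23")), OrderId(42L))

  // Rejected at compile time with a type mismatch, although the erased
  // JVM signature would accept exactly these two values:
  // formatInfo(BigDecimal("1.23"), 42L)
}
```

Here, info evaluates to "Price: 1.23, ID: 42".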
An interesting thing to notice is that the second parameter of printInfo is not compiled as Long (an object), but as long (a primitive type; note the lowercase 'l'). Scala's Long and the other types that correspond to JVM primitives, such as Int, Float, or Short, are handled specially by the compiler so that they are represented by their primitive type at runtime whenever possible.
Value classes can also define methods. Let's enrich our Price
class, as follows:
case class Price(value: BigDecimal) extends AnyVal {
  def lowerThan(p: Price): Boolean = this.value < p.value
}

// Example usage
val p1 = Price(BigDecimal(1.23))
val p2 = Price(BigDecimal(2.03))
p1.lowerThan(p2) // returns true
Our new method allows us to compare two instances of Price. At compile time, the compiler adds a lowerThan$extension method to the Price companion object; this method takes two BigDecimal objects as parameters. In reality, when we call lowerThan on an instance of Price, the compiler transforms the instance method call into a call to this method on the companion object, which behaves like a static method call:
public final boolean lowerThan$extension(scala.math.BigDecimal, scala.math.BigDecimal);
  Code:
     0: aload_1
     1: aload_2
     2: invokevirtual #56  // Method scala/math/BigDecimal.$less:(Lscala/math/BigDecimal;)Z
     5: ireturn
If we were to write the pseudo-code equivalent to the preceding Scala code, it would look something like the following:
val p1 = BigDecimal(1.23)
val p2 = BigDecimal(2.03)
Price.lowerThan(p1, p2) // returns true
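A runnable approximation of this transformation is sketched below. PriceOps and lowerThanExtension are hypothetical names that stand in for the compiler-generated companion method; they are not what the compiler actually emits, and the authoritative form remains the bytecode shown earlier.

```scala
object ExtensionMethodSketch {
  final case class Price(value: BigDecimal) extends AnyVal {
    def lowerThan(p: Price): Boolean = this.value < p.value
  }

  // Hand-written stand-in for the generated lowerThan$extension method:
  // it works directly on the wrapped values, so no Price instance is needed.
  object PriceOps {
    def lowerThanExtension(self: BigDecimal, other: BigDecimal): Boolean =
      self < other
  }

  val p1 = Price(BigDecimal("1.23"))
  val p2 = Price(BigDecimal("2.03"))

  // What we write in the Scala source.
  val viaValueClass: Boolean = p1.lowerThan(p2)
  // Roughly what the call becomes after compilation.
  val viaStandIn: Boolean = PriceOps.lowerThanExtension(p1.value, p2.value)
}
```

Both viaValueClass and viaStandIn evaluate to true; the difference lies only in whether a Price wrapper is conceptually involved in the call.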
Value classes are a great addition to our developer toolbox. They help us reduce the number of instances and spare the garbage collector some work, while allowing us to rely on meaningful types that reflect our business abstractions. However, extending AnyVal comes with a set of conditions that the class must fulfill. For example, a value class may only have a primary constructor that takes exactly one public val parameter (since Scala 2.11, this parameter may also be non-public). Furthermore, the type of this parameter cannot be another user-defined value class. We saw that value classes can define methods via def, but neither val nor var members are allowed inside a value class. Nested class or object definitions are not allowed either. Another limitation prevents value classes from extending anything other than a universal trait, that is, a trait that extends Any, only has defs as members, and performs no initialization. If any of these conditions is not fulfilled, the compiler generates an error. In addition to these constraints, there are special cases in which a value class has to be instantiated by the JVM. Such cases include pattern matching or a runtime type test on a value class, or assigning a value class to an array. An example of the latter looks like the following snippet:
import scala.util.Random

def newPriceArray(count: Int): Array[Price] = {
  val a = new Array[Price](count)
  for (i <- 0 until count) {
    a(i) = Price(BigDecimal(Random.nextInt()))
  }
  a
}
The generated bytecode is as follows:
public highperfscala.anyval.ValueClasses$$anonfun$newPriceArray$1(highperfscala.anyval.ValueClasses$Price[]);
  Code:
     0: aload_0
     1: aload_1
     2: putfield      #29  // Field a$1:[Lhighperfscala/anyval/ValueClasses$Price;
     5: aload_0
     6: invokespecial #80  // Method scala/runtime/AbstractFunction1$mcVI$sp."<init>":()V
     9: return

public void apply$mcVI$sp(int);
  Code:
     0: aload_0
     1: getfield      #29  // Field a$1:[Lhighperfscala/anyval/ValueClasses$Price;
     4: iload_1
     5: new           #31  // class highperfscala/anyval/ValueClasses$Price
     // omitted for brevity
    21: invokevirtual #55  // Method scala/math/BigDecimal$.apply:(I)Lscala/math/BigDecimal;
    24: invokespecial #59  // Method highperfscala/anyval/ValueClasses$Price."<init>":(Lscala/math/BigDecimal;)V
    27: aastore
    28: return
Notice how the body of the for loop is compiled into an anonymous function whose apply$mcVI$sp method is invoked from newPriceArray, and how this method creates a new instance of ValueClasses$Price at instruction 5.
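To make the constraints and the other instantiation cases mentioned earlier more concrete, here is a small sketch. The ValueClassRules object, the Describable trait, and the isPrice method are hypothetical names introduced only for this illustration.

```scala
object ValueClassRules {
  // A universal trait: it extends Any, declares only defs,
  // and runs no initialization code.
  trait Describable extends Any {
    def describe: String
  }

  final case class Price(value: BigDecimal) extends AnyVal with Describable {
    def describe: String = s"Price of $value"
    // val doubled = value * 2   // rejected: vals are not allowed in a value class
  }

  // A runtime type test: the argument is typed as Any, so a Price passed
  // here has to exist as a real Price instance for the match to succeed.
  def isPrice(x: Any): Boolean = x match {
    case _: Price => true
    case _        => false
  }
}
```

Calling isPrice with a Price returns true, while calling it with a plain BigDecimal returns false, which shows that the wrapper exists as a distinct object in this situation.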
As turning a single-field case class into a value class is as trivial as extending AnyVal, we recommend that you use AnyVal wherever possible. The overhead is quite low, and it brings significant benefits in terms of garbage collection performance. To learn more about value classes, their limitations, and use cases, you can find detailed descriptions at http://docs.scala-lang.org/overviews/core/value-classes.html.
Value classes are an easy-to-use tool, and they can yield great performance improvements. However, they come with a constraining set of conditions, which can make them impossible to use in certain cases. We will conclude this section with a glance at an interesting alternative: the tagged type feature that is implemented by the Scalaz library (https://github.com/scalaz/scalaz).
The Scalaz
implementation of tagged types is inspired by another Scala library, named shapeless
. The shapeless
library provides tools to write type-safe, generic code with minimal boilerplate. While we will not explore shapeless
, we encourage you to learn more about the project at https://github.com/milessabin/shapeless.
Tagged types are another way to enforce compile-time type checking without incurring the cost of instantiation. They rely on the Tagged structural type and the @@ type alias that are defined in the Scalaz library, as follows:
type Tagged[U] = { type Tag = U }
type @@[T, U] = T with Tagged[U]
Let's rewrite part of our code to leverage tagged types with our Price
object:
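import scalaz.{@@, Tag}

object TaggedTypes {
  sealed trait PriceTag
  type Price = BigDecimal @@ PriceTag

  object Price {
    // Tags a plain BigDecimal as a Price.
    def newPrice(p: BigDecimal): Price =
      Tag[BigDecimal, PriceTag](p)

    // Unwraps both tagged values and compares the underlying BigDecimals.
    def lowerThan(a: Price, b: Price): Boolean =
      Tag.unwrap(a) < Tag.unwrap(b)
  }
}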
Let's perform a short walkthrough of the code snippet. We define a PriceTag sealed trait that we use to tag our instances. A Price type alias is then defined as a BigDecimal object tagged with PriceTag. The Price object defines useful methods, including the newPrice factory function, which tags a given BigDecimal object and returns a Price object (that is, a tagged BigDecimal object). We also implement an equivalent to the lowerThan method. This function takes two Price objects (that is, two tagged BigDecimal objects), unwraps them to obtain the two underlying BigDecimal objects, and compares them.
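For readers who want to experiment without adding a library to the classpath, the idea can be approximated with a hand-rolled sketch. This is not the Scalaz implementation: the TaggedSketch object and the tag helper are hypothetical, and the two type definitions simply restate the shapes quoted earlier.

```scala
object TaggedSketch {
  // Local restatement of the two type definitions, so that the example
  // compiles on its own.
  type Tagged[U] = { type Tag = U }
  type @@[T, U] = T with Tagged[U]

  // Hypothetical helper: tagging is only a cast, so no wrapper is allocated.
  def tag[T, U](value: T): T @@ U = value.asInstanceOf[T @@ U]

  sealed trait PriceTag
  type Price = BigDecimal @@ PriceTag

  object Price {
    def newPrice(p: BigDecimal): Price = tag[BigDecimal, PriceTag](p)

    // In this sketch a tagged BigDecimal is still a BigDecimal,
    // so the two values can be compared directly.
    def lowerThan(a: Price, b: Price): Boolean = a < b
  }

  // Rejected at compile time: an untagged BigDecimal is not a Price.
  // Price.lowerThan(BigDecimal(1), BigDecimal(2))
}
```

In this sketch, the tag exists only in the type system; at runtime, a Price is the very same BigDecimal instance that was passed to newPrice.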
Using our new Price
type, we rewrite the same newPriceArray
method that we previously looked at (the code is omitted for brevity, but you can refer to it in the attached source code), and print the following generated bytecode:
public void apply$mcVI$sp(int);
  Code:
     0: aload_0
     1: getfield      #29  // Field a$1:[Ljava/lang/Object;
     4: iload_1
     5: getstatic     #35  // Field highperfscala/anyval/TaggedTypes$Price$.MODULE$:Lhighperfscala/anyval/TaggedTypes$Price$;
     8: getstatic     #40  // Field scala/package$.MODULE$:Lscala/package$;
    11: invokevirtual #44  // Method scala/package$.BigDecimal:()Lscala/math/BigDecimal$;
    14: getstatic     #49  // Field scala/util/Random$.MODULE$:Lscala/util/Random$;
    17: invokevirtual #53  // Method scala/util/Random$.nextInt:()I
    20: invokevirtual #58  // Method scala/math/BigDecimal$.apply:(I)Lscala/math/BigDecimal;
    23: invokevirtual #62  // Method highperfscala/anyval/TaggedTypes$Price$.newPrice:(Lscala/math/BigDecimal;)Ljava/lang/Object;
    26: aastore
    27: return
In this version, we no longer see an instantiation of Price, even though we are assigning it to an array. The tagged Price implementation involves a runtime cast, but we anticipate that the cost of this cast will be lower than that of the instance allocations (and garbage collection) observed with the previous value class Price strategy. We will look at tagged types again later in this chapter, and use them to replace a well-known tool of the standard library: Option.