Pattern matching

For programmers who are new to Scala, pattern matching is often one of the language features that is the simplest to understand, but it also unlocks new ways to think about writing software. This powerful mechanism enables you to match on disparate types with compile-time safety using an elegant syntax. Given how central this technique is to writing Scala in the functional paradigm, it is important to consider its runtime overhead.

Bytecode representation

Let's consider an example that involves order processing with an algebraic data type representing the possible sides of an order:

  sealed trait Side 
  case object Buy extends Side 
  case object Sell extends Side 
  def handleOrder(s: Side): Boolean = s match { 
    case Buy => true 
    case Sell => false 
  } 

Note

The terminology algebraic data type (ADT) is a more formal way of referring to a sealed trait and its cases. For example, SideBuy, and Sell form an ADT. For our purposes, an ADT defines a closed set of cases. For Side, the enclosed cases are Buy and Sell. The sealed modifier provides closed set semantics because it disallows the extension of Side in separate source files. The closed set semantics implied by an ADT is what allows the compiler to infer whether or not a pattern match statement is exhaustive. If you are interested in studying another example of an ADT, view the order book commands defined in Chapter 2Measuring Performance on the JVM.

As shown in the following bytecode, pattern matching is desugared into a set of if statements:

 public boolean handleOrder(highperfscala.patternmatch.PatternMatching$Side); 
    Code: 
       0: aload_1 
       1: astore_2 
       2: getstatic     #148  // Field highperfscala.patternmatch/PatternMatching$Buy$.MODULE$:Lhighperfscala.patternmatch/PatternMatching$Buy$; 
       5: aload_2 
       6: invokevirtual #152  // Method java/lang/Object.equals:(Ljava/lang/Object;)Z 
       9: ifeq          17 
      12: iconst_1 
      13: istore_3 
      14: goto          29 
      17: getstatic     #157  // Field highperfscala.patternmatch/PatternMatching$Sell$.MODULE$:Lhighperfscala.patternmatch/PatternMatching$Sell$; 
      20: aload_2 
      21: invokevirtual #152  // Method java/lang/Object.equals:(Ljava/lang/Object;)Z 
      24: ifeq          31 
      27: iconst_0 
      28: istore_3 
      29: iload_3 
      30: ireturn 
      31: new           #159  // class scala/MatchError 
      34: dup 
      35: aload_2 
      36: invokespecial #160  // Method scala/MatchError."<init>":(Ljava/lang/Object;)V 
      39: athrow 

Inspecting the bytecode shows how the Scala compiler is able to desugar pattern match expressions into a set of efficient if statements with the ifeq instructions at the 9 and 24 indexes. This an illustrative example of how Scala is able to provide expressive and elegant first-class language features that retain efficient bytecode equivalents.

Performance considerations

Pattern matching against values that contain state (for example, a case class) imposes additional runtime costs that are not immediately clear when looking at the Scala source code. Consider the following extension to the previous example that introduces state:

  sealed trait Order 
  case class BuyOrder(price: Double) extends Order 
  case class SellOrder(price: Double) extends Order 
  def handleOrder(o: Order): Boolean = o match { 
    case BuyOrder(price) if price > 2.0 => true 
    case BuyOrder(_) => false 
    case SellOrder(_) => false 
  } 

Here, the example is more complicated because the instance type must be identified for all three cases with the added complexity of a predicate on the BuyOrder price in the first case. In the following, we look at a snippet of the scalac output with all Scala-specific features removed:

        case10(){ 
          if (x1.$isInstanceOf[highperfscala.patternmatch.PatternMatching$BuyOrder]()) 
            { 
              rc8 = true; 
              x2 = (x1.$asInstanceOf[highperfscala.patternmatch.PatternMatching$BuyOrder](): highperfscala.patternmatch.PatternMatching$BuyOrder); 
              { 
                val price: Double = x2.price(); 
                if (price.>(2.0)) 
                  matchEnd9(true) 
                else 
                  case11() 
              } 
            } 
          else 
            case11() 
        }; 
        case11(){ 
          if (rc8) 
            matchEnd9(false) 
          else 
            case12() 
        }; 

This desugaring illustrates several interesting points about the Scala compiler. Identifying the type of Order utilizes isInstanceOf from java.lang.Object, which maps to the instanceOf bytecode instruction. Casting, by way of asInstanceOf, coerces the Order into either a BuyOrder price or a SellOrder.  The first takeaway is that pattern matching types carrying state incurs the runtime cost of type-checking and casting.

A second insight is that the Scala compiler is able to optimize away the instance checking for the second pattern match by creating a Boolean variable named rc8 to determine whether a BuyOrder was discovered. This neat optimization is simple to handwrite, but it removes the elegance and simplicity of pattern matching. This is another example of how the compiler is able to produce efficient bytecode from expressive, high-level code.

From the preceding examples, it is now clear that pattern matches are compiled to if statements. One performance consideration for critical path code is the ordering of pattern match statements. If your code has five pattern match statements and the fifth pattern is the most frequently accessed, then your code is paying the price of always evaluating four other branches. Let's devise a JMH microbenchmark that estimates the linear access cost of pattern matching. Each benchmark defines ten pattern matches using different values (for example, the value class, the integer literal, the case class, and so on). For each benchmark, the matched index is swept to show the cost of accessing the first, the fifth, and the the tenth pattern match statement. Here is the benchmark definition:

class PatternMatchingBenchmarks { 
 
  @Benchmark 
  def matchIntLiterals(i: PatternMatchState): Int = i.matchIndex match { 
    case 1 => 1 
    case 2 => 2 
    case 3 => 3 
    case 4 => 4 
    case 5 => 5 
    case 6 => 6 
    case 7 => 7 
    case 8 => 8 
    case 9 => 9 
    case 10 => 10 
  } 
 
  @Benchmark 
  def matchIntVariables(ii: PatternMatchState): Int = ii.matchIndex match { 
    case `a` => 1 
    case `b` => 2 
    case `c` => 3 
    case `d` => 4 
    case `e` => 5 
    case `f` => 6 
    case `g` => 7 
    case `h` => 8 
    case `i` => 9 
    case `j` => 10 
  } 
 
  @Benchmark 
  def matchAnyVal(i: PatternMatchState): Int = CheapFoo(i.matchIndex) match { 
    case CheapFoo(1) => 1 
    case CheapFoo(2) => 2 
    case CheapFoo(3) => 3 
    case CheapFoo(4) => 4 
    case CheapFoo(5) => 5 
    case CheapFoo(6) => 6 
    case CheapFoo(7) => 7 
    case CheapFoo(8) => 8 
    case CheapFoo(9) => 9 
    case CheapFoo(10) => 10 
  } 
 
  @Benchmark 
  def matchCaseClass(i: PatternMatchState): Int = 
    ExpensiveFoo(i.matchIndex) match { 
      case ExpensiveFoo(1) => 1 
      case ExpensiveFoo(2) => 2 
      case ExpensiveFoo(3) => 3 
      case ExpensiveFoo(4) => 4 
      case ExpensiveFoo(5) => 5 
      case ExpensiveFoo(6) => 6 
      case ExpensiveFoo(7) => 7 
      case ExpensiveFoo(8) => 8 
      case ExpensiveFoo(9) => 9 
      case ExpensiveFoo(10) => 10 
    } 
} 
 
object PatternMatchingBenchmarks { 
 
  case class CheapFoo(value: Int) extends AnyVal 
  case class ExpensiveFoo(value: Int) 
 
  private val (a, b, c, d, e, f, g, h, i, j) = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) 
 
  @State(Scope.Benchmark) 
  class PatternMatchState { 
    @Param(Array("1", "5", "10")) 
    var matchIndex: Int = 0 
  } 
} 

Performance was evaluated by running 30 trials, each lasting 10 seconds with three warm-up trials, each lasting 5 seconds. Here is the benchmark invocation:

sbt 'project chapter3' 'jmh:run PatternMatchingBenchmarks -foe true'

The results are summarized in the following table:

Benchmark

Index to match

Throughput (ops per second)

Error as percentage of throughput

Throughput change as percentage of base run

matchAnyVal

1

350,568,900.12

±3.02

0

matchAnyVal

5

291,126,287.45

±2.63

-17

matchAnyVal

10

238,326,567.59

±2.95

-32

matchCaseClass

1

356,567,498.69

±3.66

0

matchCaseClass

5

287,597,483.22

±3.50

-19

matchCaseClass

10

234,989,504.60

±2.60

-34

matchIntLiterals

1

304,242,630.15

±2.95

0

matchIntLiterals

5

314,588,776.07

±3.70

3

matchIntLiterals

10

285,227,574.79

±4.33

-6

matchIntVariables

1

332,377,617.36

±3.28

0

matchIntVariables

5

263,835,356.53

±6.53

-21

matchIntVariables

10

170,460,049.63

±4.20

-49

The last column takes the first trial of each benchmark when matching the first index as the base case. For trials matching the fifth and tenth indexes, the relative performance drop is displayed. In every case, except matching the fifth index of literal integers, throughput degrades nearly linearly as deeper indexes are matched. The one trial that defies this pattern is the trial matching literal integers. In this trial, performance improves relative to the first index when accessing the fifth index. Upon inspection of the bytecode, we discover that this scenario produces a jump table instead of a set of if statements. Here is a snippet from the generated bytecode:

       6: tableswitch   { // 1 to 10 
                     1: 113 
                     2: 109 
                     3: 105 
                     4: 101 
                     5: 97 
                     6: 92 
                     7: 87 
                     8: 82 
                     9: 77 
                    10: 72 
               default: 60 
          } 

This bytecode snippet demonstrates that the JVM converts a pattern match on integer literals to a jump table using the tableswitch instruction. This is a constant time operation rather than a linear traversal of if statements. Given that the observed error is several percentage points and the observed differences across the three trials are roughly several percentage points, we can deduce that the linear access cost does not apply to this scenario. Instead, matching literal integers at the Nth index has a constant access cost due to the generated jump table. In contrast, matching an integer variable proves to be nearly twice as expensive at the tenth index. The clear takeaway from this experiment is that, for any pattern match that is generating a series of if statements, there is a linear cost to access the Nth pattern match statement. If you pattern match at least three cases in performance-sensitive code, consider reviewing the code to determine whether the statement order matches the access frequency.

Note

Do you have examples of pattern matching containing only two patterns? In scenarios involving only two pattern match statements that directly match a value, the compiler is able to generate an efficient jump table. When matching primitive literals (for example, string literals or integer literals), the compiler is able to generate jump tables for larger pattern matches. Analogous to the @tailrec annotation, Scala defines a @switch annotation for you to indicate to the compiler that you expect this pattern match statement to be compiled to a jump table. If the compiler is unable to generate a jump table, and instead it produces a series of if statements, then a warning will be issued. Like the @tailrec annotation, the compiler will apply the jump table heuristic whether you provide the @switch annotation or not. In practice, we do not often use this annotation because of its limited applicability, but it is worthwhile to be aware of its existence. The following is an example of an annotated pattern match that compiles to a jump table:

def processShareCount(sc: ShareCount): Boolean = 
 (sc: @switch) match { 
 case ShareCount(1) => true 
 case _ => false 
}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset