Before we dive into Scala’s support for object-oriented and functional programming, let’s finish our discussion of the essential features you’ll use in most of your programs.
An important fundamental concept in Scala is that all operators are actually methods. Consider this most basic of examples:
// code-examples/Rounding/one-plus-two-script.scala
1 + 2
That plus sign between the numbers? It’s a method. First, Scala allows non-alphanumeric method names. You can call methods +, -, $, or whatever you desire. Second, this expression is identical to 1 .+(2). (We put a space after the 1 because 1. would be interpreted as a Double.) When a method takes one argument, Scala lets you drop both the period and the parentheses, so the method invocation looks like an operator invocation. This is called “infix” notation, where the operator is between the instance and the argument. We’ll find out more about this shortly.
Similarly, a method with no arguments can be invoked without the period. This is called “postfix” notation.
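As a minimal sketch of infix notation, the following defines a hypothetical Vec class (not part of the chapter's examples) with a method named +, then calls it both as an operator and as an ordinary method. The names Vec, infixSum, explicitSum, and intSum are assumptions made for illustration only.

```scala
// Hypothetical type, used only to illustrate operator-named methods.
case class Vec(x: Int, y: Int) {
  // A method whose name is a single operator character.
  def +(other: Vec): Vec = Vec(x + other.x, y + other.y)
}

val infixSum = Vec(1, 2) + Vec(3, 4)      // infix notation: no dot, no parentheses
val explicitSum = Vec(1, 2).+(Vec(3, 4))  // the same call written as a method invocation
val intSum = (1).+(2)                     // + on Int is a method too
```

Both spellings invoke the same method, so infixSum and explicitSum hold equal values.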
Ruby and Smalltalk programmers should now feel right at home. As users of those languages know, these simple rules have far-reaching benefits when it comes to creating programs that flow naturally and elegantly.
So, what characters can you use in identifiers? Here is a summary of the rules for identifiers, used for method and type names, variables, etc. For the precise details, see [ScalaSpec2009]. Scala allows all the printable ASCII characters, such as letters, digits, the underscore (_), and the dollar sign ($), with the exceptions of the “parenthetical” characters, namely (, ), [, ], {, and }, and the “delimiter” characters, namely `, ’, ', ", ., ;, and ,. Scala allows the other characters between \u0020–\u007F that are not in the sets just shown, such as mathematical symbols and “other” symbols. These remaining characters are called operator characters, and they include characters such as /, <, etc.
As in most languages, you can’t reuse reserved words for identifiers. We listed the reserved words in Reserved Words. Recall that some of them are combinations of operator and punctuation characters. For example, a single underscore (_) is a reserved word!
$, _, and operators
Like Java and many languages, a plain identifier can begin with a letter or underscore, followed by more letters, digits, underscores, and dollar signs. Unicode-equivalent characters are also allowed. However, like Java, Scala reserves the dollar sign for internal use, so you shouldn’t use it in your own identifiers. After an underscore, you can have either letters and digits or a sequence of operator characters. The underscore is important. It tells the compiler to treat all the characters up to the next whitespace as part of the identifier. For example, val xyz_++= = 1 assigns the variable xyz_++= the value 1, while the expression val xyz++= = 1 won’t compile because the “identifier” could also be interpreted as xyz ++=, which looks like an attempt to append something to xyz. Similarly, if you have operator characters after the underscore, you can’t mix them with letters and digits. This restriction prevents ambiguous expressions like this: abc_=123. Is that an identifier abc_=123 or an assignment of the value 123 to abc_?
If an identifier begins with an operator character, the rest of the characters must be operator characters.
An identifier can also be an arbitrary string (subject to platform limitations) between two back quote characters, e.g., val `this is a valid identifier` = "Hello World!". Recall that this syntax is also the way to invoke a method on a Java or .NET class when the method’s name is identical to a Scala reserved word, e.g., java.net.Proxy.`type`().
In pattern matching expressions, tokens that begin with a lowercase letter are parsed as variable identifiers, while tokens that begin with an uppercase letter are parsed as constant identifiers. This restriction prevents some ambiguities because of the very succinct variable syntax that is used, e.g., no val keyword is present.
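The identifier rules above can be sketched in a few lines. The first two definitions come straight from the examples in the text; Limit and classify are hypothetical names introduced here to show how the case of an identifier's first letter changes its meaning inside a pattern.

```scala
// An underscore lets operator characters follow letters within one identifier.
val xyz_++= = 1

// Back quotes turn an arbitrary string into an identifier.
val `this is a valid identifier` = "Hello World!"

// Hypothetical constant; its uppercase initial matters in the pattern below.
val Limit = 10

def classify(n: Int): String = n match {
  case Limit => "at the limit"              // uppercase: compared with the value of Limit
  case other => "something else: " + other  // lowercase: binds a new variable
}
```

Had the constant been named limit, the first case would have bound a fresh variable and matched every input.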
Once you know that all operators are methods, it’s easier to reason about unfamiliar Scala code. You don’t have to worry about special cases when you see new operators. When working with Actors in A Taste of Concurrency, you may have noticed that we used an exclamation point (!) to send a message to an Actor. Now you know that the ! is just another method, as are the other handy shortcut operators you can use to talk to Actors. Similarly, Scala’s XML library provides the \ and \\ operators to dive into document structures. These are just methods on the scala.xml.NodeSeq class.
This flexible method naming gives you the power to write libraries that feel like a natural extension of Scala itself. You could write a new math library with numeric types that accept all the usual mathematical operators, like addition and subtraction. You could write a new concurrent messaging layer that behaves just like Actors. The possibilities are constrained only by Scala’s method naming limitations.
Just because you can doesn’t mean you should. When designing your own libraries and APIs in Scala, keep in mind that obscure punctuational operators are hard for programmers to remember. Overuse of these can contribute a “line noise” quality of unreadability to your code. Stick to conventions and err on the side of spelling method names out when a shortcut doesn’t come readily to mind.
To facilitate a variety of readable programming styles, Scala is flexible about the use of parentheses in methods. If a method takes no parameters, you can define it without parentheses. Callers must invoke the method without parentheses. If you add empty parentheses, then callers may optionally add parentheses. For example, the size method for List has no parentheses, so you write List(1, 2, 3).size. If you try List(1, 2, 3).size(), you’ll get an error. However, the length method for java.lang.String does have parentheses in its definition, but Scala lets you write both "hello".length() and "hello".length.
The convention in the Scala community is to omit parentheses when calling a method that has no side effects. So, asking for the size of a sequence is fine without parentheses, but defining a method that transforms the elements in the sequence should be written with parentheses. This convention signals a potentially tricky method for users of your code.
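A small sketch of this convention follows, using a hypothetical Counter class that is not part of the chapter's examples. Each method is called exactly the way it was defined, so the sketch does not depend on how lenient a particular compiler version is about optional parentheses.

```scala
// Hypothetical class, for illustration only.
class Counter {
  private var count = 0

  // No side effects: defined without parentheses, so callers write counter.current.
  def current: Int = count

  // Mutates state: defined with empty parentheses as a signal to callers.
  def increment(): Unit = { count += 1 }
}

val counter = new Counter
val before = counter.current
val after = { counter.increment(); counter.current }
```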
It’s also possible to omit the dot (period) when calling a parameterless method or one that takes only one argument. With this in mind, our List(1, 2, 3).size example could be written as:
// code-examples/Rounding/no-dot-script.scala
List(1, 2, 3) size
Neat, but confusing. When does this syntactical flexibility become useful? When chaining method calls together into expressive, self-explanatory “sentences” of code:
// code-examples/Rounding/no-dot-better-script.scala
def isEven(n: Int) = (n % 2) == 0

List(1, 2, 3, 4) filter isEven foreach println
As you might guess, running this produces the following output:
2
4
Scala’s liberal approach to parentheses and dots on methods provides one building block for writing Domain-Specific Languages. We’ll learn more about them after a brief discussion of operator precedence.
So, if an expression like 2.0 * 4.0 / 3.0 * 5.0 is actually a series of method calls on Doubles, what are the operator precedence rules? Here they are in order from lowest to highest precedence (see [ScalaSpec2009]):
All letters
|
^
&
< >
= !
:
+ -
* / %
All other special characters
Characters on the same line have the same precedence. An exception is = when used for assignment, when it has the lowest precedence.
Since * and / have the same precedence, the two lines in the following scala session behave the same:
scala> 2.0 * 4.0 / 3.0 * 5.0
res2: Double = 13.333333333333332

scala> (((2.0 * 4.0) / 3.0) * 5.0)
res3: Double = 13.333333333333332
In a sequence of left-associative method invocations, they simply bind in left-to-right order. “Left-associative” you say? In Scala, any method with a name that ends with a colon : actually binds to the right, while all other methods bind to the left. For example, you can prepend an element to a List using the :: method (called “cons,” short for “constructor”):
scala> val list = List('b', 'c', 'd')
list: List[Char] = List(b, c, d)

scala> 'a' :: list
res4: List[Char] = List(a, b, c, d)
The second expression is equivalent to list.::('a'). In a sequence of right-associative method invocations, they bind from right to left. What about a mixture of left-binding and right-binding expressions?
scala> 'a' :: list ++ List('e', 'f')
res5: List[Char] = List(a, b, c, d, e, f)
(The ++ method appends two lists.) In this case, ++ has higher precedence than ::, so List('e', 'f') is appended to list first, and then 'a' is prepended to create the final list. It’s usually better to add parentheses to remove any potential uncertainty.
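To make the binding rules concrete, here is a minimal sketch using a hypothetical Trail class (an assumption for illustration, not a library type) whose +: method ends in a colon. It also repeats the mixed :: and ++ case with explicit parentheses.

```scala
// Hypothetical type, for illustration only.
case class Trail(stops: List[String]) {
  // The name ends in a colon, so in infix position the method is invoked on the right operand.
  def +:(stop: String): Trail = Trail(stop :: stops)
}

val trail = "a" +: "b" +: Trail(Nil)            // binds right to left
val explicitTrail = Trail(Nil).+:("b").+:("a")  // the same calls written out

// ++ (starts with +) binds more tightly than :: (starts with :).
val mixed = 'a' :: List('b', 'c') ++ List('d')
val parenthesized = 'a' :: (List('b', 'c') ++ List('d'))
```

The parenthesized form says the same thing as the bare one, but nobody has to recall the precedence table to read it.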
Finally, note that when you use the scala
command, either interactively or with scripts, it may appear that you
can define “global” variables and methods outside of types. This is
actually an illusion; the interpreter wraps all definitions in an
anonymous type before generating JVM or .NET CLR byte code.
Domain-Specific Languages, or DSLs, provide a convenient syntactical means for expressing goals in a given problem domain. For example, SQL provides just enough of a programming language to handle the problems of working with databases, making it a Domain-Specific Language.
While some DSLs like SQL are self-contained, it’s become popular to implement DSLs as subsets of full-fledged programming languages. This allows programmers to leverage the entirety of the host language for edge cases that the DSL does not cover, and saves the work of writing lexers, parsers, and the other building blocks of a language.
Scala’s rich, flexible syntax makes writing DSLs a breeze. Consider this example of a style of test writing called Behavior-Driven Development (see [BDD]) using the Specs library (see Specs):
// code-examples/Rounding/specs-script.scala
"nerd finder" should {
  "identify nerds from a List" in {
    val actors = List("Rick Moranis", "James Dean", "Woody Allen")
    val finder = new NerdFinder(actors)
    finder.findNerds mustEqual List("Rick Moranis", "Woody Allen")
  }
}
Notice how much this code reads like English: “This should test that in the following scenario,” “This value must equal that value,” and so forth. This example uses the superb Specs library, which effectively provides a DSL for the Behavior-Driven Development testing and engineering methodology. By making maximum use of Scala’s liberal syntax and rich methods, Specs test suites are readable even by non-developers.
This is just a taste of the power of DSLs in Scala. We’ll see other examples later and learn how to write our own as we get more advanced (see Chapter 11).
Even the most familiar language features are supercharged in Scala. Let’s have a look at the lowly if statement. As in most every language, Scala’s if evaluates a conditional expression, then proceeds to a block if the result is true, or branches to an alternate block if the result is false. A simple example:
// code-examples/Rounding/if-script.scala
if (2 + 2 == 5) {
  println("Hello from 1984.")
} else if (2 + 2 == 3) {
  println("Hello from Remedial Math class?")
} else {
  println("Hello from a non-Orwellian future.")
}
What’s different in Scala is that if and almost all other statements are actually expressions themselves. So, we can assign the result of an if expression, as shown here:
// code-examples/Rounding/assigned-if-script.scala
val configFile = new java.io.File("~/.myapprc")

val configFilePath = if (configFile.exists()) {
  configFile.getAbsolutePath()
} else {
  configFile.createNewFile()
  configFile.getAbsolutePath()
}
Note that if statements are expressions, meaning they have values. In this example, the value configFilePath is the result of an if expression that handles the case of a configuration file not existing internally, then returns the absolute path to that file. This value can now be reused throughout an application, and the if expression won’t be reevaluated when the value is used.
Because if statements are expressions in Scala, there is no need for the special-case ternary conditional expressions that exist in C-derived languages. You won’t see x ? doThis() : doThat() in Scala. Scala provides a mechanism that’s just as powerful and more readable.
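As a brief sketch of using if where a C-style ternary would otherwise appear, consider the following. The names parity, label, and bonus are hypothetical and exist only for this illustration.

```scala
// The if expression is the whole method body; its value is the method's result.
def parity(n: Int): String =
  if (n % 2 == 0) "even" else "odd"

val label = parity(7)

// An if expression can also be embedded inside a larger expression.
val bonus = 100 + (if (label == "odd") 1 else 0)
```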
What if we omit the else clause in the previous example? Typing the code in the scala interpreter will tell us what happens:
scala> val configFile = new java.io.File("~/.myapprc")
configFile: java.io.File = ~/.myapprc

scala> val configFilePath = if (configFile.exists()) {
     |   configFile.getAbsolutePath()
     | }
configFilePath: Unit = ()

scala>
Note that configFilePath is now Unit. (It was String before.) The type inference picks a type that works for all outcomes of the if expression. Unit is the only possibility, since no value is one possible outcome.
Another familiar control structure that’s particularly feature-rich in Scala is the for loop, referred to in the Scala community as a for comprehension or for expression. This corner of the language deserves at least one fancy name, because it can do some great party tricks.
Actually, the term comprehension comes from functional programming. It expresses the idea that we are traversing a set of some kind, “comprehending” what we find, and computing something new from it.
Let’s start with a basic for expression:
// code-examples/Rounding/basic-for-script.scala
val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")

for (breed <- dogBreeds)
  println(breed)
As you might guess, this code says, “For every element in the list dogBreeds, create a temporary variable called breed with the value of that element, then print it.” Think of the <- operator as an arrow directing elements of a collection, one by one, to the scoped variable by which we’ll refer to them inside the for expression. The left-arrow operator is called a generator, so named because it’s generating individual values from a collection for use in an expression.
What if we want to get more granular? Scala’s for expressions allow for filters that let us specify which elements of a collection we want to work with. So to find all terriers in our list of dog breeds, we could modify the previous example to the following:
// code-examples/Rounding/filtered-for-script.scala
for (breed <- dogBreeds
  if breed.contains("Terrier")
) println(breed)
To add more than one filter to a for expression, separate the filters with semicolons:
// code-examples/Rounding/double-filtered-for-script.scala
for (breed <- dogBreeds
  if breed.contains("Terrier");
  if !breed.startsWith("Yorkshire")
) println(breed)
You’ve now found all the terriers that don’t hail from Yorkshire, and hopefully learned just how useful filters can be in the process.
What if, rather than printing your filtered collection, you needed to hand it off to another part of your program? The yield keyword is your ticket to generating new collections with for expressions. In the following example, note that we’re wrapping up the for expression in curly braces, as we would when defining any block:
// code-examples/Rounding/yielding-for-script.scala
val filteredBreeds = for {
  breed <- dogBreeds
  if breed.contains("Terrier")
  if !breed.startsWith("Yorkshire")
} yield breed
for expressions may be defined with parentheses or curly braces, but using curly braces means you don’t have to separate your filters with semicolons. Most of the time, you’ll prefer using curly braces when you have more than one filter, assignment, etc.
Every time through the for expression, the filtered result is yielded as a value named breed. These results accumulate with every run, and the resulting collection is assigned to the value filteredBreeds (as we did with if statements earlier). The type of the collection resulting from a for-yield expression is inferred from the type of the collection being iterated over. In this case, filteredBreeds is of type List[String], since it is a subset of the dogBreeds list, which is also of type List[String].
One final useful feature of Scala’s for comprehensions is the ability to define variables inside the first part of your for expressions that can be used in the latter part. This is best illustrated with an example:
// code-examples/Rounding/scoped-for-script.scala
for {
  breed <- dogBreeds
  upcasedBreed = breed.toUpperCase()
} println(upcasedBreed)
Note that without declaring upcasedBreed as a val, you can reuse it within the body of your for expression. This approach is ideal for transforming elements in a collection as you loop through them.
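The pieces discussed so far (a generator, a filter, a variable definition, and yield) can be combined in one expression. The sketch below repeats the dogBreeds list so that it is self-contained; upcasedTerriers and viaMethods are hypothetical names, and the second definition shows one plausible way to get the same result with ordinary collection methods.

```scala
val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
  "Scottish Terrier", "Great Dane", "Portuguese Water Dog")

val upcasedTerriers = for {
  breed <- dogBreeds                // generator
  if breed.contains("Terrier")      // filter
  upcasedBreed = breed.toUpperCase  // variable defined inside the for expression
} yield upcasedBreed

// The same computation written with collection methods.
val viaMethods = dogBreeds.filter(_.contains("Terrier")).map(_.toUpperCase)
```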
Finally, in Options and for Comprehensions, we’ll see how using Options with for comprehensions can greatly reduce code size by eliminating unnecessary “null” and “missing” checks.
Scala provides several other looping constructs.
Familiar in many languages, the while loop executes a block of code as long as a condition is true. For example, the following code prints out a complaint once a day until the next Friday the 13th has arrived:
// code-examples/Rounding/while-script.scala
// WARNING: This script runs for a LOOOONG time!
import java.util.Calendar

def isFridayThirteen(cal: Calendar): Boolean = {
  val dayOfWeek = cal.get(Calendar.DAY_OF_WEEK)
  val dayOfMonth = cal.get(Calendar.DAY_OF_MONTH)

  // Scala returns the result of the last expression in a method
  (dayOfWeek == Calendar.FRIDAY) && (dayOfMonth == 13)
}

while (!isFridayThirteen(Calendar.getInstance())) {
  println("Today isn't Friday the 13th. Lame.")
  // sleep for a day
  Thread.sleep(86400000)
}
Table 3-1 later in this chapter shows the conditional operators that work in while loops.
Like the while loop, a do-while loop executes some code while a conditional expression is true. The only difference is that a do-while checks to see if the condition is true after running the block. To count up to 10, we could write this:
// code-examples/Rounding/do-while-script.scala
var count = 0

do {
  count += 1
  println(count)
} while (count < 10)
As it turns out, there’s a more elegant way to loop through collections in Scala, as we’ll see in the next section.
Remember the arrow operator (<-) from the discussion about for loops? We can put it to work here, too. Let’s clean up the do-while example just shown:
// code-examples/Rounding/generator-script.scala
for (i <- 1 to 10) println(i)
Yup, that’s all that’s necessary. This clean one-liner is possible because of Scala’s RichInt class. An implicit conversion is invoked by the compiler to convert the 1, an Int, into a RichInt. (We’ll discuss these conversions in The Scala Type Hierarchy and in Implicit Conversions.) RichInt defines a to method that takes another integer and returns an instance of Range.Inclusive. That is, Inclusive is a nested class in the Range companion object (a concept we introduced briefly in Chapter 1; see Chapter 6 for details). This subclass of the class Range inherits a number of methods for working with sequences and iterable data structures, including those necessary to use it in a for loop.
By the way, if you wanted to count from 1 up to but not including 10, you could use until instead of to. For example: for (i <- 1 until 10).
This should paint a clearer picture of how Scala’s internal libraries compose to form easy-to-use language constructs.
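The following sketch shows the difference between to and until by converting each range to a list. The value names are hypothetical; by and sum are standard methods on ranges.

```scala
val inclusive = (1 to 10).toList     // 1 through 10
val exclusive = (1 until 10).toList  // 1 through 9; the upper bound is excluded
val evens = (2 to 10 by 2).toList    // a range with a step of 2
val total = (1 to 10).sum            // ranges support the usual sequence methods
```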
When working with loops in most languages, you can break out of a loop or continue the iterations. Scala doesn’t have either of these statements, but when writing idiomatic Scala code, they’re not necessary. Use conditional expressions to test if a loop should continue, or make use of recursion. Better yet, filter your collections ahead of time to eliminate complex conditions within your loops. However, because of demand for it, Scala version 2.8 includes support for break, implemented as a library method, rather than a built-in break keyword.
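A minimal sketch of the library-based break, assuming Scala 2.8 or later where scala.util.control.Breaks is available. The helper names firstNegativeIndex and firstNegativeIndexIdiomatic are hypothetical; the second one shows how a collection method can make the break unnecessary.

```scala
import scala.util.control.Breaks.{break, breakable}

// Hypothetical helper: index of the first negative number, or -1 if there is none.
def firstNegativeIndex(xs: List[Int]): Int = {
  var found = -1
  breakable {
    for ((x, i) <- xs.zipWithIndex) {
      if (x < 0) {
        found = i
        break()  // jumps to the end of the enclosing breakable block
      }
    }
  }
  found
}

// The same result without break, using a collection method.
def firstNegativeIndexIdiomatic(xs: List[Int]): Int = xs.indexWhere(_ < 0)
```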
Scala borrows most of the conditional operators from Java and its predecessors. You’ll find the ones listed in Table 3-1 in if statements, while loops, and everywhere else conditions apply.
Operator | Operation | Description |
&& | and | The values on the left and right of the operator are true. The righthand side is only evaluated if the lefthand side is true. |
|| | or | At least one of the values on the left or right is true. The righthand side is only evaluated if the lefthand side is false. |
> | greater than | The value on the left is greater than the value on the right. |
>= | greater than or equals | The value on the left is greater than or equal to the value on the right. |
< | less than | The value on the left is less than the value on the right. |
<= | less than or equals | The value on the left is less than or equal to the value on the right. |
== | equals | The value on the left is the same as the value on the right. |
!= | not equal | The value on the left is not the same as the value on the right. |
Note that && and || are “short-circuiting” operators. They stop evaluating expressions as soon as the answer is known.
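Short-circuiting can be observed with a check that records each time it runs. In this sketch, calls and check are hypothetical names introduced only to count evaluations of the righthand side.

```scala
var calls = 0

// Hypothetical check that records each time it is evaluated.
def check(result: Boolean): Boolean = { calls += 1; result }

val andResult = false && check(true)  // lefthand side is false: righthand side skipped
val callsAfterAnd = calls
val orResult = true || check(false)   // lefthand side is true: righthand side skipped
val callsAfterOr = calls
val bothResult = true && check(true)  // lefthand side is true: righthand side evaluated
val callsAfterBoth = calls
```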
We’ll discuss object equality in more detail in Equality of Objects. For example, we’ll see that == has a different meaning in Scala versus Java. Otherwise, these operators should all be familiar, so let’s move on to something new and exciting.
An idea borrowed from functional languages, pattern matching is a powerful yet concise way to make a programmatic choice between multiple conditions. Pattern matching is the familiar case statement from your favorite C-like language, but on steroids. In the typical case statement you’re limited to matching against values of ordinal types, yielding trivial expressions like this: “In the case that i is 5, print a message; in the case that i is 6, exit the program.” With Scala’s pattern matching, your cases can include types, wildcards, sequences, regular expressions, and even deep inspections of an object’s variables.
To begin with, let’s simulate flipping a coin by matching the value of a boolean:
// code-examples/Rounding/match-boolean-script.scala
val bools = List(true, false)

for (bool <- bools) {
  bool match {
    case true => println("heads")
    case false => println("tails")
    case _ => println("something other than heads or tails (yikes!)")
  }
}
It looks just like a C-style case statement, right? The only difference is the last case with the underscore (_) wildcard. It matches anything not defined in the cases above it, so it serves the same purpose as the default keyword in Java and C# switch statements.
Pattern matching is eager; the first match wins. So, if you try to put a case _ clause before any other case clauses, the compiler will throw an “unreachable code” error on the next clause, because nothing will get past the default clause!
In the following example, we assign the wildcard case to a variable called otherNumber, then print it in the subsequent expression. If we generate a 7, we’ll extol that number’s virtues. Otherwise, we’ll curse fate for making us suffer an unlucky number:
// code-examples/Rounding/match-variable-script.scala
import scala.util.Random

val randomInt = new Random().nextInt(10)

randomInt match {
  case 7 => println("lucky seven!")
  case otherNumber => println("boo, got boring ol' " + otherNumber)
}
These simple examples don’t even begin to scratch the surface of Scala’s pattern matching features. Let’s try matching based on type:
// code-examples/Rounding/match-type-script.scala
val sundries = List(23, "Hello", 8.5, 'q')

for (sundry <- sundries) {
  sundry match {
    case i: Int => println("got an Integer: " + i)
    case s: String => println("got a String: " + s)
    case f: Double => println("got a Double: " + f)
    case other => println("got something else: " + other)
  }
}
Here we pull each element out of a List of Any type of element, in this case containing a String, a Double, an Int, and a Char. For the first three of those types, we let the user know specifically which type we got and what the value was. When we get something else (the Char), we just let the user know the value. We could add further elements to the list of other types and they’d be caught by the other wildcard case.
Since working in Scala often means working with sequences, wouldn’t it be handy to be able to match against the length and contents of lists and arrays? The following example does just that, testing two lists to see if they contain four elements, the second of which is the integer 3:
// code-examples/Rounding/match-seq-script.scala
val willWork = List(1, 3, 23, 90)
val willNotWork = List(4, 18, 52)
val empty = List()

for (l <- List(willWork, willNotWork, empty)) {
  l match {
    case List(_, 3, _, _) => println("Four elements, with the 2nd being '3'.")
    case List(_*) => println("Any other list with 0 or more elements.")
  }
}
In the second case we’ve used a special wildcard pattern to match a List of any size, even zero elements, and any element values. You can use this pattern at the end of any sequence match to remove length as a condition.
Recall that we mentioned the “cons” method for List, ::. The expression a :: list prepends a to a list. You can also use this operator to extract the head and tail of a list:
// code-examples/Rounding/match-list-script.scala
val willWork = List(1, 3, 23, 90)
val willNotWork = List(4, 18, 52)
val empty = List()

def processList(l: List[Any]): Unit = l match {
  case head :: tail =>
    // print the head followed by a space, then recurse on the rest
    printf("%s ", head)
    processList(tail)
  case Nil => println("")
}

for (l <- List(willWork, willNotWork, empty)) {
  print("List: ")
  processList(l)
}
The processList method matches on the List argument l. It may look strange to start the method definition like the following:
def processList(l: List[Any]): Unit = l match {
  ...
}
Hopefully hiding the details with the ellipsis makes the meaning a little clearer. The processList method is actually one statement that crosses several lines.
It first matches on head :: tail, where head will be assigned the first element in the list and tail will be assigned the rest of the list. That is, we’re extracting the head and tail from the list using ::. When this case matches, it prints the head and calls processList recursively to process the tail.
The second case matches the empty list, Nil. It prints an end of line and terminates the recursion.
Alternately, if we just wanted to test that we have a tuple of two items, we could do a tuple match:
// code-examples/Rounding/match-tuple-script.scala
val tupA = ("Good", "Morning!")
val tupB = ("Guten", "Tag!")

for (tup <- List(tupA, tupB)) {
  tup match {
    case (thingOne, thingTwo) if thingOne == "Good" =>
      println("A two-tuple starting with 'Good'.")
    case (thingOne, thingTwo) =>
      println("This has two things: " + thingOne + " and " + thingTwo)
  }
}
In the second case in this example, we’ve extracted the values inside the tuple to scoped variables, then reused these variables in the resulting expression.
In the first case we’ve added a new concept: guards. The if condition after the tuple is a guard. The guard is evaluated when matching, but only after extracting any variables in the preceding part of the case. Guards provide additional granularity when constructing cases. In this example, the only difference between the two patterns is the guard expression, but that’s enough for the compiler to differentiate them.
Recall that the cases in a pattern match are evaluated in order. For example, if your first case is broader than your second case, the second case will never be reached. (Unreachable cases will cause a compiler error.) You may include a “default” case at the end of a pattern match, either using the underscore wildcard character or a meaningfully named variable. When using a variable, it should have no explicit type or it should be declared as Any, so it can match anything. On the other hand, try to design your code to avoid a catch-all clause by ensuring it only receives specific items that are expected.
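The sketch below combines several of the patterns seen so far (the empty list, :: extraction, a guard, and a sequence wildcard) in a single hypothetical describe method. Because cases are tried in order, the last case is only reached by one-element lists with a non-negative head; longer lists are caught by the case before it.

```scala
// Hypothetical helper, for illustration only.
def describe(xs: List[Int]): String = xs match {
  case Nil => "empty"
  case head :: _ if head < 0 => "starts with a negative number"  // guard on the extracted head
  case List(_, second, _*) => "second element is " + second     // two or more elements
  case only :: _ => "exactly one element: " + only               // everything longer matched above
}
```

Reordering the last two cases would change the results, since only :: _ on its own matches any non-empty list.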
Let’s try a deep match, examining the contents of objects in our pattern match:
// code-examples/Rounding/match-deep-script.scala
case class Person(name: String, age: Int)

val alice = new Person("Alice", 25)
val bob = new Person("Bob", 32)
val charlie = new Person("Charlie", 32)

for (person <- List(alice, bob, charlie)) {
  person match {
    case Person("Alice", 25) => println("Hi Alice!")
    case Person("Bob", 32) => println("Hi Bob!")
    case Person(name, age) =>
      println("Who are you, " + age + " year-old person named " + name + "?")
  }
}
Poor Charlie gets the cold shoulder, as we can see in the output:
Hi Alice!
Hi Bob!
Who are you, 32 year-old person named Charlie?
We first define a case class, a special type of class that we’ll learn more about in Case Classes. For now, it will suffice to say that a case class allows for very terse construction of simple objects with some predefined methods. Our pattern match then looks for Alice and Bob by inspecting the values passed to the constructor of the Person case class. Charlie falls through to the catch-all case; even though he has the same age value as Bob, we’re matching on the name property as well.
This type of pattern match becomes extremely useful when working with Actors, as we’ll see later on. Case classes are frequently sent to Actors as messages, and deep pattern matching on an object’s contents is a convenient way to “parse” those messages.
Regular expressions are convenient for extracting data from strings that have an informal structure, but are not “structured data” (that is, in a format like XML or JSON, for example). Commonly referred to as regexes, regular expressions are a feature of nearly all modern programming languages. They provide a terse syntax for specifying complex matches, one that is typically translated into a state machine behind the scenes for optimum performance.
Regexes in Scala should contain no surprises if you’ve used them in other programming languages. Let’s see an example:
// code-examples/Rounding/match-regex-script.scala
val BookExtractorRE = """Book: title=([^,]+),\s+authors=(.+)""".r
val MagazineExtractorRE = """Magazine: title=([^,]+),\s+issue=(.+)""".r

val catalog = List(
  "Book: title=Programming Scala, authors=Dean Wampler, Alex Payne",
  "Magazine: title=The New Yorker, issue=January 2009",
  "Book: title=War and Peace, authors=Leo Tolstoy",
  "Magazine: title=The Atlantic, issue=February 2009",
  "BadData: text=Who put this here??"
)

for (item <- catalog) {
  item match {
    case BookExtractorRE(title, authors) =>
      println("Book \"" + title + "\", written by " + authors)
    case MagazineExtractorRE(title, issue) =>
      println("Magazine \"" + title + "\", issue " + issue)
    case entry => println("Unrecognized entry: " + entry)
  }
}
We start with two regular expressions, one for records of books and another for records of magazines. Calling .r on a string turns it into a regular expression; we use raw (triple-quoted) strings here to avoid having to double-escape backslashes. Should you find the .r transformation method on strings unclear, you can also define regexes by creating new instances of the Regex class, as in: new Regex("""\W""").
Notice that each of our regexes defines two capture groups, denoted by parentheses. Each group captures the value of a single field in the record, such as a book’s title or author. Regexes in Scala translate those capture groups to extractors. Every match sets a field to the captured result; every miss is set to null.
What does this mean in practice? If the text fed to the regular expression matches, case BookExtractorRE(title, authors) will assign the first capture group to title and the second to authors. We can then use those values on the righthand side of the case clause, as we have in the previous example. The variable names title and authors within the extractor are arbitrary; matches from capture groups are simply assigned from left to right, and you can call them whatever you’d like.
That’s regexes in Scala in a nutshell. The scala.util.matching.Regex class supplies several handy methods for finding and replacing matches in strings, both all occurrences of a match and just the first occurrence, so be sure to make use of them.
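The finding and replacing methods mentioned above can be sketched as follows. The sample text and the names RegexMethodsDemo and Year are assumptions made for this illustration.

```scala
// Illustrative names and data only.
object RegexMethodsDemo {
  val Year = """\d{4}""".r
  val text = "Issues: January 2009, February 2009, March 2010"

  // First occurrence, wrapped in an Option because there may be none.
  val first: Option[String] = Year.findFirstIn(text)

  // Every occurrence, in order of appearance.
  val all: List[String] = Year.findAllIn(text).toList

  // Replace only the first match, or every match.
  val firstReplaced: String = Year.replaceFirstIn(text, "YYYY")
  val allReplaced: String = Year.replaceAllIn(text, "YYYY")

  def main(args: Array[String]): Unit = {
    println(first)
    println(all)
    println(firstReplaced)
    println(allReplaced)
  }
}
```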
What we won’t cover in
this section is the details of writing regular expressions. Scala’s
Regex
class uses the underlying platform’s regular
expression APIs (that is, Java’s or .NET’s). Consult references on those
APIs for the hairy details, as they may be subtly different from the
regex support in your language of choice.
Sometimes you want to bind a variable to an object enclosed in a match, where you are also specifying match criteria on the nested object. Suppose we modify a previous example so we’re matching on the key-value pairs from a map. We’ll store our same Person objects as the values and use an employee ID as the key. We’ll also add another attribute to Person, a role field that points to an instance from a type hierarchy:
// code-examples/Rounding/match-deep-pair-script.scala

class Role
case object Manager extends Role
case object Developer extends Role

case class Person(name: String, age: Int, role: Role)

val alice = new Person("Alice", 25, Developer)
val bob = new Person("Bob", 32, Manager)
val charlie = new Person("Charlie", 32, Developer)

for (item <- Map(1 -> alice, 2 -> bob, 3 -> charlie)) {
  item match {
    // printf prints the formatted text directly
    case (id, p @ Person(_, _, Manager)) => printf("%s is overpaid.\n", p)
    case (id, p @ Person(_, _, _)) => printf("%s is underpaid.\n", p)
  }
}
The case objects are just singleton objects like we’ve seen before, but with the special case behavior. We’re most interested in the embedded p @ Person(...) inside the case clause. We’re matching on particular kinds of Person objects inside the enclosing tuple. We also want to assign the Person to a variable p, so we can use it for printing:
Person(Alice,25,Developer) is underpaid.
Person(Bob,32,Manager) is overpaid.
Person(Charlie,32,Developer) is underpaid.
If we weren’t using matching criteria in Person itself, we could just write p: Person. For example, the previous match clause could be written this way:
item match {
  case (id, p: Person) => p.role match {
    case Manager => printf("%s is overpaid.\n", p)
    case _ => printf("%s is underpaid.\n", p)
  }
}
Note that the p @
Person(...)
syntax gives us a way to flatten this nesting of
match statements into one statement. It is analogous to using “capture
groups” in a regular expression to pull out substrings we want, instead
of splitting the string in several successive steps to extract the
substrings we want. Use whichever technique you prefer.
Through its use of functional constructs and strong typing, Scala encourages a coding style that lessens the need for exceptions and exception handling. But where Scala interacts with Java, exceptions are still prevalent.
Unlike Java, Scala does not have checked exceptions. Even Java’s checked exceptions are treated as unchecked by Scala. There is also no throws clause on method declarations. However, there is a @throws annotation that is useful for Java interoperability. See the section Annotations.
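As a hedged sketch of the @throws annotation, the method below declares IOException for the benefit of Java callers. The object name ThrowsDemo and the method firstChar are hypothetical and used only for illustration.

```scala
import java.io.{File, FileReader, IOException}

// Illustrative names only.
object ThrowsDemo {
  // The annotation records IOException in the compiled method signature,
  // so Java code calling this method sees a checked exception.
  // Scala callers are not forced to catch or declare anything.
  @throws(classOf[IOException])
  def firstChar(path: String): Int = {
    val reader = new FileReader(new File(path))
    try reader.read() finally reader.close()
  }
}
```

Scala code can call ThrowsDemo.firstChar without any try block; the annotation only affects how the method appears to Java.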
Thankfully, Scala treats exception handling as just another pattern match, allowing us to make smart choices when presented with a multiplicity of potential exceptions. Let’s see this in action:
// code-examples/Rounding/try-catch-script.scala

import java.util.Calendar

// Named "earlier" because "then" is a reserved word in newer Scala versions.
val earlier = null
val now = Calendar.getInstance()

try {
  now.compareTo(earlier)
} catch {
  case e: NullPointerException => println("One was null!"); System.exit(-1)
  case unknown => println("Unknown exception " + unknown); System.exit(-1)
} finally {
  println("It all worked out.")
  System.exit(0)
}
In this example, we explicitly catch the NullPointerException thrown when trying to compare a Calendar instance with null. We also define unknown as a catch-all case, just to be safe. Because each catch clause calls System.exit, the program ends before the finally block can run. If we weren’t hardcoding this program to fail, the finally block would be reached and the user would be informed that everything worked out just fine.
You can use an underscore (Scala’s standard wildcard character) as a placeholder to catch any type of exception (really, to match any case in a pattern matching expression). However, you won’t be able to refer to the exception in the subsequent expression. Name the exception variable if you need it; for example, if you need to print the exception as we do in the catch-all case of the previous example.
Pattern matching aside, Scala’s treatment of exception handling should be familiar to those fluent in Java, Ruby, Python, and most other mainstream languages. And yes, you throw an exception by writing throw new MyBadException(...). That’s all there is to it.
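The pieces discussed above, throwing, matching on exception types, and the wildcard, can be sketched together. MyBadException is defined here as a hypothetical exception class, and parsePositive and safeParse are illustrative helpers, none of which come from the book’s code.

```scala
// Illustrative names only.
object ExceptionMatchDemo {
  // Hypothetical exception type for this sketch.
  class MyBadException(message: String) extends RuntimeException(message)

  def parsePositive(s: String): Int = {
    val n = s.toInt // may throw NumberFormatException
    if (n <= 0) throw new MyBadException("not positive: " + n)
    n
  }

  def safeParse(s: String): String =
    try {
      "ok: " + parsePositive(s)
    } catch {
      // Named, because we use the exception on the righthand side.
      case e: MyBadException => "bad value: " + e.getMessage
      // Wildcard, because the exception itself is not needed.
      case _: NumberFormatException => "not a number"
    }

  def main(args: Array[String]): Unit = {
    println(safeParse("42"))
    println(safeParse("-1"))
    println(safeParse("abc"))
  }
}
```

Because try is an expression, safeParse returns the value of whichever branch runs, with no temporary variable.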
Pattern matching is a powerful and elegant way of extracting information from objects, when used appropriately. Recall from Chapter 1 that we highlighted the synergy between pattern matching and polymorphism. Most of the time, you want to avoid the problems of “switch” statements that know a class hierarchy, because they have to be modified every time the hierarchy is changed.
In our drawing Actor
example, we used pattern matching to separate different “categories” of
messages, but we used polymorphism to draw the shapes sent to it. We
could change the Shape
hierarchy and the Actor code
would not require changes.
Pattern matching is also useful for the design problem where you need to get at data inside an object, but only in special circumstances. One of the unintended consequences of the JavaBeans (see [JavaBeansSpec]) specification was that it encouraged people to expose fields in their objects through getters and setters. This should never be a default decision. Access to “state information” should be encapsulated and exposed only in ways that make logical sense for the type, as viewed from the abstraction it exposes.
Instead, consider using
pattern matching for those “rare” times when you need to extract
information in a controlled way. As we will see in Unapply, the pattern matching examples we have shown
use unapply
methods defined to extract information
from instances. These methods let you extract that information while
hiding the implementation details. In fact, the information returned by
unapply
might be a transformation of the actual
information in the type.
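As a minimal sketch of an unapply method that returns a transformation of the stored data, consider a hypothetical Account type that keeps its balance privately in cents but exposes whole dollars to pattern matching. All names here are assumptions made for the illustration.

```scala
// Illustrative names only.
object UnapplyDemo {
  // The balance is stored in cents and is not visible to clients.
  class Account(val owner: String, private val cents: Long)

  object Account {
    // The extractor exposes the owner and a whole-dollar amount,
    // which is a transformation of the private cents field.
    def unapply(a: Account): Option[(String, Long)] =
      Some((a.owner, a.cents / 100))
  }

  def summary(a: Account): String = a match {
    case Account(owner, dollars) if dollars >= 1000 => owner + " holds at least $1000"
    case Account(owner, dollars)                    => owner + " holds $" + dollars
  }

  def main(args: Array[String]): Unit = {
    println(summary(new Account("Alice", 250000)))
    println(summary(new Account("Bob", 1999)))
  }
}
```

Clients that pattern match on Account never learn that the balance is held in cents, so the representation can change without breaking them.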
Finally, when designing pattern matching statements, be wary of relying on a default case clause. Under what circumstances would “none of the above” be the correct answer? It may indicate that the design should be refined so you know more precisely all the possible matches that might occur. We’ll learn one technique that helps when we discuss sealed class hierarchies in Sealed Class Hierarchies.
Remember our examples
involving various breeds of dog? In thinking about the types in these
programs, we might want a top-level Breed
type that
keeps track of a number of breeds. Such a type is called an
enumerated type, and the values it contains are
called enumerations.
While enumerations are a built-in part of many programming languages, Scala takes a different route and implements them as a class in its standard library. This means that, unlike Java and C#, Scala has no special syntax for enumerations. Instead, you just define an object that extends the Enumeration class. Hence, at the bytecode level, there is no connection between Scala enumerations and the enum constructs in Java and C#.
// code-examples/Rounding/enumeration-script.scala

object Breed extends Enumeration {
  val doberman = Value("Doberman Pinscher")
  val yorkie = Value("Yorkshire Terrier")
  val scottie = Value("Scottish Terrier")
  val dane = Value("Great Dane")
  val portie = Value("Portuguese Water Dog")
}

// print a list of breeds and their IDs
println("ID Breed")
for (breed <- Breed) println(breed.id + " " + breed)

// print a list of Terrier breeds
println("\nJust Terriers:")
Breed.filter(_.toString.endsWith("Terrier")).foreach(println)
When run, you’ll get the following output:
ID Breed
0 Doberman Pinscher
1 Yorkshire Terrier
2 Scottish Terrier
3 Great Dane
4 Portuguese Water Dog

Just Terriers:
Yorkshire Terrier
Scottish Terrier
We can see that our Breed enumerated type contains several variables of type Value, as in the following example:
val doberman = Value("Doberman Pinscher")
Each declaration is actually calling a method named Value that takes a string argument. We use this method to assign a long-form breed name to each enumeration value, which is what the Value.toString method returned in the output.
Note that there is no namespace collision between the type and method that both have the name Value. There are other overloaded versions of the Value method. One of them takes no arguments, another takes an Int ID value, and another takes both an Int and String. These Value methods return a Value object, and they add the value to the enumeration’s collection of values.
In fact, Scala’s
Enumeration
class supports the usual methods for
working with collections, so we can easily iterate through the breeds with
a for
loop and filter
them by name.
The output above also demonstrated that every Value
in
an enumeration is automatically assigned a numeric identifier, unless you
call one of the Value
methods where you specify your
own ID value explicitly.
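The overloaded Value methods and the automatic numbering can be sketched as follows. The wrapper object EnumerationIdsDemo and the chosen ID 10 are assumptions for illustration. The sketch also assumes a more recent Scala release, where the collection methods are reached through the enumeration’s values member rather than on the enumeration object itself.

```scala
// Illustrative names only; assumes a Scala release that provides Enumeration.values.
object EnumerationIdsDemo {
  object Breed extends Enumeration {
    val doberman = Value("Doberman Pinscher")     // no ID given, so it gets 0
    val yorkie   = Value(10, "Yorkshire Terrier") // explicit ID and name
    val scottie  = Value("Scottish Terrier")      // numbering continues after 10
  }

  // values holds every Value created so far, ordered by ID.
  val ids: List[Int] = Breed.values.toList.map(_.id)

  val terriers: List[String] =
    Breed.values.toList.map(_.toString).filter(_.endsWith("Terrier"))

  def main(args: Array[String]): Unit = {
    println(ids)
    println(terriers)
  }
}
```

Mixing explicit and automatic IDs works, but it is easy to lose track of the numbering, so explicit IDs are best reserved for cases where the numbers matter outside the program.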
You’ll often want to give your enumeration values human-readable names, as we did here. However, sometimes you may not need them. Here’s another enumeration example adapted from the Scaladoc entry for Enumeration:
// code-examples/Rounding/days-enumeration-script.scala

object WeekDay extends Enumeration {
  type WeekDay = Value
  val Mon, Tue, Wed, Thu, Fri, Sat, Sun = Value
}
import WeekDay._

def isWorkingDay(d: WeekDay) = ! (d == Sat || d == Sun)

WeekDay filter isWorkingDay foreach println
Running this script with
scala
yields the following output:
Main$$anon$1$WeekDay(0)
Main$$anon$1$WeekDay(1)
Main$$anon$1$WeekDay(2)
Main$$anon$1$WeekDay(3)
Main$$anon$1$WeekDay(4)
When a name isn’t assigned
using one of the Value
methods that takes a
String
argument, Value.toString
prints the name of the type that is synthesized by the compiler, along
with the ID value that was generated automatically.
Note that we imported WeekDay._. This brought each enumeration value (Mon, Tue, etc.) into scope. Otherwise, you would have to write WeekDay.Mon, WeekDay.Tue, etc.
Also, the import brought the type alias, type WeekDay = Value, into scope, which we used as the type for the argument for the isWorkingDay method. If you don’t define a type alias like this, then you would declare the method as def isWorkingDay(d: WeekDay.Value).
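For comparison, here is a sketch of the same enumeration written without the type alias and without the import. The wrapper object WeekDayNoAliasDemo is an assumption for illustration, and the sketch assumes a more recent Scala release that provides the values member.

```scala
// Illustrative names only; assumes a Scala release that provides Enumeration.values.
object WeekDayNoAliasDemo {
  object WeekDay extends Enumeration {
    val Mon, Tue, Wed, Thu, Fri, Sat, Sun = Value
  }

  // Without the alias, the parameter type is spelled WeekDay.Value,
  // and without the import each value is written with its qualifier.
  def isWorkingDay(d: WeekDay.Value): Boolean =
    !(d == WeekDay.Sat || d == WeekDay.Sun)

  // IDs of the working days, in declaration order.
  val workingDayIds: List[Int] =
    WeekDay.values.toList.filter(isWorkingDay).map(_.id)

  def main(args: Array[String]): Unit =
    println(workingDayIds)
}
```

The alias and the import are conveniences; which form reads better depends on how often the enumeration appears in the surrounding code.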
Since Scala enumerations
are just regular objects, you could use any object with
vals
to indicate different “enumeration values.”
However, extending Enumeration
has several advantages.
It automatically manages the values as a collection that you can iterate
over, etc., as in our examples. It also automatically assigns unique
integer IDs to each value.
Case classes (see Case Classes) are often used instead of enumerations in Scala because the “use case” for them often involves pattern matching. We’ll revisit this topic in Enumerations Versus Pattern Matching.
We’ve covered a lot of
ground in this chapter. We learned how flexible Scala’s syntax can be, and
how it facilitates the creation of Domain-Specific Languages. Then we
explored Scala’s enhancements to looping constructs and conditional
expressions. We experimented with different uses for pattern matching, a
powerful improvement on the familiar case-switch
statement. Finally, we learned how to encapsulate values in enumerations.
You should now be prepared to read a fair bit of Scala code, but there’s plenty more about the language to put in your tool belt. In the next four chapters, we’ll explore Scala’s approach to object-oriented programming, starting with traits.