We ended the previous chapter with a few “teaser” examples of Scala code. This chapter discusses uses of Scala that promote succinct, flexible code. We’ll discuss organization of files and packages, importing other types, variable declarations, miscellaneous syntax conventions, and a few other concepts. We’ll emphasize how the concise syntax of Scala helps you work better and faster.
Scala’s syntax is especially useful when writing scripts. Separate compile and run steps aren’t required for simple programs that have few dependencies on libraries outside of what Scala provides. You compile and run such programs in one shot with the scala command. If you’ve downloaded the example code for this book, many of the smaller examples can be run using the scala command, e.g., scala filename.scala. See the README.txt files in each chapter’s code examples for more details. See also Command-Line Tools for more information about using the scala command.
You may have already noticed that there were very few semicolons in the code examples in the previous chapter. You can use semicolons to separate statements and expressions, as in Java, C, PHP, and similar languages. In most cases, though, Scala behaves like many scripting languages in treating the end of the line as the end of a statement or an expression. When a statement or expression is too long for one line, Scala can usually infer when you are continuing on to the next line, as shown in this example:
// code-examples/TypeLessDoMore/semicolon-example-script.scala

// Trailing equals sign indicates more code on next line
def equalsign = {
  val reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine =
    "wow that was a long value name"

  println(reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine)
}

// Trailing opening curly brace indicates more code on next line
def equalsign2(s: String) = {
  println("equalsign2: " + s)
}

// Trailing comma, operator, etc. indicates more code on next line
def commas(s1: String,
           s2: String) = {
  println("comma: " + s1 +
          ", " + s2)
}
When you want to put multiple statements or expressions on the same line, you can use semicolons to separate them. We used this technique in the ShapeDrawingActor example in A Taste of Concurrency:
case "exit" => println("exiting..."); exit
This code could also be written as follows:
...
case "exit" =>
  println("exiting...")
  exit
...
You might wonder why you don’t need curly braces ({...}) around the two statements after the case ... => line. You can put them in if you want, but the compiler knows when you’ve reached the end of the “block” when it finds the next case clause or the curly brace (}) that ends the enclosing block for all the case clauses.
Omitting optional semicolons means fewer characters to type and fewer characters to clutter your code. Breaking separate statements onto their own lines increases your code’s readability.
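As a minimal sketch of the two rules above (the object and value names are illustrative, not taken from the book's example code), a trailing operator tells the compiler that an expression continues on the next line, while semicolons let several statements share one line:

```scala
// Hypothetical names, for illustration only.
object SemicolonSketch {
  // The trailing '+' signals that the expression continues on the next line.
  val total = 1 +
    2

  // Semicolons separate the three statements that share this line.
  val sum = { val a = 10; val b = 20; a + b }
}
```

Both values are ordinary Ints; the layout changes only how the source reads, not what it computes.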
Scala allows you to decide whether a variable is immutable (read-only) or not (read-write) when you declare it. An immutable “variable” is declared with the keyword val (think value object):
val array: Array[String] = new Array(5)
To be more precise, the array reference cannot be changed to point to a different Array, but the array itself can be modified, as shown in the following scala session:
scala> val array: Array[String] = new Array(5)
array: Array[String] = Array(null, null, null, null, null)

scala> array = new Array(2)
<console>:5: error: reassignment to val
       array = new Array(2)
             ^

scala> array(0) = "Hello"

scala> array
res3: Array[String] = Array(Hello, null, null, null, null)

scala>
An immutable val must be initialized (that is, defined) when it is declared.
A mutable variable is declared with the keyword var:
scala> var stockPrice: Double = 100.
stockPrice: Double = 100.0

scala> stockPrice = 10.
stockPrice: Double = 10.0

scala>
Scala also requires you to initialize a var when it is declared. You can assign a new value to a var as often as you want. Again, to be precise, the stockPrice reference can be changed to point to a different Double object (e.g., 10.). In this case, the object that stockPrice refers to can’t be changed, because Doubles in Scala are immutable.
There are a few exceptions to the rule that you must initialize vals and vars when they are declared. Both keywords can be used with constructor parameters. When used as constructor parameters, the mutable or immutable variables specified will be initialized when an object is instantiated. Both keywords can be used to declare “abstract” (uninitialized) variables in abstract types. Also, derived types can override vals declared inside parent types. We’ll discuss these exceptions in Chapter 5.
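The exceptions listed above can be sketched briefly. The Account and SavingsAccount types below are hypothetical and exist only for illustration: the abstract type declares an uninitialized val and var, and the subclass supplies both through constructor parameters, which are initialized when an instance is created.

```scala
// Hypothetical types, for illustration only.
abstract class Account {
  val id: String        // abstract: declared without an initializer
  var balance: Double   // abstract var, also uninitialized here
}

// The constructor parameters provide the values at instantiation time.
class SavingsAccount(val id: String, var balance: Double) extends Account

object AccountSketch {
  val account = new SavingsAccount("A-1", 100.0)
  account.balance = 150.0     // allowed: balance is a var
  // account.id = "A-2"       // would not compile: reassignment to val
}
```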
Scala encourages you to use immutable values whenever possible. As we will see, this promotes better object-oriented design and is consistent with the principles of “pure” functional programming. It may take some getting used to, but you’ll find a newfound confidence in your code when it is written in an immutable style.
In Chapter 1 we saw several examples of how to define methods, which are functions that are members of a class. Method definitions start with the def keyword, followed by optional argument lists, a colon character (:) and the return type of the method, an equals sign (=), and finally the method body. Methods are implicitly declared “abstract” if you leave off the equals sign and method body. The enclosing type is then itself abstract. We’ll discuss abstract types in more detail in Chapter 5.
We said “optional argument lists,” meaning more than one. Scala lets you define more than one argument list for a method. This is required for currying methods, which we’ll discuss in Currying. It is also very useful for defining your own Domain-Specific Languages (DSLs), as we’ll see in Chapter 11. Note that each argument list is surrounded by parentheses and the arguments are separated by commas.
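A minimal sketch of a method with two argument lists follows. The scale method and its parameter names are hypothetical, chosen only to show the syntax: each list has its own parentheses, and arguments within a list are separated by commas.

```scala
// Hypothetical example of a method with more than one argument list.
object ArgumentListSketch {
  // First list: the factor. Second list: the values to multiply.
  def scale(factor: Int)(values: List[Int]): List[Int] =
    values.map(v => v * factor)

  // Each argument list is supplied in its own pair of parentheses.
  val doubled = scale(2)(List(1, 2, 3))
}
```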
If a method body has more than one expression, you must surround it with curly braces ({...}). You can omit the braces if the method body has just one expression.
Many languages let you define default values for some or all of the arguments to a method. Consider the following script with a StringUtil object that lets you join a list of strings with a user-specified separator:
// code-examples/TypeLessDoMore/string-util-v1-script.scala
// Version 1 of "StringUtil".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]): String = joiner(strings, " ")
}

import StringUtil._  // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )
There are actually two “overloaded” joiner methods. The second one uses a single space as the “default” separator. Having two methods seems a bit wasteful. It would be nice if we could eliminate the second joiner method and declare that the separator argument in the first joiner has a default value. In fact, in Scala version 2.8, you can now do this:
// code-examples/TypeLessDoMore/string-util-v2-v28-script.scala
// Version 2 of "StringUtil" for Scala v2.8 only.

object StringUtil {
  def joiner(strings: List[String], separator: String = " "): String =
    strings.mkString(separator)
}

import StringUtil._  // Import the joiner methods.

println(joiner(List("Programming", "Scala")))
There is another alternative for earlier versions of Scala. You can use implicit arguments, which we will discuss in Implicit Function Parameters.
Scala version 2.8 offers another enhancement for method argument lists, named arguments. We could actually write the last line of the previous example in several ways. All of the following println statements are functionally equivalent:
println(joiner(List("Programming", "Scala")))
println(joiner(strings = List("Programming", "Scala")))
println(joiner(List("Programming", "Scala"), " "))   // #1
println(joiner(List("Programming", "Scala"), separator = " "))   // #2
println(joiner(strings = List("Programming", "Scala"), separator = " "))
Why is this useful? First, if you choose good names for the method arguments, then your calls to those methods document each argument with a name. For example, compare the two lines with comments #1 and #2. In the first line, it may not be obvious what the second " " argument is for. In the second case, we supply the name separator, which suggests the purpose of the argument.
The second benefit is that you can specify the parameters in any order when you specify them by name. Combined with default values, you can write code like the following:
// code-examples/TypeLessDoMore/user-profile-v28-script.scala
// Scala v2.8 only.

object OptionalUserProfileInfo {
  val UnknownLocation = ""
  val UnknownAge = -1
  val UnknownWebSite = ""
}

class OptionalUserProfileInfo(
  location: String = OptionalUserProfileInfo.UnknownLocation,
  age: Int         = OptionalUserProfileInfo.UnknownAge,
  webSite: String  = OptionalUserProfileInfo.UnknownWebSite)

println( new OptionalUserProfileInfo )
println( new OptionalUserProfileInfo(age = 29) )
println( new OptionalUserProfileInfo(age = 29, location = "Earth") )
OptionalUserProfileInfo represents all the “optional” user profile data in your next Web 2.0 social networking site. It defines default values for all its fields. The script creates instances with zero or more named parameters. The order of those parameters is arbitrary.
The examples we have shown use constant values as the defaults. Most languages with default argument values only allow constants or other values that can be determined at parse time. However, in Scala, any expression can be used as the default, as long as it can compile where used. For example, an expression could not refer to an instance field that will be computed inside the class or object body, but it could invoke a method on a singleton object.
A related limitation is that a default expression for one parameter can’t refer to another parameter in the list, unless the parameter that is referenced appears earlier in the list and the parameters are curried, a concept we’ll discuss in Currying.
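The curried case described above can be sketched as follows. The repeat method is hypothetical and assumes Scala 2.8 or later; its second argument list declares a default that refers to a parameter from the first list.

```scala
// Hypothetical example; requires default arguments (Scala 2.8 or later).
object DefaultSketch {
  // The default for 'times' refers to 's', declared in the earlier argument list.
  def repeat(s: String)(times: Int = s.length): String = s * times

  val usingDefault = repeat("ab")()    // times defaults to "ab".length, i.e., 2
  val explicit     = repeat("ab")(3)   // the default is overridden
}
```

Writing both parameters in a single list, as in repeat(s: String, times: Int = s.length), is the form that the limitation above rules out.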
Finally, another constraint on named parameters is that once you provide a name for a parameter in a method invocation, the rest of the parameters appearing after it must also be named. For example, new OptionalUserProfileInfo(age = 29, "Earth") would not compile because the second argument is not specified by name.
We’ll see another useful example of named and default arguments when we discuss case classes in Case Classes.
Method definitions can also be nested. Here is an implementation of a factorial calculator, where we use a conventional technique of calling a second, nested method to do the work:
// code-examples/TypeLessDoMore/factorial-script.scala

def factorial(i: Int): Int = {
  def fact(i: Int, accumulator: Int): Int = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)
  }
  fact(i, 1)
}

println( factorial(0) )
println( factorial(1) )
println( factorial(2) )
println( factorial(3) )
println( factorial(4) )
println( factorial(5) )
The second method calls itself recursively, passing an accumulator parameter, where the result of the calculation is “accumulated.” Note that we return the accumulated value when the counter i reaches 1. (We’re ignoring invalid negative integers. The function actually returns 1 for i < 0.) After the definition of the nested method, factorial calls it with the passed-in value i and the initial accumulator value of 1.
Like a local variable declaration in many languages, a nested method is only visible inside the enclosing method. If you try to call fact outside of factorial, you will get a compiler error.
Did you notice that we use i as a parameter name twice, first in the factorial method and again in the nested fact method? As in many languages, the use of i as a parameter name for fact “shadows” the outer use of i as a parameter name for factorial. This is fine, because we don’t need the outer value of i inside fact. We only use it the first time we call fact, at the end of factorial.
What if we need to use a variable that is defined outside a nested function? Consider this contrived example:
// code-examples/TypeLessDoMore/count-to-script.scala

def countTo(n: Int): Unit = {
  def count(i: Int): Unit = {
    if (i <= n) {
      println(i)
      count(i + 1)
    }
  }
  count(1)
}

countTo(5)
Note that the nested count method uses the n value that is passed as a parameter to countTo. There is no need to pass n as an argument to count. Because count is nested inside countTo, n is visible to it.
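The same scoping rule can be seen in a variant that returns the numbers instead of printing them, which makes the result easier to inspect. This is a sketch, not the book's script; the CountSketch object and the seen parameter are introduced here for illustration.

```scala
// Hypothetical variant of countTo that collects the values it visits.
object CountSketch {
  def countTo(n: Int): List[Int] = {
    // 'count' reads n from the enclosing method; n is never passed to it.
    def count(i: Int, seen: List[Int]): List[Int] =
      if (i <= n) count(i + 1, i :: seen) else seen.reverse
    count(1, Nil)
  }
}
```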
The declaration of a field (member variable) can be prefixed with keywords indicating the visibility, just as in languages like Java and C#. Similarly the declaration of a non-nested method can be prefixed with the same keywords. We will discuss the visibility rules and keywords in Visibility Rules.
Statically typed languages can be very verbose. Consider this typical declaration in Java:
import java.util.Map;
import java.util.HashMap;
...
Map<Integer, String> intToStringMap = new HashMap<Integer, String>();
We have to specify the type parameters <Integer, String> twice. (Scala uses the term type annotations for explicit type declarations like HashMap<Integer, String>.)
Scala supports type inference (see, for example, [TypeInference] and [Pierce2002]). The language’s compiler can discern quite a bit of type information from the context, without explicit type annotations. Here’s the same declaration rewritten in Scala, with inferred type information:
import java.util.Map
import java.util.HashMap
...
val intToStringMap: Map[Integer, String] = new HashMap
Recall from Chapter 1 that Scala uses square brackets ([...]) for generic type parameters. We specify Map[Integer, String] on the lefthand side of the equals sign. (We are sticking with Java types for the example.) On the righthand side, we instantiate the actual type we want, a HashMap, but we don’t have to repeat the type parameters.
For completeness, suppose we don’t actually care if the instance is of type Map (the Java interface type). It can be of type HashMap for all we care:
import java.util.Map
import java.util.HashMap
...
val intToStringMap2 = new HashMap[Integer, String]
This declaration requires no type annotations on the lefthand side because all of the type information needed is on the righthand side. The compiler automatically makes intToStringMap2 a HashMap[Integer,String].
Type inference is used for methods, too. In most cases, the return type of the method can be inferred, so the : and return type can be omitted. However, type annotations are required for all method parameters.
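A short sketch of these rules, using hypothetical names: the parameter of square must be annotated, while its return type and the types of the two values are inferred.

```scala
// Hypothetical names, for illustration only.
object InferenceSketch {
  // The parameter type is mandatory; the Int return type is inferred.
  def square(x: Int) = x * x

  val words   = List("type", "less")       // inferred as List[String]
  val lengths = words.map(w => w.length)   // inferred as List[Int]
}
```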
Pure functional languages like Haskell (see, e.g., [O’Sullivan2009]) use type inference algorithms like Hindley-Milner (see [Spiewak2008] for an easily digested explanation). Code written in these languages requires type annotations less often than code written in Scala, because Scala’s type inference algorithm has to support object-oriented typing as well as functional typing. So, Scala requires more type annotations than languages like Haskell. The following discussion covers the cases where explicit type annotations are required in Scala.
The Any type is the root of the Scala type hierarchy (see The Scala Type Hierarchy for more details). If a block of code returns a value of type Any unexpectedly, chances are good that the type inferencer couldn’t figure out what type to return, so it chose the most generic type possible.
Let’s look at examples where explicit declarations of method return types are required. In the following script, the upCase method has a conditional return statement for zero-length strings:
// code-examples/TypeLessDoMore/method-nested-return-script.scala
// ERROR: Won't compile until you put a String return type on upCase.

def upCase(s: String) = {
  if (s.length == 0)
    return s
  else
    s.toUpperCase()
}

println( upCase("") )
println( upCase("Hello") )
Running this script gives you the following error:
... 6: error: method upCase has return statement; needs result type
        return s
        ^
You can fix this error by changing the first line of the method to the following:
def upCase(s: String): String = {
Actually, for this particular script, an alternative fix is to remove the return keyword from the line. It is not needed for the code to work properly, but it illustrates our point.
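The alternative fix can be sketched as follows (wrapped in a hypothetical UpCaseSketch object so that it stands alone). Without the return keyword, the if expression is the value of the method, and the String return type is inferred.

```scala
// Hypothetical wrapper object; upCase itself mirrors the script above.
object UpCaseSketch {
  // No 'return' keyword, so no explicit return type is needed.
  def upCase(s: String) =
    if (s.length == 0) s else s.toUpperCase()
}
```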
Recursive methods also require an explicit return type. Recall our factorial method in Nesting Method Definitions. Let’s remove the : Int return type on the nested fact method:
// code-examples/TypeLessDoMore/method-recursive-return-script.scala
// ERROR: Won't compile until you put an Int return type on "fact".

def factorial(i: Int) = {
  def fact(i: Int, accumulator: Int) = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)
  }
  fact(i, 1)
}

... 9: error: recursive method fact needs result type
        fact(i - 1, i * accumulator)
        ^
Overloaded methods can sometimes require an explicit return type. When one such method calls another, we have to add a return type to the one doing the calling, as in this example:
// code-examples/TypeLessDoMore/method-overloaded-return-script.scala
// Version 1 of "StringUtil" (with a compilation error).
// ERROR: Won't compile: needs a String return type on the second "joiner".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]) = joiner(strings, " ")
}

import StringUtil._  // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )
The two joiner methods concatenate a List of strings together. The first method also takes an argument for the separator string. The second method calls the first with a “default” separator of a single space.
If you run this script, you get the following error:
... 9: error: overloaded method joiner needs result type
        def joiner(strings: List[String]) = joiner(strings, " ")
                                            ^
Since the second joiner method calls the first, it requires an explicit String return type. It should look like this:
def joiner(strings: List[String]): String = joiner(strings, " ")
The final scenario can be subtle, when a more general return type is inferred than what you expected. You usually see this error when you assign a value returned from a function to a variable with a more specific type. For example, you were expecting a String, but the function inferred an Any for the returned object. Let’s see a contrived example that reflects a bug where this scenario can occur:
// code-examples/TypeLessDoMore/method-broad-inference-return-script.scala
// ERROR: Won't compile; the inferred return type of "makeList" is too general.

def makeList(strings: String*) = {
  if (strings.length == 0)
    List(0)  // #1
  else
    strings.toList
}

val list: List[String] = makeList()
Running this script returns the following error:
...11: error: type mismatch;
 found   : List[Any]
 required: List[String]
val list: List[String] = makeList()
                          ^
We intended for makeList to return a List[String], but when strings.length equals zero, we returned List(0), incorrectly “assuming” that this expression is the correct way to create an empty list. In fact, we returned a List[Int] with one element, 0. We should have returned List(). Since the else expression returns a List[String], the result of strings.toList, the inferred return type for the method is the closest common supertype of List[Int] and List[String], which is List[Any]. Note that the compilation error doesn’t occur in the function definition. We only see it when we attempt to assign the value returned from makeList to a List[String] variable.
In this case, fixing the bug is the solution. Alternatively, when there isn’t a bug, it may be that the compiler just needs the “help” of an explicit return type declaration. Investigate the method that appears to return the unexpected type. In our experience, you often find that you modified that method (or another one in the call path) in such a way that the compiler now infers a more general return type than necessary. Add the explicit return type in this case.
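A sketch of the repaired method, wrapped in a hypothetical MakeListSketch object so that it stands alone: the empty case returns List(), and the explicit return type records the intent.

```scala
// Hypothetical wrapper object; makeList is the corrected form of the script above.
object MakeListSketch {
  // With the declared return type, writing List(0) in the first branch
  // would be rejected inside the method rather than at the call site.
  def makeList(strings: String*): List[String] =
    if (strings.length == 0) List() else strings.toList

  val list: List[String] = makeList()
}
```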
Another way to prevent these problems is to always declare return types for methods, especially when defining methods for a public API. Let’s revisit our StringUtil example and see why explicit declarations are a good idea (adapted from [Smith2009a]).
Here is our StringUtil “API” again with a new method, toCollection:
// code-examples/TypeLessDoMore/string-util-v3.scala
// Version 3 of "StringUtil" (for all versions of Scala).

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]): String = strings.mkString(" ")

  def toCollection(string: String) = string.split(' ')
}
The toCollection method splits a string on spaces and returns an Array containing the substrings. The return type is inferred, which is a potential problem, as we will see. The method is somewhat contrived, but it will illustrate our point. Here is a client of StringUtil that uses this method:
// code-examples/TypeLessDoMore/string-util-client.scala

import StringUtil._

object StringUtilClient {
  def main(args: Array[String]) = {
    args foreach { s => toCollection(s).foreach { x => println(x) } }
  }
}
If you compile these files with scalac, you can run the client as follows:
$ scala -cp ... StringUtilClient "Programming Scala"
Programming
Scala
For the -cp ... class path argument, use the directory where scalac wrote the class files, which defaults to the current directory (i.e., use -cp .). If you used the build process in the downloaded code examples, the class files are written to the build directory (using scalac -d build ...). In this case, use -cp build.
Everything is fine at this point, but now imagine that the code base has grown. StringUtil and its clients are now built separately and bundled into different JARs. Imagine also that the maintainers of StringUtil decide to return a List instead of the default:
object StringUtil {
  ...
  def toCollection(string: String) = string.split(' ').toList  // changed!
}
The only difference is the final call to toList that converts the computed Array to a List. You recompile StringUtil and redeploy its JAR. Then you run the same client, without recompiling it first:
$ scala -cp ... StringUtilClient "Programming Scala"
java.lang.NoSuchMethodError: StringUtil$.toCollection(...
  at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
  at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
...
What happened? When the client was compiled, StringUtil.toCollection returned an Array. Then toCollection was changed to return List. In both versions, the method return value was inferred. Because the inferred return type is part of the compiled method signature, the client, which was compiled against the Array version, could no longer find the method it expected. Therefore, the client should have been recompiled, too.
However, had an explicit return type of Seq been declared, which is a parent for both Array and List, then the implementation change would not have forced a recompilation of the client.
When developing APIs that are built separately from their clients, declare method return types explicitly and use the most general return type you can. This is especially important when APIs declare abstract methods (see, e.g., Chapter 4).
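A minimal sketch of this advice follows, using a hypothetical StringUtilSketch object (renamed so that it does not clash with the StringUtil versions above). The declared Seq[String] return type is what clients compile against, so the collection built inside the method can change without altering the signature.

```scala
// Hypothetical variant of StringUtil with an explicit, general return type.
object StringUtilSketch {
  // Clients see only Seq[String]; the concrete collection is an implementation detail.
  def toCollection(string: String): Seq[String] =
    string.split(' ').toList
}
```

Choosing Seq here is an assumption made for the example; the appropriate general type depends on which operations the API intends to promise its clients.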
There is another scenario to watch for when using declarations of collections like val map = Map(), as in the following example:
val map = Map()
map.update("book", "Programming Scala")

... 3: error: type mismatch;
 found   : java.lang.String("book")
 required: Nothing
map.update("book", "Programming Scala")
            ^
What happened? The type parameters of the generic type Map were inferred as [Nothing,Nothing] when the map was created. (We’ll discuss Nothing in The Scala Type Hierarchy, but its name is suggestive!) We attempted to insert an incompatible key-value pair of types String and String. Call it a Map to nowhere! The solution is to parameterize the initial map declaration, e.g., val map = Map[String, String](), or to specify initial values so that the map parameters are inferred, e.g., val map = Map("Programming" -> "Scala").
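Both solutions can be sketched as follows. One assumption to note: this sketch uses scala.collection.mutable.Map for the first case, so that update modifies the map in place; the object and value names are hypothetical.

```scala
import scala.collection.mutable

// Hypothetical names; mutable.Map is an assumption made for this sketch.
object MapSketch {
  // Explicit type parameters, so String keys and values are accepted.
  val books = mutable.Map[String, String]()
  books.update("book", "Programming Scala")

  // Initial values let the compiler infer Map[String, String].
  val inferred = Map("Programming" -> "Scala")
}
```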
Finally, there is a subtle behavior with inferred return types that can cause unexpected and baffling results (see [ScalaTips]). Consider the following example scala session:
scala> def double(i: Int) { 2 * i }
double: (Int)Unit

scala> println(double(2))
()
Why did the second command print () instead of 4? Look carefully at what the scala interpreter said the first command returned: double: (Int)Unit. We defined a method named double that takes an Int argument and returns Unit. The method doesn’t return an Int as we would expect.
The cause of this unexpected behavior is a missing equals sign in the method definition. Here is the definition we actually intended:
scala> def double(i: Int) = { 2 * i }
double: (Int)Int

scala> println(double(2))
4
Note the equals sign before the body of double. Now, the output says we have defined double to return an Int and the second command does what we expect it to do.
There is a reason for this behavior. Scala regards a method with the equals sign before the body as a function definition and a function always returns a value in functional programming. On the other hand, when Scala sees a method body without the leading equals sign, it assumes the programmer intended the method to be a “procedure” definition, meant for performing side effects only with the return value Unit. In practice, it is more likely that the programmer simply forgot to insert the equals sign!
When the return type of a method is inferred and you don’t use an equals sign before the opening curly brace for the method body, Scala infers a Unit return type, even when the last expression in the method is a value of another type.
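One way to avoid the ambiguity, sketched below with hypothetical method names, is to write the equals sign and state the return type in both cases. A method meant only for side effects then says Unit explicitly, and a forgotten equals sign can no longer change the meaning silently.

```scala
// Hypothetical names, for illustration only.
object ReturnTypeSketch {
  // A value-returning method: equals sign plus explicit Int return type.
  def double(i: Int): Int = 2 * i

  // A side-effecting method: the Unit return type is written out.
  def printDouble(i: Int): Unit = println(2 * i)
}
```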
By the way, where did that () come from that was printed before we fixed the bug? It is actually the real name of the singleton instance of the Unit type! (This name is a functional programming convention.)
Often, a new object is initialized with a literal value, such as val book = "Programming Scala". Let’s discuss the kinds of literal values supported by Scala. Here, we’ll limit ourselves to lexical syntax literals. We’ll cover literal syntax for functions (used as values, not member methods), tuples, and certain types like Lists and Maps as we come to them.
Integer literals can be expressed in decimal, hexadecimal, or octal. The details are summarized in Table 2-1.
Kind | Format | Examples |
Decimal | 0 or a nonzero digit followed by zero or more digits (0–9) | 0, 1, 321 |
Hexadecimal | 0x followed by one or more hexadecimal digits (0–9, A–F, a–f) | 0xFF, 0x1a3b |
Octal | 0 followed by one or more octal digits (0–7) | 013, 077 |
For Long literals, it is necessary to append the L or l character at the end of the literal. Otherwise, an Int is used. The valid values for an integer literal are bounded by the type of the variable to which the value will be assigned. Table 2-2 defines the limits, which are inclusive.
Target type | Minimum (inclusive) | Maximum (inclusive) |
Long | −2^63 | 2^63 − 1 |
Int | −2^31 | 2^31 − 1 |
Short | −2^15 | 2^15 − 1 |
Char | 0 | 2^16 − 1 |
Byte | −2^7 | 2^7 − 1 |
A compile-time error occurs if an integer literal number is specified that is outside these ranges, as in the following examples:
scala> val i = 12345678901234567890
<console>:1: error: integer number too large
       val i = 12345678901234567890

scala> val b: Byte = 128
<console>:4: error: type mismatch;
 found   : Int(128)
 required: Byte
       val b: Byte = 128
                     ^

scala> val b: Byte = 127
b: Byte = 127
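The rules for integer literals can be sketched as follows. The names are hypothetical, and the sketch leaves out octal literals.

```scala
// Hypothetical names, for illustration only.
object IntegerLiteralSketch {
  val hex     = 0xFF           // hexadecimal literal; the inferred type is Int
  val big     = 12345678901L   // beyond the Int range, so the L suffix is required
  val b: Byte = 127            // the literal fits the Byte range, so this compiles
}
```

The limits in Table 2-2 are also available as constants such as Int.MaxValue and Byte.MinValue.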
Floating-point literals are expressions with zero or more digits, followed by a period (.), followed by zero or more digits. If there are no digits before the period, i.e., the number is less than 1.0, then there must be one or more digits after the period. For Float literals, append the F or f character at the end of the literal. Otherwise, a Double is assumed. You can optionally append a D or d for a Double.
Floating-point literals can be expressed with or without exponentials. The format of the exponential part is e or E, followed by an optional + or -, followed by one or more digits.
Here are some example floating-point literals:
0.
.0
0.0
3.
3.14
.14
0.14
3e5
3E5
3.E5
3.e5
3.e+5
3.e-5
3.14e-5
3.14e-5f
3.14e-5F
3.14e-5d
3.14e-5D
Float consists of all IEEE 754 32-bit, single-precision binary floating-point values. Double consists of all IEEE 754 64-bit, double-precision binary floating-point values.
To avoid parsing ambiguities, you must have at least one space after a floating-point literal, if it is followed by a token that starts with a letter. Also, the expression 1.toString returns the integer value 1 as a string, while 1. toString uses the operator notation to invoke toString on the floating-point literal 1. (with its trailing period).
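A short sketch of the suffix and exponent rules, using hypothetical names. It sticks to literals that have digits on both sides of the period, or no period at all.

```scala
// Hypothetical names, for illustration only.
object FloatLiteralSketch {
  val d     = 3.14e-5      // no suffix: the type is Double
  val f     = 3.14e-5f     // f suffix: the type is Float
  val large = 3e5          // exponent form of 300000.0, a Double
  val text  = 1.toString   // toString invoked on the integer 1
}
```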
The boolean literals are true and false. The type of the variable to which they are assigned will be inferred to be Boolean:
scala> val b1 = true
b1: Boolean = true

scala> val b2 = false
b2: Boolean = false
A character literal is either a printable Unicode character or an escape sequence, written between single quotes. A character with a Unicode value between 0 and 255 may also be represented by an octal escape, i.e., a backslash (\) followed by a sequence of up to three octal characters. It is a compile-time error if a backslash character in a character or string literal does not start a valid escape sequence.
'A'
'\u0041'  // 'A' in Unicode
'\n'
'\012'    // '\n' in octal
'\t'
The valid escape sequences are shown in Table 2-3.
A string literal is a sequence of characters enclosed in double quotes or triples of double quotes, i.e., """...""".
For string literals in double quotes, the allowed characters are the same as the character literals. However, if a double quote " character appears in the string, it must be “escaped” with a \ character. Here are some examples:
"Programming\nScala"
"He exclaimed, \"Scala is great!\""
"First\tSecond"
The string literals bounded by triples of double quotes are also called multi-line string literals. These strings can cover several lines; the line feeds will be part of the string. They can include any characters, including one or two double quotes together, but not three together. They are useful for strings with \ characters that don’t form valid Unicode or escape sequences, like the valid sequences listed in Table 2-3. Regular expressions are a typical example, which we’ll discuss in Chapter 3. However, if escape sequences appear, they aren’t interpreted.
Here are three example strings:
"""Programming\nScala"""
"""He exclaimed, "Scala is great!" """
"""First line\n
Second line\t

Fourth line"""
Note that we had to add a space before the trailing """ in the second example to prevent a parse error. Trying to escape the second " that ends the "Scala is great!" quote, i.e., "Scala is great!\", doesn’t work.
Copy and paste these strings into the scala interpreter. Do the same for the previous string examples. How are they interpreted differently?
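The difference can also be checked in code. In this sketch (hypothetical names), the same characters are written once in double quotes and once in triple quotes, and only the first form treats \t as a tab.

```scala
// Hypothetical names, for illustration only.
object StringLiteralSketch {
  val escaped = "First\tSecond"       // \t is interpreted as a single tab character
  val raw     = """First\tSecond"""   // the backslash and the 't' stay as two characters
  val quoted  = """He exclaimed, "Scala is great!" """  // note the space before the closing quotes
}
```

Comparing escaped.length with raw.length shows the effect: the triple-quoted form is one character longer.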
Scala supports symbols, which are interned strings, meaning that two symbols with the same “name” (i.e., the same character sequence) will actually refer to the same object in memory. Symbols are used less often in Scala than in some other languages, like Ruby, Smalltalk, and Lisp. They are useful as map keys instead of strings.
A symbol literal is a single quote ('), followed by a letter, followed by zero or more digits and letters. Note that an expression like '1 is invalid, because the compiler thinks it is an incomplete character literal.
A symbol literal 'id is a shorthand for the expression scala.Symbol("id").
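Interning can be observed with a reference comparison. This sketch uses the scala.Symbol("id") form mentioned above rather than the literal syntax; the object and value names are hypothetical.

```scala
// Hypothetical names, for illustration only.
object SymbolSketch {
  val first  = Symbol("id")
  val second = Symbol("id")

  // 'eq' compares references: both names refer to the same interned object.
  val sameInstance = first eq second
}
```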
How many times have you wanted to return two or more values from a method? In many languages, like Java, you only have a few options, none of which is very appealing. You could pass in parameters to the method that will be modified for all or some of the “return” values, which is ugly. Or you could declare some small “structural” class that holds the two or more values, then return an instance of that class.
Scala supports tuples, a grouping of two or more items, usually created with the literal syntax of a comma-separated list of the items inside parentheses, e.g., (x1, x2, ...). The types of the xi elements are unrelated to each other; you can mix and match types. These literal “groupings” are instantiated as scala.TupleN instances, where N is the number of items in the tuple. The Scala API defines separate TupleN classes for N between 1 and 22, inclusive. Tuple instances are immutable, first-class values, so you can assign them to variables, pass them as values, and return them from methods.
The following example demonstrates the use of tuples:
// code-examples/TypeLessDoMore/tuple-example-script.scala

def tupleator(x1: Any, x2: Any, x3: Any) = (x1, x2, x3)

val t = tupleator("Hello", 1, 2.3)
println( "Print the whole tuple: " + t )
println( "Print the first item: " + t._1 )
println( "Print the second item: " + t._2 )
println( "Print the third item: " + t._3 )

val (t1, t2, t3) = tupleator("World", '!', 0x22)
println( t1 + " " + t2 + " " + t3 )
Running this script with scala
produces the following output:
Print the whole tuple: (Hello,1,2.3)
Print the first item: Hello
Print the second item: 1
Print the third item: 2.3
World ! 34
The tupleator method simply returns a “3-tuple” with the input arguments. The first statement that uses this method assigns the returned tuple to a single variable t. The next four statements print t in various ways. The first print statement calls Tuple3.toString, which wraps parentheses around the item list. The following three statements print each item in t separately. The expression t._N retrieves the Nth item, starting at 1, not 0 (this choice follows functional programming conventions).
The last two lines show that we can use
a tuple expression on the lefthand side of the assignment. We declare
three val
s—t1
,
t2
, and t3
—to hold the individual
items in the tuple. In essence, the tuple items are extracted
automatically.
Notice how we mixed types
in the tuples. You can see the types more clearly if you use the
interactive mode of the scala
command, which we
introduced in Chapter 1.
Invoke the
scala
command with no script argument. At the
scala>
prompt, enter val t = ("Hello",1,2.3)
and see that you
get the following result, which shows you the type of each element in the
tuple:
scala> val t = ("Hello",1,2.3)
t: (java.lang.String, Int, Double) = (Hello,1,2.3)
It’s worth noting that there’s more than one way to define a tuple. We’ve been using the more common parenthesized syntax, but you can also use the arrow operator between two values, as well as special factory methods on the tuple-related classes:
scala> 1 -> 2
res0: (Int, Int) = (1,2)

scala> Tuple2(1, 2)
res1: (Int, Int) = (1,2)

scala> Pair(1, 2)
res2: (Int, Int) = (1,2)
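Returning to the question that opened this section, here is a minimal sketch of a method that returns two values at once. The method name divMod and the value names are hypothetical, chosen only for illustration.

```scala
// Returns the quotient and the remainder together as a 2-tuple.
def divMod(dividend: Int, divisor: Int): (Int, Int) =
  (dividend / divisor, dividend % divisor)

// The tuple on the lefthand side extracts both results in one step.
val (quotient, remainder) = divMod(17, 5)
println(quotient)   // 3
println(remainder)  // 2

// The result can also be kept whole and accessed by position.
val both = divMod(17, 5)
println(both._1 + " remainder " + both._2)  // 3 remainder 2
```

No helper class and no modifiable “output” parameters are needed; the declared return type (Int, Int) documents what the caller gets back.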
We’ll discuss the standard
type hierarchy for Scala in The Scala Type Hierarchy.
However, three useful classes to understand now are the
Option
class and its two subclasses,
Some
and None
.
Most languages have a
special keyword or object that’s assigned to reference variables when
there’s nothing else for them to refer to. In Java, this is
null
; in Ruby, it’s nil
. In Java,
null
is a keyword, not an object, and thus it’s illegal
to call any methods on it. But this is a confusing choice on the language
designer’s part. Why return a keyword when the programmer expects an
object?
To be more consistent with
the goal of making everything an object, as well as to conform with
functional programming conventions, Scala encourages you to use the
Option
type for variables and function return values
when they may or may not refer to a value. When there is no value, use
None
, an object
that is a subclass
of Option
. When there is a value, use
Some
, which wraps the value. Some
is
also a subclass of Option
.
None
is declared as an
object
, not a class
, because we
really only need one instance of it. In that sense, it’s like the
null
keyword, but it is a real object with
methods.
You can see
Option
, Some
, and
None
in action in the following example, where we
create a map of state capitals in the United States:
// code-examples/TypeLessDoMore/state-capitals-subset-script.scala

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska" -> "Juneau",
  // ...
  "Wyoming" -> "Cheyenne")

println( "Get the capitals wrapped in Options:" )
println( "Alabama: " + stateCapitals.get("Alabama") )
println( "Wyoming: " + stateCapitals.get("Wyoming") )
println( "Unknown: " + stateCapitals.get("Unknown") )

println( "Get the capitals themselves out of the Options:" )
println( "Alabama: " + stateCapitals.get("Alabama").get )
println( "Wyoming: " + stateCapitals.get("Wyoming").getOrElse("Oops!") )
println( "Unknown: " + stateCapitals.get("Unknown").getOrElse("Oops2!") )
The convenient
->
syntax for defining name-value pairs to
initialize a Map
will be discussed in The Predef Object. For now, we want to focus on the two groups of
println
statements, where we show what happens when you
retrieve the values from the map. If you run this script with the
scala
command, you’ll get the following output:
Get the capitals wrapped in Options:
Alabama: Some(Montgomery)
Wyoming: Some(Cheyenne)
Unknown: None
Get the capitals themselves out of the Options:
Alabama: Montgomery
Wyoming: Cheyenne
Unknown: Oops2!
The first group of println statements invokes toString implicitly on the instances returned by get. We are calling toString on Some or None instances because the values returned by Map.get are automatically wrapped in a Some, when there is a value in the map for the specified key. Note that the Scala library doesn’t store the Some in the map; it wraps the value in a Some upon retrieval. Conversely, when we ask for a map entry that doesn’t exist, the None object is returned, rather than null. This occurred in the last println of the three.
The second group of
println
statements goes a step further. After calling
Map.get
, they call get
or
getOrElse
on each Option
instance to
retrieve the value it contains. Option.get
requires
that the Option
is not empty—that is, the
Option
instance must actually be a
Some
. In this case, get
returns the
value wrapped by the Some
, as demonstrated in the
println
where we print the capital of Alabama. However,
if the Option
is
actually None
, then
None.get
throws a
NoSuchElementException
.
We also show the alternative method, getOrElse, in the last two println statements. This method returns either the value in the Option, if it is a Some instance, or it returns the argument we passed to getOrElse, if it is a None instance. In other words, the argument to getOrElse functions as the default return value.
So,
getOrElse
is the more defensive of the two methods. It
avoids a potential thrown exception. We’ll discuss the merits of
alternatives like get
versus
getOrElse
in Exceptions and the Alternatives.
Note that because the
Map.get
method returns an Option
, it
automatically documents the fact that there may not be an item matching
the specified key. The map handles this situation by returning a
None
. Most languages would return
null
(or the equivalent) when there is no “real” value
to return. You learn from experience to expect a possible
null
. Using Option
makes the
behavior more explicit in the method signature, so it’s more
self-documenting.
Also, thanks to Scala’s static typing, you can’t make the mistake of attempting to call a method on a value that might actually be null. While this mistake is easy to make in Java, it won’t compile in Scala because you must first extract the value from the Option. So, the use of Option strongly encourages more resilient programming.
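To make this concrete, here is a small sketch of working with the Option returned by Map.get. The map contents and the names capitals, known, unknown, and describe are ours; the match expression it uses is covered in detail later in the book.

```scala
val capitals = Map("Alabama" -> "Montgomery", "Wyoming" -> "Cheyenne")

// capitals.get("Alabama").toUpperCase would not compile:
// the result is an Option[String], not a String.

// map applies the function only when a value is present.
val known: Option[String] = capitals.get("Alabama").map(city => city.toUpperCase)
val unknown: Option[String] = capitals.get("Alaska").map(city => city.toUpperCase)
println(known)    // Some(MONTGOMERY)
println(unknown)  // None

// A match expression handles both cases explicitly.
def describe(state: String): String = capitals.get(state) match {
  case Some(city) => state + ": " + city
  case None       => state + ": no capital on record"
}
println(describe("Wyoming"))  // Wyoming: Cheyenne
println(describe("Alaska"))   // Alaska: no capital on record
```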
Because Scala runs on the
JVM and .NET and because it must interoperate with other libraries, Scala
has to support null
. Still, you should avoid using
null
in your code. Tony Hoare, who invented the null
reference in 1965 while working on an object-oriented language called
ALGOL W, called its invention his “billion dollar mistake” (see [Hoare2009]). Don’t
contribute to that figure.
So, how would you write a
method that returns an Option
? Here is a possible
implementation of get
that could be used by a concrete
subclass of Map
(Map.get
itself is
abstract). For a more sophisticated version, see the
implementation of get
in
scala.collection.immutable.HashMap
in the Scala
library source code distribution:
def get(key: A): Option[B] = {
  if (contains(key))
    new Some(getValue(key))
  else
    None
}
The
contains
method is also defined for
Map
. It returns true
if the map
contains a value for the specified key. The getValue
method is intended to be an internal method that retrieves the value from
the underlying storage, whatever it is.
Note how the value returned
by getValue
is wrapped in a Some[B]
,
where the type B
is inferred. However, if the call to
contains(key)
returns false
, then
the object None
is returned.
You can use this same idiom
when your methods return an Option
. We’ll explore other
uses for Option
in subsequent sections. Its pervasive
use in Scala code makes it an important concept to grasp.
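The idiom can be tried out in a self-contained form. The following sketch is not the library’s implementation; PairStore is a hypothetical class of ours that keeps its entries in a list of pairs, with contains and getValue written to match the roles described above.

```scala
// A hypothetical, deliberately simple key-value store backed by a list of pairs.
class PairStore[A, B](pairs: List[(A, B)]) {

  // True if some pair has the given key.
  def contains(key: A): Boolean = pairs.exists(pair => pair._1 == key)

  // Internal lookup; only safe to call after contains(key) returned true.
  private def getValue(key: A): B = pairs.find(pair => pair._1 == key).get._2

  // The idiom from the text: wrap a found value in Some, otherwise return None.
  def get(key: A): Option[B] =
    if (contains(key)) Some(getValue(key)) else None
}

val store = new PairStore(List("one" -> 1, "two" -> 2))
println(store.get("two"))    // Some(2)
println(store.get("three"))  // None
```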
Scala adopts the package concept that Java uses for namespaces, but Scala offers a more flexible syntax. Just as file names don’t have to match the type names, the package structure does not have to match the directory structure. So, you can define packages in files independent of their “physical” location.
The following example
defines a class MyClass
in a package
com.example.mypkg
using the conventional Java
syntax:
// code-examples/TypeLessDoMore/package-example1.scala

package com.example.mypkg

class MyClass {
  // ...
}
The next example shows a
contrived example that defines packages using the nested package syntax in
Scala, which is similar to the namespace
syntax in C#
and the use of modules
as namespaces in Ruby:
// code-examples/TypeLessDoMore/package-example2.scala

package com {
  package example {
    package pkg1 {
      class Class11 {
        def m = "m11"
      }
      class Class12 {
        def m = "m12"
      }
    }

    package pkg2 {
      class Class21 {
        def m = "m21"
        def makeClass11 = {
          new pkg1.Class11
        }
        def makeClass12 = {
          new pkg1.Class12
        }
      }
    }

    package pkg3.pkg31.pkg311 {
      class Class311 {
        def m = "m21"
      }
    }
  }
}
Two packages,
pkg1
and pkg2
, are defined under the
com.example
package. A total of three classes are defined between the two
packages. The makeClass11
and makeClass12
methods in
Class21
illustrate how to reference a type in the
“sibling” package,
pkg1
. You can also reference these classes by
their full paths, com.example.pkg1.Class11
and
com.example.pkg1.Class12
, respectively.
The package
pkg3.pkg31.pkg311
shows that you can “chain” several
packages together in one clause. It is not necessary to use a separate
package
clause for each package.
Following the conventions
of Java, the root package for Scala’s library classes is named
scala
.
Scala does not allow package declarations in scripts that are
executed directly with the scala
interpreter. The
reason has to do with the way the interpreter converts statements in
scripts to valid Scala code before
compiling to byte code. See The scala Command-Line Tool for more details.
To use declarations in packages, you have to import them, just as you do in Java and similarly for other languages. However, compared to Java, Scala greatly expands your options. The following example illustrates several ways to import Java types:
// code-examples/TypeLessDoMore/import-example1.scala

import java.awt._
import java.io.File
import java.io.File._
import java.util.{Map, HashMap}
You can import all types in
a package, using the underscore (_
) as a wildcard, as
shown on the first line. You can also import individual Scala or Java
types, as shown on the second line.
Java uses the “star” character
(*
) as the wildcard for matching all types in a package
or all static members of a type when doing “static imports.” In Scala,
this character is allowed in method names, so _
is used
as a wildcard, as we saw previously.
As shown on the third line,
you can import all the static methods and fields in Java types. If
java.io.File
were actually a Scala
object
, as discussed previously, then this line would
import the fields and methods from the object.
Finally, you can selectively
import just the types you care about. On the fourth line, we import just
the java.util.Map
and
java.util.HashMap
types from the
java.util
package. Compare this one-line import
statement with the two-line import statements we used in our first example
in Inferring Type Information. They are functionally equivalent.
The next example shows more advanced options for import statements:
// code-examples/TypeLessDoMore/import-example2-script.scala

def writeAboutBigInteger() = {

  import java.math.BigInteger.{
    ONE => _,
    TEN,
    ZERO => JAVAZERO }

  // ONE is effectively undefined
  // println( "ONE: "+ONE )
  println( "TEN: "+TEN )
  println( "ZERO: "+JAVAZERO )
}

writeAboutBigInteger()
This example demonstrates
two features. First, we can put import statements almost anywhere we want,
not just at the top of the file, as required by Java. This feature allows
us to scope the imports more narrowly. For example, we can’t reference the
imported BigInteger
definitions outside the scope of
the method. Another advantage of this feature is that it puts an import
statement closer to where the imported items are actually used.
The second feature shown is
the ability to rename imported items. First, the
java.math.BigInteger.ONE
constant is renamed to the
underscore wildcard. This effectively makes it invisible and unavailable
to the importing scope. This is a useful technique when you want to import
everything except a few particular items.
Next, the
java.math.BigInteger.TEN
constant is imported without
renaming, so it can be referenced simply as TEN
.
Finally, the
java.math.BigInteger.ZERO
constant is given the “alias”
JAVAZERO
.
Aliasing is useful if you want to give the item a more convenient name or you want to avoid ambiguities with other items in scope that have the same name.
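As a further sketch of a scoped, renaming import, the following method imports java.util.HashMap under an alias so that it cannot be confused with Scala’s own map types. The method name distinctWordCount and the alias JHashMap are ours.

```scala
def distinctWordCount(words: List[String]): Int = {
  // Local import: the alias JHashMap is visible only inside this method.
  import java.util.{HashMap => JHashMap}

  val counts = new JHashMap[String, Int]()
  for (word <- words) {
    val previous = if (counts.containsKey(word)) counts.get(word) else 0
    counts.put(word, previous + 1)
  }
  counts.size
}

println(distinctWordCount(List("to", "be", "or", "not", "to", "be")))  // 4
```

Outside the method, the name JHashMap is undefined, which keeps the alias from leaking into the rest of the file.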
There’s one other important thing to know about imports: they are relative. Note the comments for the following imports:
// code-examples/TypeLessDoMore/relative-imports.scala

import scala.collection.mutable._
import collection.immutable._            // Since "scala" is already imported
import _root_.scala.collection.jcl._     // full path from real "root"

package scala.actors {
  import remote._                        // We're in the scope of "scala.actors"
}
Note that the last import statement nested in the scala.actors package scope is relative to that scope.
The [ScalaWiki] has other examples at http://scala.sygneca.com/faqs/language#how-do-i-import.
You will rarely have problems with relative imports, but they sometimes cause surprises, especially if you are accustomed to languages like Java, where imports are absolute. If you get a mystifying compiler error that a package wasn’t found, check that the import path is correct relative to the names already in scope (for example, from an earlier import or an enclosing package), or add the _root_. prefix. Also, you might see an IDE or other tool insert an import _root_... statement in your code. Now you know what it means.
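The two styles can be compared side by side in a short sketch that uses only collection types from the standard library. The value names buffer and ordered are ours.

```scala
import scala.collection.mutable
// Relative: "mutable" resolves against the name brought in by the previous import.
import mutable.ListBuffer
// Absolute: _root_ forces resolution from the top of the package hierarchy.
import _root_.scala.collection.immutable.ListMap

val buffer = ListBuffer(1, 2)
buffer += 3
println(buffer.toList)  // List(1, 2, 3)

val ordered = ListMap("one" -> 1, "two" -> 2)
println(ordered("two"))  // 2
```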
We mentioned in A Taste of Scala that Scala supports parameterized
types, which are very similar to generics
in Java. (We could use the two terms interchangeably, but it’s more common
to use “parameterized types” in the Scala community and “generics” in the
Java community.) The most obvious difference is in the syntax, where Scala
uses square brackets ([...]
), while Java uses angle
brackets (<...>
).
For example, a list of strings would be declared as follows:
val languages: List[String] = ...
There are other important differences with Java’s generics, which we’ll explore in Understanding Parameterized Types.
For now, we’ll mention one
other useful detail that you’ll encounter before we can explain it in
depth in Chapter 12. If you look at the
declaration of scala.List
in the Scaladocs, you’ll see
that the declaration is written as ... class List[+A]
.
The + in front of the A means that List[B] is a subtype of List[A] for any B that is a subtype of A. If there is a - in front of a type parameter, then the relationship goes the other way: if the declaration is Foo[-A], then Foo[B] is a supertype of Foo[A] for any B that is a subtype of A.
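Both annotations can be seen at work in a brief sketch. List comes from the standard library, while Describer is a hypothetical type of ours, declared with -A purely to illustrate the second case.

```scala
// List is declared as List[+A], so a List[String] can be used as a List[Any].
val strings: List[String] = List("Scala", "Java")
val anything: List[Any] = strings

// Describer is a hypothetical contravariant type, declared with -A.
abstract class Describer[-A] {
  def describe(item: A): String
}

val anyDescriber: Describer[Any] = new Describer[Any] {
  def describe(item: Any): String = "item: " + item
}

// Because of -A, a Describer[Any] can be used as a Describer[String].
val stringDescriber: Describer[String] = anyDescriber
println(stringDescriber.describe("Scala"))  // item: Scala
```

Without the + or -, neither assignment between the differently parameterized types would compile.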
Scala supports another type abstraction mechanism called abstract types, used in many functional programming languages, such as Haskell. Abstract types were also considered for inclusion in Java when generics were adopted. We want to introduce them now because you’ll see many examples of them before we dive into their details in Chapter 12. For a very detailed comparison of these two mechanisms, see [Bruce1998].
Abstract types can be applied to many of the same design problems for which parameterized types are used. However, while the two mechanisms overlap, they are not redundant. Each has strengths and weaknesses for certain design problems.
Here is an example that uses an abstract type:
// code-examples/TypeLessDoMore/abstract-types-script.scala

import java.io._

abstract class BulkReader {
  type In
  val source: In
  def read: String
}

class StringBulkReader(val source: String) extends BulkReader {
  type In = String
  def read = source
}

class FileBulkReader(val source: File) extends BulkReader {
  type In = File
  def read = {
    val in = new BufferedInputStream(new FileInputStream(source))
    val numBytes = in.available()
    val bytes = new Array[Byte](numBytes)
    in.read(bytes, 0, numBytes)
    new String(bytes)
  }
}

println( new StringBulkReader("Hello Scala!").read )
println( new FileBulkReader(new File("abstract-types-script.scala")).read )
Running this script with
scala
produces the following output:
Hello Scala!
import java.io._

abstract class BulkReader {
...
The
BulkReader
abstract class declares
three abstract members: a type
named
In
, a val
field
source
, and a read
method. As in
Java, instances in Scala can only be created from
concrete classes, which must have definitions for all
members.
The derived classes,
StringBulkReader
and FileBulkReader
,
provide concrete definitions for these abstract members. We’ll cover the
details of class declarations in Chapter 5 and the particulars of
overriding member declarations in Overriding Members of Classes and Traits in
Chapter 6.
For now, note that the
type
field works very much like a type parameter in a
parameterized type. In fact, we could rewrite this example as follows,
where we show only what would be different:
abstract class BulkReader[In] {
  val source: In
  ...
}

class StringBulkReader(val source: String) extends BulkReader[String] {...}

class FileBulkReader(val source: File) extends BulkReader[File] {...}
Just as for parameterized
types, if we define the In
type to be
String
, then the source
field must
also be defined as a String
. Note that the
StringBulkReader
’s read
method
simply returns the source
field, while the
FileBulkReader
’s read
method reads
the contents of the file.
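For readers who want to run the parameterized variant, here is a complete sketch limited to the string-based reader. It reuses the class names from the example above and fills in only the members already shown there.

```scala
// The abstract type member "In" becomes a type parameter.
abstract class BulkReader[In] {
  val source: In
  def read: String
}

// The concrete type is supplied in the extends clause instead of a type definition.
class StringBulkReader(val source: String) extends BulkReader[String] {
  def read = source
}

println(new StringBulkReader("Hello Scala!").read)  // Hello Scala!
```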
As demonstrated by [Bruce1998], parameterized types tend to be best for collections, which is how they are most often used in Java code, whereas abstract types are most useful for type “families” and other type scenarios.
We’ll explore the details of Scala’s abstract types in Chapter 12. For example, we’ll see how to constrain the possible concrete types that can be used.
Table 2-4 lists the reserved words in Scala, which we sometimes call “keywords,” and briefly describes how they are used (see [ScalaSpec2009]).
Word | Description | See … |
abstract | Makes a declaration abstract. Unlike Java, the keyword is usually not required for abstract members. | |
case | Start a case clause in a match expression. | |
catch | Start a clause for catching thrown exceptions. | |
class | Start a class declaration. | |
def | Start a method declaration. | |
do | Start a do...while loop. | |
else | Start an else clause for an if clause. | |
extends | Indicates that the class or trait that follows is the parent type of the class or trait being declared. | |
false | Boolean false. | |
final | Applied to a class or trait to prohibit deriving child types from it. Applied to a member to prohibit overriding it in a derived class or trait. | |
finally | Start a clause that is executed after the corresponding try clause. | |
for | Start a for comprehension (loop). | |
forSome | Used in existential type declarations to constrain the allowed concrete types that can be used. | |
if | Start an if clause. | |
implicit | Marks a method as eligible to be used as an implicit type converter. Marks a method parameter as optional, as long as a type-compatible substitute object is in the scope where the method is called. | |
import | Import one or more types or members of types into the current scope. | |
lazy | Defer evaluation of a val. | |
match | Start a pattern matching clause. | |
new | Create a new instance of a class. | |
null | Value of a reference variable that has not been assigned a value. | |
object | Start a singleton declaration: a class with only one instance. | |
override | Override a concrete member of a class or trait, as long as the original is not marked final. | |
package | Start a package scope declaration. | |
private | Restrict visibility of a declaration. | |
protected | Restrict visibility of a declaration. | |
requires | Deprecated. Was used for self typing. | |
return | Return from a function. | |
sealed | Applied to a parent class to require all directly derived classes to be declared in the same source file. | |
super | Analogous to this, but binds to the parent type. | |
this | How an object refers to itself. The method name for auxiliary constructors. | |
throw | Throw an exception. | |
trait | A mixin module that adds additional state and behavior to an instance of a class. | |
try | Start a block that may throw an exception. | |
true | Boolean true. | |
type | Start a type declaration. | |
val | Start a read-only “variable” declaration. | |
var | Start a read-write variable declaration. | |
while | Start a while loop. | |
with | Include the trait that follows in the class being declared or the object being instantiated. | |
yield | Return an element in a for comprehension. | |
_ | A placeholder, used in imports, function literals, etc. | Many |
: | Separator between identifiers and type annotations. | |
= | Assignment. | |
=> | Used in function literals to separate the argument list from the function body. | |
<- | Used in for comprehensions in generator expressions. | |
<: | Used in parameterized and abstract type declarations to constrain the allowed types. | |
<% | Used in parameterized and abstract type “view bounds” declarations. | |
>: | Used in parameterized and abstract type declarations to constrain the allowed types. | |
# | Used in type projections. | |
@ | Marks an annotation. | |
⇒ | (Unicode \u21D2) Same as =>. | |
← | (Unicode \u2190) Same as <-. | |
Notice that break
and continue
are not listed. These control keywords
don’t exist in Scala. Instead, Scala encourages you to use functional
programming idioms that are usually more succinct and less error-prone.
We’ll discuss alternative approaches when we discuss
for
loops (see Generator Expressions).
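As a preview, the following sketch shows two common substitutes: a guard in a for expression where other languages would use continue, and the find method where they would use break. The list contents and value names are ours.

```scala
val numbers = List(3, 8, 5, 12, 7)

// Instead of "continue": a guard skips the elements we don't want.
val evens = for (n <- numbers if n % 2 == 0) yield n
println(evens)  // List(8, 12)

// Instead of "break": find stops at the first match and returns an Option.
val firstBig = numbers.find(n => n > 6)
println(firstBig)  // Some(8)
```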
Some Java methods use names that are reserved by Scala, for example, java.util.Scanner.match. To avoid a compilation error, surround the name with single back quotes, e.g., java.util.Scanner.`match`.
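Here is a short sketch of the back-quote syntax in use, built around the Scanner method mentioned above. The input text and the value names scanner and result are ours.

```scala
import java.util.Scanner

val scanner = new Scanner("1 fish 2 fish")
// Scan for a pattern so that there is a match result to retrieve.
scanner.findInLine("""(\d+) fish (\d+) fish""")

// match is a reserved word in Scala, so the Java method name needs back quotes.
val result = scanner.`match`()
println(result.group(1))  // 1
println(result.group(2))  // 2
```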