Chapter 2. Type Less, Do More

In This Chapter

We ended the previous chapter with a few “teaser” examples of Scala code. This chapter discusses uses of Scala that promote succinct, flexible code. We’ll discuss organization of files and packages, importing other types, variable declarations, miscellaneous syntax conventions, and a few other concepts. We’ll emphasize how the concise syntax of Scala helps you work better and faster.

Scala’s syntax is especially useful when writing scripts. Separate compile and run steps aren’t required for simple programs that have few dependencies on libraries outside of what Scala provides. You compile and run such programs in one shot with the scala command. If you’ve downloaded the example code for this book, many of the smaller examples can be run using the scala command, e.g., scala filename.scala. See the README.txt files in each chapter’s code examples for more details. See also Command-Line Tools for more information about using the scala command.

Semicolons

You may have already noticed that there were very few semicolons in the code examples in the previous chapter. You can use semicolons to separate statements and expressions, as in Java, C, PHP, and similar languages. In most cases, though, Scala behaves like many scripting languages in treating the end of the line as the end of a statement or an expression. When a statement or expression is too long for one line, Scala can usually infer when you are continuing on to the next line, as shown in this example:

// code-examples/TypeLessDoMore/semicolon-example-script.scala

// Trailing equals sign indicates more code on next line
def equalsign = {
  val reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine =
    "wow that was a long value name"

  println(reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine)
}

// Trailing opening curly brace indicates more code on next line
def equalsign2(s: String) = {
  println("equalsign2: " + s)
}

// Trailing comma, operator, etc. indicates more code on next line
def commas(s1: String,
           s2: String) = {
  println("comma: " + s1 +
          ", " + s2)
}

When you want to put multiple statements or expressions on the same line, you can use semicolons to separate them. We used this technique in the ShapeDrawingActor example in A Taste of Concurrency:

case "exit" => println("exiting..."); exit

This code could also be written as follows:

...
case "exit" =>
      println("exiting...")
      exit
...

You might wonder why you don’t need curly braces ({...}) around the two statements after the case ... => line. You can put them in if you want, but the compiler knows when you’ve reached the end of the “block” when it finds the next case clause or the curly brace (}) that ends the enclosing block for all the case clauses.

Omitting optional semicolons means fewer characters to type and fewer characters to clutter your code. Breaking separate statements onto their own lines increases your code’s readability.

Variable Declarations

Scala allows you to decide whether a variable is immutable (read-only) or not (read-write) when you declare it. An immutable “variable” is declared with the keyword val (think value object):

val array: Array[String] = new Array(5)

To be more precise, the array reference cannot be changed to point to a different Array, but the array itself can be modified, as shown in the following scala session:

scala> val array: Array[String] = new Array(5)
array: Array[String] = Array(null, null, null, null, null)

scala> array = new Array(2)
<console>:5: error: reassignment to val
       array = new Array(2)
             ^

scala> array(0) = "Hello"

scala> array
res3: Array[String] = Array(Hello, null, null, null, null)

scala>

An immutable val must be initialized—that is, defined—when it is declared.

A mutable variable is declared with the keyword var:

scala> var stockPrice: Double = 100.
stockPrice: Double = 100.0

scala> stockPrice = 10.
stockPrice: Double = 10.0

scala>

Scala also requires you to initialize a var when it is declared. You can assign a new value to a var as often as you want. Again, to be precise, the stockPrice reference can be changed to point to a different Double object (e.g., 10.). In this case, the object that stockPrice refers to can’t be changed, because Doubles in Scala are immutable.

There are a few exceptions to the rule that you must initialize vals and vars when they are declared. Both keywords can be used with constructor parameters. When used as constructor parameters, the mutable or immutable variables specified will be initialized when an object is instantiated. Both keywords can be used to declare “abstract” (uninitialized) variables in abstract types. Also, derived types can override vals declared inside parent types. We’ll discuss these exceptions in Chapter 5.

Scala encourages you to use immutable values whenever possible. As we will see, this promotes better object-oriented design and is consistent with the principles of “pure” functional programming. It may take some getting used to, but you’ll find a newfound confidence in your code when it is written in an immutable style.

Note

The var and val keywords only specify whether the reference can be changed to refer to a different object (var) or not (val). They don’t specify whether or not the object they reference is mutable.

Method Declarations

In Chapter 1 we saw several examples of how to define methods, which are functions that are members of a class. Method definitions start with the def keyword, followed by optional argument lists, a colon character (:) and the return type of the method, an equals sign (=), and finally the method body. Methods are implicitly declared “abstract” if you leave off the equals sign and method body. The enclosing type is then itself abstract. We’ll discuss abstract types in more detail in Chapter 5.

We said “optional argument lists,” meaning more than one. Scala lets you define more than one argument list for a method. This is required for currying methods, which we’ll discuss in Currying. It is also very useful for defining your own Domain-Specific Languages (DSLs), as we’ll see in Chapter 11. Note that each argument list is surrounded by parentheses and the arguments are separated by commas.

If a method body has more than one expression, you must surround it with curly braces ({...}). You can omit the braces if the method body has just one expression.

Method Default and Named Arguments (Scala Version 2.8)

Many languages let you define default values for some or all of the arguments to a method. Consider the following script with a StringUtil object that lets you join a list of strings with a user-specified separator:

// code-examples/TypeLessDoMore/string-util-v1-script.scala
// Version 1 of "StringUtil".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]): String = joiner(strings, " ")
}
import StringUtil._    // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )

There are actually two, “overloaded” joiner methods. The second one uses a single space as the “default” separator. Having two methods seems a bit wasteful. It would be nice if we could eliminate the second joiner method and declare that the separator argument in the first joiner has a default value. In fact, in Scala version 2.8, you can now do this:

// code-examples/TypeLessDoMore/string-util-v2-v28-script.scala
// Version 2 of "StringUtil" for Scala v2.8 only.

object StringUtil {
  def joiner(strings: List[String], separator: String = " "): String =
    strings.mkString(separator)
}
import StringUtil._    // Import the joiner methods.

println(joiner(List("Programming", "Scala")))

There is another alternative for earlier versions of Scala. You can use implicit arguments, which we will discuss in Implicit Function Parameters.

Scala version 2.8 offers another enhancement for method argument lists, named arguments. We could actually write the last line of the previous example in several ways. All of the following println statements are functionally equivalent:

println(joiner(List("Programming", "Scala")))
println(joiner(strings = List("Programming", "Scala")))
println(joiner(List("Programming", "Scala"), " "))   // #1
println(joiner(List("Programming", "Scala"), separator = " ")) // #2
println(joiner(strings = List("Programming", "Scala"), separator = " "))

Why is this useful? First, if you choose good names for the method arguments, then your calls to those methods document each argument with a name. For example, compare the two lines with comments #1 and #2. In the first line, it may not be obvious what the second " " argument is for. In the second case, we supply the name separator, which suggests the purpose of the argument.

The second benefit is that you can specify the parameters in any order when you specify them by name. Combined with default values, you can write code like the following:

// code-examples/TypeLessDoMore/user-profile-v28-script.scala
// Scala v2.8 only.

object OptionalUserProfileInfo {
  val UnknownLocation = ""
  val UnknownAge = -1
  val UnknownWebSite = ""
}

class OptionalUserProfileInfo(
  location: String = OptionalUserProfileInfo.UnknownLocation,
  age: Int         = OptionalUserProfileInfo.UnknownAge,
  webSite: String  = OptionalUserProfileInfo.UnknownWebSite)

println( new OptionalUserProfileInfo )
println( new OptionalUserProfileInfo(age = 29) )
println( new OptionalUserProfileInfo(age = 29, location="Earth") )

OptionalUserProfileInfo represents all the “optional” user profile data in your next Web 2.0 social networking site. It defines default values for all its fields. The script creates instances with zero or more named parameters. The order of those parameters is arbitrary.

The examples we have shown use constant values as the defaults. Most languages with default argument values only allow constants or other values that can be determined at parse time. However, in Scala, any expression can be used as the default, as long as it can compile where used. For example, an expression could not refer to an instance field that will be computed inside the class or object body, but it could invoke a method on a singleton object.

A related limitation is that a default expression for one parameter can’t refer to another parameter in the list, unless the parameter that is referenced appears earlier in the list and the parameters are curried, a concept we’ll discuss in Currying.

Finally, another constraint on named parameters is that once you provide a name for a parameter in a method invocation, the rest of the parameters appearing after it must also be named. For example, new OptionalUserProfileInfo(age = 29, "Earth") would not compile because the second argument is not invoked by name.

We’ll see another useful example of named and default arguments when we discuss case classes in Case Classes.

Nesting Method Definitions

Method definitions can also be nested. Here is an implementation of a factorial calculator, where we use a conventional technique of calling a second, nested method to do the work:

// code-examples/TypeLessDoMore/factorial-script.scala

def factorial(i: Int): Int = {
  def fact(i: Int, accumulator: Int): Int = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)
  }

  fact(i, 1)
}

println( factorial(0) )
println( factorial(1) )
println( factorial(2) )
println( factorial(3) )
println( factorial(4) )
println( factorial(5) )

The second method calls itself recursively, passing an accumulator parameter, where the result of the calculation is “accumulated.” Note that we return the accumulated value when the counter i reaches 1. (We’re ignoring invalid negative integers. The function actually returns 1 for i < 0.) After the definition of the nested method, factorial calls it with the passed-in value i and the initial accumulator value of 1.

Like a local variable declaration in many languages, a nested method is only visible inside the enclosing method. If you try to call fact outside of factorial, you will get a compiler error.

Did you notice that we use i as a parameter name twice, first in the factorial method and again in the nested fact method? As in many languages, the use of i as a parameter name for fact “shadows” the outer use of i as a parameter name for factorial. This is fine, because we don’t need the outer value of i inside fact. We only use it the first time we call fact, at the end of factorial.

What if we need to use a variable that is defined outside a nested function? Consider this contrived example:

// code-examples/TypeLessDoMore/count-to-script.scala

def countTo(n: Int):Unit = {
  def count(i: Int): Unit = {
    if (i <= n) {
      println(i)
      count(i + 1)
    }
  }
  count(1)
}

countTo(5)

Note that the nested count method uses the n value that is passed as a parameter to countTo. There is no need to pass n as an argument to count. Because count is nested inside countTo, n is visible to it.

The declaration of a field (member variable) can be prefixed with keywords indicating the visibility, just as in languages like Java and C#. Similarly the declaration of a non-nested method can be prefixed with the same keywords. We will discuss the visibility rules and keywords in Visibility Rules.

Inferring Type Information

Statically typed languages can be very verbose. Consider this typical declaration in Java:

import java.util.Map;
import java.util.HashMap;
...
Map<Integer, String> intToStringMap = new HashMap<Integer, String>();

We have to specify the type parameters <Integer, String> twice. (Scala uses the term type annotations for explicit type declarations like HashMap<Integer, String>.)

Scala supports type inference (see, for example, [TypeInference] and [Pierce2002]). The language’s compiler can discern quite a bit of type information from the context, without explicit type annotations. Here’s the same declaration rewritten in Scala, with inferred type information:

import java.util.Map
import java.util.HashMap
...
val intToStringMap: Map[Integer, String] = new HashMap

Recall from Chapter 1 that Scala uses square brackets ([...]) for generic type parameters. We specify Map[Integer, String] on the lefthand side of the equals sign. (We are sticking with Java types for the example.) On the righthand side, we instantiate the actual type we want, a HashMap, but we don’t have to repeat the type parameters.

For completeness, suppose we don’t actually care if the instance is of type Map (the Java interface type). It can be of type HashMap for all we care:

import java.util.Map
import java.util.HashMap
...
val intToStringMap2 = new HashMap[Integer, String]

This declaration requires no type annotations on the lefthand side because all of the type information needed is on the righthand side. The compiler automatically makes intToStringMap2 a HashMap[Integer,String].

Type inference is used for methods, too. In most cases, the return type of the method can be inferred, so the : and return type can be omitted. However, type annotations are required for all method parameters.

Pure functional languages like Haskell (see, e.g., [O’Sullivan2009]) use type inference algorithms like Hindley-Milner (see [Spiewak2008] for an easily digested explanation). Code written in these languages require type annotations less often than in Scala, because Scala’s type inference algorithm has to support object-oriented typing as well as functional typing. So, Scala requires more type annotations than languages like Haskell. Here is a summary of the rules for when explicit type annotations are required in Scala.

Note

The Any type is the root of the Scala type hierarchy (see The Scala Type Hierarchy for more details). If a block of code returns a value of type Any unexpectedly, chances are good that the type inferencer couldn’t figure out what type to return, so it chose the most generic type possible.

Let’s look at examples where explicit declarations of method return types are required. In the following script, the upCase method has a conditional return statement for zero-length strings:

// code-examples/TypeLessDoMore/method-nested-return-script.scala
// ERROR: Won't compile until you put a String return type on upCase.

def upCase(s: String) = {
  if (s.length == 0)
    return s
  else
    s.toUpperCase()
}

println( upCase("") )
println( upCase("Hello") )

Running this script gives you the following error:

... 6: error: method upCase has return statement; needs result type
        return s
         ^

You can fix this error by changing the first line of the method to the following:

def upCase(s: String): String = {

Actually, for this particular script, an alternative fix is to remove the return keyword from the line. It is not needed for the code to work properly, but it illustrates our point.

Recursive methods also require an explicit return type. Recall our factorial method in Nesting Method Definitions. Let’s remove the : Int return type on the nested fact method:

// code-examples/TypeLessDoMore/method-recursive-return-script.scala
// ERROR: Won't compile until you put an Int return type on "fact".

def factorial(i: Int) = {
  def fact(i: Int, accumulator: Int) = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)
  }

  fact(i, 1)
}

Now it fails to compile:

... 9: error: recursive method fact needs result type
            fact(i - 1, i * accumulator)
             ^

Overloaded methods can sometimes require an explicit return type. When one such method calls another, we have to add a return type to the one doing the calling, as in this example:

// code-examples/TypeLessDoMore/method-overloaded-return-script.scala
// Version 1 of "StringUtil" (with a compilation error).
// ERROR: Won't compile: needs a String return type on the second "joiner".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]) = joiner(strings, " ")
}
import StringUtil._    // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )

The two joiner methods concatenate a List of strings together. The first method also takes an argument for the separator string. The second method calls the first with a “default” separator of a single space.

If you run this script, you get the following error:

... 9: error: overloaded method joiner needs result type
    def joiner(strings: List[String]) = joiner(strings, "")
                                         ^

Since the second joiner method calls the first, it requires an explicit String return type. It should look like this:

def joiner(strings: List[String]): String = joiner(strings, " ")

The final scenario can be subtle, when a more general return type is inferred than what you expected. You usually see this error when you assign a value returned from a function to a variable with a more specific type. For example, you were expecting a String, but the function inferred an Any for the returned object. Let’s see a contrived example that reflects a bug where this scenario can occur:

// code-examples/TypeLessDoMore/method-broad-inference-return-script.scala
// ERROR: Won't compile; needs a String return type on the second "joiner".

def makeList(strings: String*) = {
  if (strings.length == 0)
    List(0)  // #1
  else
    strings.toList
}

val list: List[String] = makeList()

Running this script returns the following error:

...11: error: type mismatch;
 found   : List[Any]
 required: List[String]
val list: List[String] = makeList()
                          ^

We intended for makeList to return a List[String], but when strings.length equals zero, we returned List(0), incorrectly “assuming” that this expression is the correct way to create an empty list. In fact, we returned a List[Int] with one element, 0. We should have returned List(). Since the else expression returns a List[String], the result of strings.toList, the inferred return type for the method is the closest common supertype of List[Int] and List[String], which is List[Any]. Note that the compilation error doesn’t occur in the function definition. We only see it when we attempt to assign the value returned from makeList to a List[String] variable.

In this case, fixing the bug is the solution. Alternatively, when there isn’t a bug, it may be that the compiler just needs the “help” of an explicit return type declaration. Investigate the method that appears to return the unexpected type. In our experience, you often find that you modified that method (or another one in the call path) in such a way that the compiler now infers a more general return type than necessary. Add the explicit return type in this case.

Another way to prevent these problems is to always declare return types for methods, especially when defining methods for a public API. Let’s revisit our StringUtil example and see why explicit declarations are a good idea (adapted from [Smith2009a]).

Here is our StringUtil “API” again with a new method, toCollection:

// code-examples/TypeLessDoMore/string-util-v3.scala
// Version 3 of "StringUtil" (for all versions of Scala).

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]): String = strings.mkString(" ")

  def toCollection(string: String) = string.split(' ')
}

The toCollection method splits a string on spaces and returns an Array containing the substrings. The return type is inferred, which is a potential problem, as we will see. The method is somewhat contrived, but it will illustrate our point. Here is a client of StringUtil that uses this method:

// code-examples/TypeLessDoMore/string-util-client.scala

import StringUtil._

object StringUtilClient {
  def main(args: Array[String]) = {
    args foreach { s => toCollection(s).foreach { x => println(x) } }
  }
}

If you compile these files with scala, you can run the client as follows:

$ scala -cp ... StringUtilClient "Programming Scala"
Programming
Scala

Note

For the -cp ... class path argument, use the directory where scalac wrote the class files, which defaults to the current directory (i.e., use -cp .). If you used the build process in the downloaded code examples, the class files are written to the build directory (using scalac -d build ...). In this case, use -cp build.

Everything is fine at this point, but now imagine that the code base has grown. StringUtil and its clients are now built separately and bundled into different JARs. Imagine also that the maintainers of StringUtil decide to return a List instead of the default:

object StringUtil {
  ...

  def toCollection(string: String) = string.split(' ').toList  // changed!
}

The only difference is the final call to toList that converts the computed Array to a List. You recompile StringUtil and redeploy its JAR. Then you run the same client, without recompiling it first:

$ scala -cp ... StringUtilClient "Programming Scala"
java.lang.NoSuchMethodError: StringUtil$.toCollection(...
  at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
  at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
...

What happened? When the client was compiled, StringUtil.toCollection returned an Array. Then toCollection was changed to return List. In both versions, the method return value was inferred. Therefore, the client should have been recompiled, too.

However, had an explicit return type of Seq been declared, which is a parent for both Array and List, then the implementation change would not have forced a recompilation of the client.

Note

When developing APIs that are built separately from their clients, declare method return types explicitly and use the most general return type you can. This is especially important when APIs declare abstract methods (see, e.g., Chapter 4).

There is another scenario to watch for when using declarations of collections like val map = Map(), as in the following example:

val map = Map()

map.update("book", "Programming Scala")
... 3: error: type mismatch;
 found   : java.lang.String("book")
 required: Nothing
map.update("book", "Programming Scala")
            ^

What happened? The type parameters of the generic type Map were inferred as [Nothing,Nothing] when the map was created. (We’ll discuss Nothing in The Scala Type Hierarchy, but its name is suggestive!) We attempted to insert an incompatible key-value pair of types String and String. Call it a Map to nowhere! The solution is to parameterize the initial map declaration, e.g., val map = Map[String, String](), or to specify initial values so that the map parameters are inferred, e.g., val map = Map("Programming" → "Scala").

Finally, there is a subtle behavior with inferred return types that can cause unexpected and baffling results (see [ScalaTips]). Consider the following example scala session:

scala> def double(i: Int) { 2 * i }
double: (Int)Unit

scala> println(double(2))
()

Why did the second command print () instead of 4? Look carefully at what the scala interpreter said the first command returned: double (Int)Unit. We defined a method named double that takes an Int argument and returns Unit. The method doesn’t return an Int as we would expect.

The cause of this unexpected behavior is a missing equals sign in the method definition. Here is the definition we actually intended:

scala> def double(i: Int) = { 2 * i }
double: (Int)Int

scala> println(double(2))
4

Note the equals sign before the body of double. Now, the output says we have defined double to return an Int and the second command does what we expect it to do.

There is a reason for this behavior. Scala regards a method with the equals sign before the body as a function definition and a function always returns a value in functional programming. On the other hand, when Scala sees a method body without the leading equals sign, it assumes the programmer intended the method to be a “procedure” definition, meant for performing side effects only with the return value Unit. In practice, it is more likely that the programmer simply forgot to insert the equals sign!

Warning

When the return type of a method is inferred and you don’t use an equals sign before the opening parenthesis for the method body, Scala infers a Unit return type, even when the last expression in the method is a value of another type.

By the way, where did that () come from that was printed before we fixed the bug? It is actually the real name of the singleton instance of the Unit type! (This name is a functional programming convention.)

Literals

Often, a new object is initialized with a literal value, such as val book = "Programming Scala". Let’s discuss the kinds of literal values supported by Scala. Here, we’ll limit ourselves to lexical syntax literals. We’ll cover literal syntax for functions (used as values, not member methods), tuples, and certain types like Lists and Maps as we come to them.

Integer Literals

Integer literals can be expressed in decimal, hexadecimal, or octal. The details are summarized in Table 2-1.

Table 2-1. Integer literals
KindFormatExamples

Decimal

0 or a nonzero digit followed by zero or more digits (0–9)

0, 1, 321

Hexadecimal

0x followed by one or more hexadecimal digits (0–9, A–F, a–f)

0xFF, 0x1a3b

Octal

0 followed by one or more octal digits (0–7)

013, 077

For Long literals, it is necessary to append the L or l character at the end of the literal. Otherwise, an Int is used. The valid values for an integer literal are bounded by the type of the variable to which the value will be assigned. Table 2-2 defines the limits, which are inclusive.

Table 2-2. Ranges of allowed values for integer literals (boundaries are inclusive)
Target typeMinimum (inclusive)Maximum (inclusive)

Long

−263

263 − 1

Int

−231

231 − 1

Short

−215

215 − 1

Char

0

216 − 1

Byte

−27

27 − 1

A compile-time error occurs if an integer literal number is specified that is outside these ranges, as in the following examples:

scala > val i = 12345678901234567890
<console>:1: error: integer number too large
       val i = 12345678901234567890
scala> val b: Byte = 128
<console>:4: error: type mismatch;
 found   : Int(128)
 required: Byte
       val b: Byte = 128
                     ^

scala> val b: Byte = 127
b: Byte = 127

Floating-Point Literals

Floating-point literals are expressions with zero or more digits, followed by a period (.), followed by zero or more digits. If there are no digits before the period, i.e., the number is less than 1.0, then there must be one or more digits after the period. For Float literals, append the F or f character at the end of the literal. Otherwise, a Double is assumed. You can optionally append a D or d for a Double.

Floating-point literals can be expressed with or without exponentials. The format of the exponential part is e or E, followed by an optional + or -, followed by one or more digits.

Here are some example floating-point literals:

0.
.0
0.0
3.
3.14
.14
0.14
3e5
3E5
3.E5
3.e5
3.e+5
3.e-5
3.14e-5
3.14e-5f
3.14e-5F
3.14e-5d
3.14e-5D

Float consists of all IEEE 754 32-bit, single-precision binary floating-point values. Double consists of all IEEE 754 64-bit, double-precision binary floating-point values.

Warning

To avoid parsing ambiguities, you must have at least one space after a floating-point literal, if it is followed by a token that starts with a letter. Also, the expression 1.toString returns the integer value 1 as a string, while 1. toString uses the operator notation to invoke toString on the floating-point literal 1..

Boolean Literals

The boolean literals are true and false. The type of the variable to which they are assigned will be inferred to be Boolean:

scala> val b1 = true
b1: Boolean = true

scala> val b2 = false
b2: Boolean = false

Character Literals

A character literal is either a printable Unicode character or an escape sequence, written between single quotes. A character with a Unicode value between 0 and 255 may also be represented by an octal escape, i.e., a backslash () followed by a sequence of up to three octal characters. It is a compile-time error if a backslash character in a character or string literal does not start a valid escape sequence.

Here are some examples:

'A'
'u0041'  // 'A' in Unicode
'
'
'012'    // '
' in octal
'	'

The valid escape sequences are shown in Table 2-3.

Table 2-3. Character escape sequences
SequenceUnicodeMeaning



u0008

Backspace (BS)

u0009

Horizontal tab (HT)

u000a

Line feed (LF)

f

u000c

Form feed (FF)

u000d

Carriage return (CR)

"

u0022

Double quote (")

u0027

Single quote ()

\

u0009

Backslash ()

String Literals

A string literal is a sequence of characters enclosed in double quotes or triples of double quotes, i.e., """...""".

For string literals in double quotes, the allowed characters are the same as the character literals. However, if a double quote " character appears in the string, it must be “escaped” with a character. Here are some examples:

"Programming
Scala"
"He exclaimed, "Scala is great!""
"First	Second"

The string literals bounded by triples of double quotes are also called multi-line string literals. These strings can cover several lines; the line feeds will be part of the string. They can include any characters, including one or two double quotes together, but not three together. They are useful for strings with characters that don’t form valid Unicode or escape sequences, like the valid sequences listed in Table 2-3. Regular expressions are a typical example, which we’ll discuss in Chapter 3. However, if escape sequences appear, they aren’t interpreted.

Here are three example strings:

"""Programming
Scala"""
"""He exclaimed, "Scala is great!" """
"""First line

Second line	

Fourth line"""

Note that we had to add a space before the trailing """ in the second example to prevent a parse error. Trying to escape the second " that ends the "Scala is great!" quote, i.e., "Scala is great!", doesn’t work.

Copy and paste these strings into the scala interpreter. Do the same for the previous string examples. How are they interpreted differently?

Symbol Literals

Scala supports symbols, which are interned strings, meaning that two symbols with the same “name” (i.e., the same character sequence) will actually refer to the same object in memory. Symbols are used less often in Scala than in some other languages, like Ruby, Smalltalk, and Lisp. They are useful as map keys instead of strings.

A symbol literal is a single quote ('), followed by a letter, followed by zero or more digits and letters. Note that an expression like '1 is invalid, because the compiler thinks it is an incomplete character literal.

A symbol literal 'id is a shorthand for the expression scala.Symbol("id").

Note

If you want to create a symbol that contains whitespace, use e.g., scala.Symbol(" Programming Scala "). All the whitespace is preserved.

Tuples

How many times have you wanted to return two or more values from a method? In many languages, like Java, you only have a few options, none of which is very appealing. You could pass in parameters to the method that will be modified for all or some of the “return” values, which is ugly. Or you could declare some small “structural” class that holds the two or more values, then return an instance of that class.

Scala, supports tuples, a grouping of two or more items, usually created with the literal syntax of a comma-separated list of the items inside parentheses, e.g., (x1, x2, ...). The types of the xi elements are unrelated to each other; you can mix and match types. These literal “groupings” are instantiated as scala.TupleN instances, where N is the number of items in the tuple. The Scala API defines separate TupleN classes for N between 1 and 22, inclusive. Tuple instances are immutable, first-class values, so you can assign them to variables, pass them as values, and return them from methods.

The following example demonstrates the use of tuples:

// code-examples/TypeLessDoMore/tuple-example-script.scala

def tupleator(x1: Any, x2: Any, x3: Any) = (x1, x2, x3)

val t = tupleator("Hello", 1, 2.3)
println( "Print the whole tuple: " + t )
println( "Print the first item:  " + t._1 )
println( "Print the second item: " + t._2 )
println( "Print the third item:  " + t._3 )

val (t1, t2, t3) = tupleator("World", '!', 0x22)
println( t1 + " " + t2 + " " + t3 )

Running this script with scala produces the following output:

Print the whole tuple: (Hello,1,2.3)
Print the first item:  Hello
Print the second item: 1
Print the third item:  2.3
World ! 34

The tupleator method simply returns a “3-tuple” with the input arguments. The first statement that uses this method assigns the returned tuple to a single variable t. The next four statements print t in various ways. The first print statement calls Tuple3.toString, which wraps parentheses around the item list. The following three statements print each item in t separately. The expression t._N retrieves the N item, starting at 1, not 0 (this choice follows functional programming conventions).

The last two lines show that we can use a tuple expression on the lefthand side of the assignment. We declare three vals—t1, t2, and t3—to hold the individual items in the tuple. In essence, the tuple items are extracted automatically.

Notice how we mixed types in the tuples. You can see the types more clearly if you use the interactive mode of the scala command, which we introduced in Chapter 1.

Invoke the scala command with no script argument. At the scala> prompt, enter val t = ("Hello",1,2.3) and see that you get the following result, which shows you the type of each element in the tuple:

scala> val t = ("Hello",1,2.3)
t: (java.lang.String, Int, Double) = (Hello,1,2.3)

It’s worth noting that there’s more than one way to define a tuple. We’ve been using the more common parenthesized syntax, but you can also use the arrow operator between two values, as well as special factory methods on the tuple-related classes:

scala> 1 -> 2
res0: (Int, Int) = (1,2)

scala> Tuple2(1, 2)
res1: (Int, Int) = (1,2)

scala> Pair(1, 2)
res2: (Int, Int) = (1,2)

Option, Some, and None: Avoiding nulls

We’ll discuss the standard type hierarchy for Scala in The Scala Type Hierarchy. However, three useful classes to understand now are the Option class and its two subclasses, Some and None.

Most languages have a special keyword or object that’s assigned to reference variables when there’s nothing else for them to refer to. In Java, this is null; in Ruby, it’s nil. In Java, null is a keyword, not an object, and thus it’s illegal to call any methods on it. But this is a confusing choice on the language designer’s part. Why return a keyword when the programmer expects an object?

To be more consistent with the goal of making everything an object, as well as to conform with functional programming conventions, Scala encourages you to use the Option type for variables and function return values when they may or may not refer to a value. When there is no value, use None, an object that is a subclass of Option. When there is a value, use Some, which wraps the value. Some is also a subclass of Option.

Note

None is declared as an object, not a class, because we really only need one instance of it. In that sense, it’s like the null keyword, but it is a real object with methods.

You can see Option, Some, and None in action in the following example, where we create a map of state capitals in the United States:

// code-examples/TypeLessDoMore/state-capitals-subset-script.scala

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  // ...
  "Wyoming" -> "Cheyenne")

println( "Get the capitals wrapped in Options:" )
println( "Alabama: " + stateCapitals.get("Alabama") )
println( "Wyoming: " + stateCapitals.get("Wyoming") )
println( "Unknown: " + stateCapitals.get("Unknown") )

println( "Get the capitals themselves out of the Options:" )
println( "Alabama: " + stateCapitals.get("Alabama").get )
println( "Wyoming: " + stateCapitals.get("Wyoming").getOrElse("Oops!") )
println( "Unknown: " + stateCapitals.get("Unknown").getOrElse("Oops2!") )

The convenient -> syntax for defining name-value pairs to initialize a Map will be discussed in The Predef Object. For now, we want to focus on the two groups of println statements, where we show what happens when you retrieve the values from the map. If you run this script with the scala command, you’ll get the following output:

Get the capitals wrapped in Options:
Alabama: Some(Montgomery)
Wyoming: Some(Cheyenne)
Unknown: None
Get the capitals themselves out of the Options:
Alabama: Montgomery
Wyoming: Cheyenne
Unknown: Oops2!

The first group of println statements invoke toString implicitly on the instances returned by get. We are calling toString on Some or None instances because the values returned by Map.get are automatically wrapped in a Some, when there is a value in the map for the specified key. Note that the Scala library doesn’t store the Some in the map; it wraps the value in a Some upon retrieval. Conversely, when we ask for a map entry that doesn’t exist, the None object is returned, rather than null. This occurred in the last println of the three.

The second group of println statements goes a step further. After calling Map.get, they call get or getOrElse on each Option instance to retrieve the value it contains. Option.get requires that the Option is not empty—that is, the Option instance must actually be a Some. In this case, get returns the value wrapped by the Some, as demonstrated in the println where we print the capital of Alabama. However, if the Option is actually None, then None.get throws a NoSuchElementException.

We also show the alternative method, getOrElse, in the last two println statements. This method returns either the value in the Option, if it is a Some instance, or it returns the second argument we passed to getOrElse, if it is a None instance. In other words, the second argument to getOrElse functions as the default return value.

So, getOrElse is the more defensive of the two methods. It avoids a potential thrown exception. We’ll discuss the merits of alternatives like get versus getOrElse in Exceptions and the Alternatives.

Note that because the Map.get method returns an Option, it automatically documents the fact that there may not be an item matching the specified key. The map handles this situation by returning a None. Most languages would return null (or the equivalent) when there is no “real” value to return. You learn from experience to expect a possible null. Using Option makes the behavior more explicit in the method signature, so it’s more self-documenting.

Also, thanks to Scala’s static typing, you can’t make the mistake of attempting to call a method on a value that might actually be null. While this mistake is easy to do in Java, it won’t compile in Scala because you must first extract the value from the Option. So, the use of Option strongly encourages more resilient programming.

Because Scala runs on the JVM and .NET and because it must interoperate with other libraries, Scala has to support null. Still, you should avoid using null in your code. Tony Hoare, who invented the null reference in 1965 while working on an object-oriented language called ALGOL W, called its invention his “billion dollar mistake” (see [Hoare2009]). Don’t contribute to that figure.

So, how would you write a method that returns an Option? Here is a possible implementation of get that could be used by a concrete subclass of Map (Map.get itself is abstract). For a more sophisticated version, see the implementation of get in scala.collection.immutable.HashMap in the Scala library source code distribution:

def get(key: A): Option[B] = {
  if (contains(key))
    new Some(getValue(key))
  else
    None
}

The contains method is also defined for Map. It returns true if the map contains a value for the specified key. The getValue method is intended to be an internal method that retrieves the value from the underlying storage, whatever it is.

Note how the value returned by getValue is wrapped in a Some[B], where the type B is inferred. However, if the call to contains(key) returns false, then the object None is returned.

You can use this same idiom when your methods return an Option. We’ll explore other uses for Option in subsequent sections. Its pervasive use in Scala code makes it an important concept to grasp.

Organizing Code in Files and Namespaces

Scala adopts the package concept that Java uses for namespaces, but Scala offers a more flexible syntax. Just as file names don’t have to match the type names, the package structure does not have to match the directory structure. So, you can define packages in files independent of their “physical” location.

The following example defines a class MyClass in a package com.example.mypkg using the conventional Java syntax:

// code-examples/TypeLessDoMore/package-example1.scala

package com.example.mypkg

class MyClass {
  // ...
}

The next example shows a contrived example that defines packages using the nested package syntax in Scala, which is similar to the namespace syntax in C# and the use of modules as namespaces in Ruby:

// code-examples/TypeLessDoMore/package-example2.scala

package com {
  package example {
    package pkg1 {
      class Class11 {
        def m = "m11"
      }
      class Class12 {
        def m = "m12"
      }
    }

    package pkg2 {
      class Class21 {
        def m = "m21"
        def makeClass11 = {
          new pkg1.Class11
        }
        def makeClass12 = {
          new pkg1.Class12
        }
      }
    }

    package pkg3.pkg31.pkg311 {
      class Class311 {
        def m = "m21"
      }
    }
  }
}

Two packages, pkg1 and pkg2, are defined under the com.example package. A total of three classes are defined between the two packages. The makeClass11 and makeClass12 methods in Class21 illustrate how to reference a type in the “sibling” package, pkg1. You can also reference these classes by their full paths, com.example.pkg1.Class11 and com.example.pkg1.Class12, respectively.

The package pkg3.pkg31.pkg311 shows that you can “chain” several packages together in one clause. It is not necessary to use a separate package clause for each package.

Following the conventions of Java, the root package for Scala’s library classes is named scala.

Warning

Scala does not allow package declarations in scripts that are executed directly with the scala interpreter. The reason has to do with the way the interpreter converts statements in scripts to valid Scala code before compiling to byte code. See The scala Command-Line Tool for more details.

Importing Types and Their Members

To use declarations in packages, you have to import them, just as you do in Java and similarly for other languages. However, compared to Java, Scala greatly expands your options. The following example illustrates several ways to import Java types:

// code-examples/TypeLessDoMore/import-example1.scala

import java.awt._
import java.io.File
import java.io.File._
import java.util.{Map, HashMap}

You can import all types in a package, using the underscore (_) as a wildcard, as shown on the first line. You can also import individual Scala or Java types, as shown on the second line.

Java uses the “star” character (*) as the wildcard for matching all types in a package or all static members of a type when doing “static imports.” In Scala, this character is allowed in method names, so _ is used as a wildcard, as we saw previously.

As shown on the third line, you can import all the static methods and fields in Java types. If java.io.File were actually a Scala object, as discussed previously, then this line would import the fields and methods from the object.

Finally, you can selectively import just the types you care about. On the fourth line, we import just the java.util.Map and java.util.HashMap types from the java.util package. Compare this one-line import statement with the two-line import statements we used in our first example in Inferring Type Information. They are functionally equivalent.

The next example shows more advanced options for import statements:

// code-examples/TypeLessDoMore/import-example2-script.scala

def writeAboutBigInteger() = {

  import java.math.BigInteger.{
    ONE => _,
    TEN,
    ZERO => JAVAZERO }

  // ONE is effectively undefined
  // println( "ONE: "+ONE )
  println( "TEN: "+TEN )
  println( "ZERO: "+JAVAZERO )
}

writeAboutBigInteger()

This example demonstrates two features. First, we can put import statements almost anywhere we want, not just at the top of the file, as required by Java. This feature allows us to scope the imports more narrowly. For example, we can’t reference the imported BigInteger definitions outside the scope of the method. Another advantage of this feature is that it puts an import statement closer to where the imported items are actually used.

The second feature shown is the ability to rename imported items. First, the java.math.BigInteger.ONE constant is renamed to the underscore wildcard. This effectively makes it invisible and unavailable to the importing scope. This is a useful technique when you want to import everything except a few particular items.

Next, the java.math.BigInteger.TEN constant is imported without renaming, so it can be referenced simply as TEN.

Finally, the java.math.BigInteger.ZERO constant is given the “alias” JAVAZERO.

Aliasing is useful if you want to give the item a more convenient name or you want to avoid ambiguities with other items in scope that have the same name.

Imports are Relative

There’s one other important thing to know about imports: they are relative. Note the comments for the following imports:

// code-examples/TypeLessDoMore/relative-imports.scala
import scala.collection.mutable._
import collection.immutable._         // Since "scala" is already imported
import _root_.scala.collection.jcl._  // full path from real "root"
package scala.actors {
  import remote._                     // We're in the scope of "scala.actors"
}

Note that the last import statement nested in the scala.actor package scope is relative to that scope.

The [ScalaWiki] has other examples at http://scala.sygneca.com/faqs/language#how-do-i-import.

It’s fairly rare that you’ll have problems with relative imports, but the problem with this convention is that they sometimes cause surprises, especially if you are accustomed to languages like Java, where imports are absolute. If you get a mystifying compiler error that a package wasn’t found, check that the statement is properly relative to the last import statement or add the _root_. prefix. Also, you might see an IDE or other tool insert an import _root_... statement in your code. Now you know what it means.

Warning

Remember that import statements are relative, not absolute. To create an absolute path, start with _root_.

Abstract Types And Parameterized Types

We mentioned in A Taste of Scala that Scala supports parameterized types, which are very similar to generics in Java. (We could use the two terms interchangeably, but it’s more common to use “parameterized types” in the Scala community and “generics” in the Java community.) The most obvious difference is in the syntax, where Scala uses square brackets ([...]), while Java uses angle brackets (<...>).

For example, a list of strings would be declared as follows:

val languages: List[String] = ...

There are other important differences with Java’s generics, which we’ll explore in Understanding Parameterized Types.

For now, we’ll mention one other useful detail that you’ll encounter before we can explain it in depth in Chapter 12. If you look at the declaration of scala.List in the Scaladocs, you’ll see that the declaration is written as ... class List[+A]. The + in front of the A means that List[B] is a subtype of List[A] for any B that is a subtype of A. If there is a - in front of a type parameter, then the relationship goes the other way; Foo[B] would be a supertype of Foo[A], if the declaration is Foo[-A].

Scala supports another type abstraction mechanism called abstract types, used in many functional programming languages, such as Haskell. Abstract types were also considered for inclusion in Java when generics were adopted. We want to introduce them now because you’ll see many examples of them before we dive into their details in Chapter 12. For a very detailed comparison of these two mechanisms, see [Bruce1998].

Abstract types can be applied to many of the same design problems for which parameterized types are used. However, while the two mechanisms overlap, they are not redundant. Each has strengths and weaknesses for certain design problems.

Here is an example that uses an abstract type:

// code-examples/TypeLessDoMore/abstract-types-script.scala

import java.io._

abstract class BulkReader {
  type In
  val source: In
  def read: String
}

class StringBulkReader(val source: String) extends BulkReader {
  type In = String
  def read = source
}

class FileBulkReader(val source: File) extends BulkReader {
  type In = File
  def read = {
    val in = new BufferedInputStream(new FileInputStream(source))
    val numBytes = in.available()
    val bytes = new Array[Byte](numBytes)
    in.read(bytes, 0, numBytes)
    new String(bytes)
  }
}

println( new StringBulkReader("Hello Scala!").read )
println( new FileBulkReader(new File("abstract-types-script.scala")).read )

Running this script with scala produces the following output:

Hello Scala!
import java.io._

abstract class BulkReader {
...

The BulkReader abstract class declares three abstract members: a type named In, a val field source, and a read method. As in Java, instances in Scala can only be created from concrete classes, which must have definitions for all members.

The derived classes, StringBulkReader and FileBulkReader, provide concrete definitions for these abstract members. We’ll cover the details of class declarations in Chapter 5 and the particulars of overriding member declarations in Overriding Members of Classes and Traits in Chapter 6.

For now, note that the type field works very much like a type parameter in a parameterized type. In fact, we could rewrite this example as follows, where we show only what would be different:

abstract class BulkReader[In] {
  val source: In
  ...
}

class StringBulkReader(val source: String) extends BulkReader[String] {...}

class FileBulkReader(val source: File) extends BulkReader[File] {...}

Just as for parameterized types, if we define the In type to be String, then the source field must also be defined as a String. Note that the StringBulkReader’s read method simply returns the source field, while the FileBulkReader’s read method reads the contents of the file.

As demonstrated by [Bruce1998], parameterized types tend to be best for collections, which is how they are most often used in Java code, whereas abstract types are most useful for type “families” and other type scenarios.

We’ll explore the details of Scala’s abstract types in Chapter 12. For example, we’ll see how to constrain the possible concrete types that can be used.

Reserved Words

Table 2-4 lists the reserved words in Scala, which we sometimes call “keywords,” and briefly describes how they are used (see [ScalaSpec2009]).

Table 2-4. Reserved words
WordDescriptionSee …

abstract

Makes a declaration abstract. Unlike Java, the keyword is usually not required for abstract members.

Class and Object Basics

case

Start a case clause in a match expression.

Pattern Matching

catch

Start a clause for catching thrown exceptions.

Using try, catch, and finally Clauses

class

Start a class declaration.

Class and Object Basics

def

Start a method declaration.

Method Declarations

do

Start a do...while loop.

Other Looping Constructs

else

Start an else clause for an if clause.

Scala if Statements

extends

Indicates that the class or trait that follows is the parent type of the class or trait being declared.

Parent Classes

false

Boolean false.

The Scala Type Hierarchy

final

Applied to a class or trait to prohibit deriving child types from it. Applied to a member to prohibit overriding it in a derived class or trait.

Attempting to Override final Declarations

finally

Start a clause that is executed after the corresponding try clause, whether or not an exception is thrown by the try clause.

Using try, catch, and finally Clauses

for

Start a for comprehension (loop).

Scala for Comprehensions

forSome

Used in existential type declarations to constrain the allowed concrete types that can be used.

Existential Types

if

Start an if clause.

Scala if Statements

implicit

Marks a method as eligible to be used as an implicit type converter. Marks a method parameter as optional, as long as a type-compatible substitute object is in the scope where the method is called.

Implicit Conversions

import

Import one or more types or members of types into the current scope.

Importing Types and Their Members

lazy

Defer evaluation of a val.

Lazy Vals

match

Start a pattern matching clause.

Pattern Matching

new

Create a new instance of a class.

Class and Object Basics

null

Value of a reference variable that has not been assigned a value.

The Scala Type Hierarchy

object

Start a singleton declaration: a class with only one instance.

Classes and Objects: Where Are the Statics?

override

Override a concrete member of a class or trait, as long as the original is not marked final.

Overriding Members of Classes and Traits

package

Start a package scope declaration.

Organizing Code in Files and Namespaces

private

Restrict visibility of a declaration.

Visibility Rules

protected

Restrict visibility of a declaration.

Visibility Rules

requires

Deprecated. Was used for self typing.

The Scala Type Hierarchy

return

Return from a function.

A Taste of Scala

sealed

Applied to a parent class to require all directly derived classes to be declared in the same source file.

Case Classes

super

Analogous to this, but binds to the parent type.

Overriding Abstract and Concrete Methods

this

How an object refers to itself. The method name for auxiliary constructors.

Class and Object Basics

throw

Throw an exception.

Using try, catch, and finally Clauses

trait

A mixin module that adds additional state and behavior to an instance of a class.

Chapter 4

try

Start a block that may throw an exception.

Using try, catch, and finally Clauses

true

Boolean true.

The Scala Type Hierarchy

type

Start a type declaration.

Abstract Types And Parameterized Types

val

Start a read-only “variable” declaration.

Variable Declarations

var

Start a read-write variable declaration.

Variable Declarations

while

Start a while loop.

Other Looping Constructs

with

Include the trait that follows in the class being declared or the object being instantiated.

Chapter 4

yield

Return an element in a for comprehension that becomes part of a sequence.

Yielding

_

A placeholder, used in imports, function literals, etc.

Many

:

Separator between identifiers and type annotations.

A Taste of Scala

=

Assignment.

A Taste of Scala

=>

Used in function literals to separate the argument list from the function body.

Function Literals and Closures

<-

Used in for comprehensions in generator expressions.

Scala for Comprehensions

<:

Used in parameterized and abstract type declarations to constrain the allowed types.

Type Bounds

<%

Used in parameterized and abstract type “view bounds” declarations.

Type Bounds

>:

Used in parameterized and abstract type declarations to constrain the allowed types.

Type Bounds

#

Used in type projections.

Path-Dependent Types

@

Marks an annotation.

Annotations

(Unicode u21D2) Same as =>.

Function Literals and Closures

(Unicode u2190) Same as <-.

Scala for Comprehensions

Notice that break and continue are not listed. These control keywords don’t exist in Scala. Instead, Scala encourages you to use functional programming idioms that are usually more succinct and less error-prone. We’ll discuss alternative approaches when we discuss for loops (see Generator Expressions).

Some Java methods use names that are reserved by Scala, for example, java.util.Scanner.match. To avoid a compilation error, surround the name with single back quotes, e.g., java.util.Scanner.‵match‵.

Recap and What’s Next

We covered several ways that Scala’s syntax is concise, flexible, and productive. We also described many Scala features. In the next chapter, we will round out some Scala essentials before we dive into Scala’s support for object-oriented programming and functional programming.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset