In the eyes of many, the new collections framework is the most significant change in Scala 2.8. Scala had collections before (and in fact the new framework is largely compatible with them). But it's only 2.8 that provides a common, uniform, and all-encompassing framework for collection types.
Even though the additions to collections are subtle at first glance, the changes they can provoke in your programming style can be profound. In fact, quite often it's as if you work on a higher level with the basic building blocks of a program being whole collections instead of their elements. This new style of programming requires some adaptation. Fortunately, the adaptation is helped by several nice properties of the new Scala collections. They are easy to use, concise, safe, fast, and universal.
This chapter describes in depth the APIs of the Scala 2.8 collection classes from a user perspective. You've already seen a quick tour of the collections library, in Chapter 17. This chapter takes you on a more detailed tour, showing all the collection classes and all the methods they define, so it includes everything you need to know to use Scala collections. Looking ahead, Chapter 25 will concentrate on the architecture and extensibility aspects of the library, for people implementing new collection types.
As is now familiar to you, Scala collections systematically distinguish between mutable and immutable collections. A mutable collection can be updated or extended in place. This means you can change, add, or remove elements of a collection as a side effect. Immutable collections, by contrast, never change. You still have operations that simulate additions, removals, or updates, but those operations will in each case return a new collection and leave the old collection unchanged.
All collection classes are found in the package scala.collection or one of its subpackages: mutable, immutable, and generic. Most collection classes needed by client code exist in three variants, each of which has different characteristics with respect to mutability. The three variants are located in packages scala.collection, scala.collection.immutable, and scala.collection.mutable.
A collection in package scala.collection.immutable is guaranteed to be immutable for everyone. Such a collection will never change after it is created. Therefore, you can rely on the fact that accessing the same collection value repeatedly at different points in time will always yield a collection with the same elements.
A collection in package scala.collection.mutable is known to have some operations that change the collection in place. These operations let you write code to mutate the collection yourself. However, you must be careful to understand and defend against any updates performed by other parts of the code base.
A collection in package scala.collection can be either mutable or immutable. For instance, scala.collection.IndexedSeq[T] is a supertrait of both scala.collection.immutable.IndexedSeq[T] and its mutable sibling scala.collection.mutable.IndexedSeq[T]. Generally, the root collections in package scala.collection define the same interface as the immutable collections. And typically, the mutable collections in package scala.collection.mutable add some side-effecting modification operations to this immutable interface.
The difference between root collections and immutable collections is that clients of an immutable collection have a guarantee that nobody can mutate the collection, whereas clients of a root collection only know that they can't change the collection themselves. Even though the static type of such a collection provides no operations for modifying the collection, it might still be possible that the run-time type is a mutable collection that can be changed by other clients.
By default, Scala always picks immutable collections. For instance, if you just write Set without any prefix or without having imported anything, you get an immutable set, and if you write Iterable you get an immutable iterable, because these are the default bindings imported from the scala package. To get the mutable default versions, you need to write explicitly collection.mutable.Set, or collection.mutable.Iterable.
The last package in the collection hierarchy is collection.generic. This package contains building blocks for implementing collections. Typically, collection classes defer the implementations of some of their operations to classes in generic. Everyday users of the collection framework on the other hand should need to refer to classes in generic only in exceptional circumstances.
Traversable Iterable Seq IndexedSeq Vector ResizableArray GenericArray LinearSeq MutableList List Stream Buffer ListBuffer ArrayBuffer Set SortedSet TreeSet HashSet (mutable) LinkedHashSet HashSet (immutable) BitSet EmptySet, Set1, Set2, Set3, Set4 Map SortedMap TreeMap HashMap (mutable) LinkedHashMap (mutable) HashMap (immutable) EmptyMap, Map1, Map2, Map3, Map4
The most important collection classes are shown in Figure 24.1. There is quite a bit of commonality shared by all these classes. For instance, every kind of collection can be created by the same uniform syntax, writing the collection class name followed by its elements:
Traversable(1, 2, 3) Iterable("x", "y", "z") Map("x" -> 24, "y" -> 25, "z" -> 26) Set(Color.Red, Color.Green, Color.Blue) SortedSet("hello", "world") Buffer(x, y, z) IndexedSeq(1.0, 2.0) LinearSeq(a, b, c)
The same principle also applies for specific collection implementations:
List(1, 2, 3) HashMap("x" -> 24, "y" -> 25, "z" -> 26)
The toString methods for all collections produce output written as above, with a type name followed by the elements of the collection in parentheses. All collections support the API provided by Traversable, but their methods all return their own class rather than the root class Traversable. For instance, the map method on List has a return type of List, whereas the map method on Set has a return type of Set. Thus the static return type of these methods is fairly precise:
scala> List(1, 2, 3) map (_ + 1) res0: List[Int] = List(2, 3, 4)
scala> Set(1, 2, 3) map (_ * 2) res1: scala.collection.immutable.Set[Int] = Set(2, 4, 6)
Equality is also organized uniformly for all collection classes; more on this in Section 24.14.
Most of the classes in Figure 24.1 exist in three variants: root, mutable, and immutable. The only exception is the Buffer trait, which only exists as a mutable collection.
In the remainder of this chapter, we will review these classes one by one.
At the top of the collection hierarchy is trait Traversable. Its only abstract operation is foreach:
def foreach[U](f: Elem => U)
Collection classes implementing Traversable just need to define this method; all other methods can be inherited from Traversable.
The foreach method is meant to traverse all elements of the collection, and apply the given operation, f, to each element. The type of the operation is Elem => U, where Elem is the type of the collection's elements and U is an arbitrary result type. The invocation of f is done for its side effect only; in fact any function result of f is discarded by foreach.
Traversable also defines many concrete methods, which are all listed in Table 24.1 here. These methods fall into the following categories:
The next trait from the top in Figure 24.1 is Iterable. All methods in this trait are defined in terms of an abstract method, iterator, which yields the collection's elements one by one. The foreach method from trait Traversable is implemented in Iterable in terms of iterator. Here is the actual implementation:
def foreach[U](f: Elem => U): Unit = { val it = iterator while (it.hasNext) f(it.next()) }
Quite a few subclasses of Iterable override this standard implementation of foreach in Iterable, because they can provide a more efficient implementation. Remember that foreach is the basis of the implementation of all operations in Traversable, so its performance matters.
Two more methods exist in Iterable that return iterators: grouped and sliding. These iterators, however, do not return single elements but whole subsequences of elements of the original collection. The maximal size of these subsequences is given as an argument to these methods. The grouped method chunks its elements into increments, whereas sliding yields a sliding window over the elements. The difference between the two should become clear by looking at the following interpreter interaction:
scala> val xs = List(1, 2, 3, 4, 5) xs: List[Int] = List(1, 2, 3, 4, 5)
scala> val git = xs grouped 3 git: Iterator[List[Int]] = non-empty iterator
scala> git.next() res2: List[Int] = List(1, 2, 3)
scala> git.next() res3: List[Int] = List(4, 5)
scala> val sit = xs sliding 3 sit: Iterator[List[Int]] = non-empty iterator
scala> sit.next() res4: List[Int] = List(1, 2, 3)
scala> sit.next() res5: List[Int] = List(2, 3, 4)
scala> sit.next() res6: List[Int] = List(3, 4, 5)
Trait Iterable also adds some other methods to Traversable that can be implemented efficiently only if an iterator is available. They are summarized in Table 24.2:
You might wonder why the extra trait Traversable is above Iterable. Can we not do everything with an iterator? So what's the point of having a more abstract trait that defines its methods in terms of foreach instead of iterator? One reason for having Traversable is that sometimes it is easier or more efficient to provide an implementation of foreach than to provide an implementation of iterator. Here's a simple example. Let's say you want a class hierarchy for binary trees that have integer elements at the leaves. You might design this hierarchy like this:
sealed abstract class Tree case class Branch(left: Tree, right: Tree) extends Tree case class Node(elem: Int) extends Tree
Now assume you want to make trees traversable. To do this, have Tree inherit from Traversable[Int] and define a foreach method like this:
sealed abstract class Tree extends Traversable[Int] { def foreach[U](f: Int => U) = this match { case Node(elem) => f(elem) case Branch(l, r) => l foreach f; r foreach f } }
That's not too hard, and it is also very efficient—traversing a balanced tree takes time proportional to the number of elements in the tree. To see this, consider that for a balanced tree with N leaves you will have N - 1 interior nodes of class Branch. So the total number of steps to traverse the tree is N + N - 1.
Now, compare this with making trees iterable. To do this, have Tree inherit from Iterable[Int] and define an iterator method like this:
sealed abstract class Tree extends Iterable[Int] { def iterator: Iterator[Int] = this match { case Node(elem) => Iterator.single(elem) case Branch(l, r) => l.iterator ++ r.iterator } }
At first glance, this looks no harder than the foreach solution. However, there's an efficiency problem that has to do with the implementation of the iterator concatenation method, ++. Every time an element is produced by a concatenated iterator such as l.iterator ++ r.iterator, the computation needs to follow one indirection to get at the right iterator (either l.iterator, or r.iterator). Overall, that makes log(N) indirections to get at a leaf of a balanced tree with N leaves. So the cost of visiting all elements of a tree went up from about 2N for the foreach traversal method to N log(N) for the traversal with iterator. If the tree has a million elements that means about two million steps for foreach and about twenty million steps for iterator. So the foreach solution has a clear advantage.
In the inheritance hierarchy below Iterable you find three traits: Seq, Set, and Map. A common aspect of these three traits is that they all implement the PartialFunction trait[1] with its apply and isDefinedAt methods. However, the way each trait implements PartialFunction differs.
For sequences, apply is positional indexing, where elements are always numbered from 0. That is, Seq(1, 2, 3)(1) == 2. For sets, apply is a membership test. For instance, Set('a', 'b', 'c')('b') == true whereas Set()('a') == false. Finally for maps, apply is a selection. For instance, Map('a' -> 1, 'b' -> 10, 'c' -> 100)('b') == 10.
In the following three sections, we will explain each of the three kinds of collections in more detail.
The Seq trait represents sequences. A sequence is a kind of iterable that has a length and whose elements have fixed index positions, starting from 0.
The operations on sequences, summarized in Figure 24.3, fall into the following categories:
If a sequence is mutable, it offers in addition a side-effecting update method, which lets sequence elements be updated. Recall from Chapter 3 that syntax like seq(idx) = elem is just a shorthand for seq.update(idx, elem). Note the difference between update and updated. The update method changes a sequence element in place, and is only available for mutable sequences. The updated method is available for all sequences and always returns a new sequence instead of modifying the original.
Each Seq trait has two subtraits, LinearSeq and IndexedSeq. These do not add any new operations, but each offers different performance characteristics. A linear sequence has efficient head and tail operations, whereas an indexed sequence has efficient apply, length, and (if mutable) update operations. List is a frequently used linear sequence, as is Stream. Two frequently used indexed sequences are Array and ArrayBuffer. The Vector class provides an interesting compromise between indexed and linear access. It has both effectively constant time indexing overhead and constant time linear access overhead. Because of this, vectors are a good foundation for mixed access patterns where both indexed and linear accesses are used. More on vectors in Section 24.9.
An important sub-category of mutable sequences is buffers. Buffers allow not only updates of existing elements but also element insertions, element removals, and efficient additions of new elements at the end of the buffer. The principal new methods supported by a buffer are += and ++=, for element addition at the end, +=: and ++=: for addition at the front, insert and insertAll for element insertions, as well as remove and -= for element removal. These operations are summarized in Table 24.4.
Two Buffer implementations that are commonly used are ListBuffer and ArrayBuffer. As the name implies, a ListBuffer is backed by a List and supports efficient conversion of its elements to a List, whereas an ArrayBuffer is backed by an array, and can be quickly converted into one. You saw a glimpse of the implementation of ListBuffer in Section 22.2.
Sets are Iterables that contain no duplicate elements. The operations on sets are summarized in Table 24.5 for general sets and Table 24.6 for mutable sets. They fall into the following categories:
contains, apply, and subsetOf. The contains method indicates whether a set contains a given element. The apply method for a set is the same as contains, so set(elem) is the same as set contains elem. That means sets can also be used as test functions that return true for the elements they contain. For example:
scala> val fruit = Set("apple", "orange", "peach", "banana") fruit: scala.collection.immutable.Set[java.lang.String] = Set(apple, orange, peach, banana)
scala> fruit("peach") res7: Boolean = true
scala> fruit("potato") res8: Boolean = false
Mutable sets have methods that add, remove, or update elements, which are summarized in Table 24.6:
Just like an immutable set, a mutable set offers the + and ++ operations for element additions and the - and -- operations for element removals. But these are less often used for mutable sets since they involve copying the set. As a more efficient alternative, mutable sets offer the update methods += and -=. The operation s += elem adds elem to the set s as a side effect, and returns the mutated set as a result. Likewise, s -= elem removes elem from the set, and returns the mutated set as a result. Besides += and -= there are also the bulk operations ++= and --=, which add or remove all elements of a traversable or an iterator.
The choice of the method names += and -= means that very similar code can work with either mutable or immutable sets. Consider first the following interpreter dialogue that uses an immutable set s:
scala> var s = Set(1, 2, 3) s: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> s += 4; s -= 2
scala> s res9: scala.collection.immutable.Set[Int] = Set(1, 3, 4)
In this example, we used += and -= on a var of type immutable.Set. As was explained in Step 10 in Chapter 3, a statement such as s += 4 is an abbreviation for s = s + 4. So this invokes the addition method + on the set s and then assigns the result back to the s variable. Consider now an analogous interaction with a mutable set:
scala> val s = collection.mutable.Set(1, 2, 3) s: scala.collection.mutable.Set[Int] = Set(1, 2, 3)
scala> s += 4 res10: s.type = Set(1, 4, 2, 3)
scala> s -= 2 res11: s.type = Set(1, 4, 3)
The end effect is very similar to the previous interaction; we start with a Set(1, 2, 3) and end up with a Set(1, 3, 4). However, even though the statements look the same as before, they do something different. The s += 4 statement now invokes the += method on the mutable set value s, changing the set in place. Likewise, the s -= 2 statement now invokes the -= method on the same set.
Comparing the two interactions shows an important principle. You often can replace a mutable collection stored in a val by an immutable collection stored in a var, and vice versa. This works at least as long as there are no alias references to the collection through which you can observe whether it was updated in place or a new collection was created.
Mutable sets also provide add and remove as variants of += and -=. The difference is that add and remove return a boolean result indicating whether the operation had an effect on the set.
The current default implementation of a mutable set uses a hash table to store the set's elements. The default implementation of an immutable set uses a representation that adapts to the number of elements of the set. An empty set is represented by just a singleton object. Sets of sizes up to four are represented by a single object that stores all elements as fields. Beyond that size, immutable sets are implemented as hash tries.[2]
A consequence of these representation choices is that for sets of small sizes, up to about four, immutable sets are more compact and more efficient than mutable sets. So if you expect the size of a set to be small, try to make it immutable.
Two Set subtraits are SortedSet and BitSet. These are explained in the following subsections.
A SortedSet is a set where, no matter what order elements were added to the set, the elements are traversed in sorted order. The default representation of a SortedSet is an ordered binary tree maintaining the invariant that all elements in the left subtree of a node are smaller than all elements in the right subtree. That way, a simple in-order traversal can return all tree elements in increasing order. Scala's class immutable.TreeSet uses a red-black tree implementation to maintain this ordering invariant, and at the same time keep the tree balanced—meaning that all paths from the root of the tree to a leaf have about the same length.
To create an empty tree set, you could first specify the desired ordering. For example, here is an ordering that puts strings in reverse order:
scala> val myOrdering = Ordering.fromLessThan[String](_ > _) myOrdering: scala.math.Ordering[String] = ...
Then, to create an empty tree set with that ordering, use:
scala> import scala.collection.immutable.TreeSet import scala.collection.immutable.TreeSet
scala> TreeSet.empty(myOrdering) res12: scala.collection.immutable.TreeSet[String] = TreeSet()
Or you can leave out the ordering argument but give an element type of the empty set. In that case, the default ordering on the element type will be used:
scala> val set = TreeSet.empty[String] set: scala.collection.immutable.TreeSet[String] = TreeSet()
If you create new sets from a tree set (for instance by concatenation or filtering), they will keep the same ordering as the original set. For example:
scala> val numbers = set + ("one", "two", "three", "four") numbers: scala.collection.immutable.TreeSet[String] = TreeSet(four, one, three, two)
Sorted sets also support ranges of elements. For instance, the range method returns all elements from a starting element up to, but excluding, an end element. Or, the from method returns all elements greater than or equal to a starting element in the set's ordering. The result of calls to both methods is again a sorted set. Here are some examples:
scala> numbers range ("one", "two") res13: scala.collection.immutable.TreeSet[String] = TreeSet(one, three)
scala> numbers from "three" res14: scala.collection.immutable.TreeSet[String] = TreeSet(three, two)
Bit sets are sets of non-negative integer elements that are implemented in one or more words of packed bits. The internal representation of a bit set uses an array of Longs. The first Long covers elements from 0 to 63, the second from 64 to 127, and so on.[3] For every Long, each of its 64 bits is set to 1 if the corresponding element is contained in the set, and is unset otherwise. It follows that the size of a bit set depends on the largest integer that's stored in it. If N is that largest integer, then the size of the set is N/64 Long words, or N/8 bytes, plus a small number of extra bytes for status information.
Bitsets are hence more compact than other sets if they contain many small elements. Another advantage of bit sets is that operations such as membership test with contains, or element addition and removal with += and -=, are all extremely efficient.
Maps are Iterables of pairs of keys and values (also named mappings or associations). As explained in Section 21.4, Scala's Predef class offers an implicit conversion that lets you write key -> value as an alternate syntax for the pair (key, value). Therefore, Map("x" -> 24, "y" -> 25, "z" -> 26) means exactly the same as Map(("x", 24), ("y", 25), ("z", 26)), but reads better.
The fundamental operations on maps, summarized in Table 24.7, are similar to those on sets. Mutable maps additionally support the operations shown in Table 24.8. Map operations fall into the following categories:
apply, get, getOrElse, contains, and isDefinedAt. These operations turn maps into partial functions from keys to values. The fundamental lookup method for a map is:
def get(key): Option[Value]
The operation "m get key" tests whether the map contains an association for the given key. If so, it returns the associated value in a Some. If no key is defined in the map, get returns None. Maps also define an apply method that returns the value associated with a given key directly, without wrapping it in an Option. If the key is not defined in the map, an exception is raised.
The addition and removal operations for maps mirror those for sets. As for sets, mutable maps also support the non-destructive addition operations +, -, and updated, but they are used less frequently because they involve a copying of the mutable map. Instead, a mutable map m is usually updated "in place," using the two variants m(key) = value or m += (key -> value). There is also the variant m put (key, value), which returns an Option value that contains the value previously associated with key, or None if the key did not exist in the map before.
The getOrElseUpdate is useful for accessing maps that act as caches. Say you have an expensive computation triggered by invoking a function f:
scala> def f(x: String) = { println("taking my time."); Thread.sleep(100) x.reverse } f: (x: String)String
Assume further that f has no side-effects, so invoking it again with the same argument will always yield the same result. In that case you could save time by storing previously computed bindings of argument and results of f in a map, and only computing the result of f if a result of an argument was not found there. You could say the map is a cache for the computations of the function f.
scala> val cache = collection.mutable.Map[String, String]() cache: scala.collection.mutable.Map[String,String] = Map()
You can now create a more efficient caching version of the f function:
scala> def cachedF(s: String) = cache.getOrElseUpdate(s, f(s)) cachedF: (s: String)String
scala> cachedF("abc") taking my time. res15: String = cba
scala> cachedF("abc") res16: String = cba
Note that the second argument to getOrElseUpdate is "by-name," so the computation of f("abc") above is only performed if getOrElseUpdate requires the value of its second argument, which is precisely if its first argument is not found in the cache map. You could also have implemented cachedF directly, using just basic map operations, but it would have have taken more code to do so:
def cachedF(arg: String) = cache get arg match { case Some(result) => result case None => val result = f(arg) cache(arg) = result result }
In Section 1.1, we mentioned that if you needed a thread-safe map, you could mix the SynchronizedMap trait into whatever particular map implementation you desired. For example, you could mix SynchronizedMap into HashMap, as shown in Listing 24.1. This example begins with an import of two traits, Map and SynchronizedMap, and one class, HashMap, from package scala.collection.mutable. The rest of the example is the definition of singleton object MapMaker, which declares one method, makeMap. The makeMap method declares its result type to be a mutable map of string keys to string values.
import scala.collection.mutable.{Map, SynchronizedMap, HashMap}
object MapMaker {
def makeMap: Map[String, String] = {
new HashMap[String, String] with SynchronizedMap[String, String] {
override def default(key: String) = "Why do you want to know?" } } }
The first statement inside the body of makeMap constructs a new mutable HashMap that mixes in the SynchronizedMap trait:
new HashMap[String, String] with SynchronizedMap[String, String]
Given this code, the Scala compiler will generate a synthetic subclass of HashMap that mixes in SynchronizedMap, and create (and return) an instance of it. This synthetic class will also override a method named default, because of this code:
override def default(key: String) = "Why do you want to know?"
If you ask a map to give you the value for a particular key, but it doesn't have a mapping for that key, you'll by default get a NoSuchElementException. If you define a new map class and override the default method, however, your new map will return the value returned by default when queried with a non-existent key. Thus, the synthetic HashMap subclass generated by the compiler from the code in Listing 24.1 will return the somewhat curt response string, "Why do you want to know?", when queried with a non-existent key.
Because the mutable map returned by the makeMap method mixes in the SynchronizedMap trait, it can be used by multiple threads at once. Each access to the map will be synchronized. Here's an example of the map being used, by one thread, in the interpreter:
scala> val capital = MapMaker.makeMap capital: scala.collection.mutable.Map[String,String] = Map()
scala> capital ++= List("US" -> "Washington", "France" -> "Paris", "Japan" -> "Tokyo") res17: scala.collection.mutable.Map[String,String] = Map((France,Paris), (US,Washington), (Japan,Tokyo))
scala> capital("Japan") res18: String = Tokyo
scala> capital("New Zealand") res19: String = Why do you want to know?
scala> capital += ("New Zealand" -> "Wellington") res20: capital.type = Map((New Zealand,Wellington),...
scala> capital("New Zealand") res21: String = Wellington
You can create synchronized sets similarly to the way you create synchronized maps. For example, you could create a synchronized HashSet by mixing in the SynchronizedSet trait, like this:
import scala.collection.mutable
val synchroSet = new mutable.HashSet[Int] with mutable.SynchronizedSet[Int]
Finally, if you are thinking of using synchronized collections, you may also wish to consider the concurrent collections of java.util.concurrent instead. Alternatively, you may prefer to use unsynchronized collections with Scala actors. Actors will be covered in detail in Chapter 32.
Scala provides many concrete immutable collection classes for you to choose from. They differ in the traits they implement (maps, sets, sequences), whether they can be infinite, and the speed of various operations. We'll start by reviewing the most common immutable collection types.
Lists are finite immutable sequences. They provide constant-time access to their first element as well as the rest of the list, and they have a constant-time cons operation for adding a new element to the front of the list. Many other operations take linear time. See Chapters 16 and 22 for extensive discussions about lists.
A stream is like a list except that its elements are computed lazily. Because of this, a stream can be infinitely long. Only those elements requested will be computed. Otherwise, streams have the same performance characteristics as lists.
Whereas lists are constructed with the :: operator, streams are constructed with the similar-looking #::. Here is a simple example of a stream containing the integers 1, 2, and 3:
scala> val str = 1 #:: 2 #:: 3 #:: Stream.empty str: scala.collection.immutable.Stream[Int] = Stream(1, ?)
The head of this stream is 1, and the tail of it has 2 and 3. The tail is not printed here, though, because it hasn't been computed yet! Streams are required to compute lazily, and the toString method of a stream is careful not to force any extra evaluation.
Below is a more complex example. It computes a stream that contains a Fibonacci sequence starting with the given two numbers. A Fibonacci sequence is one where each element is the sum of the previous two elements in the series:
scala> def fibFrom(a: Int, b: Int): Stream[Int] = a #:: fibFrom(b, a + b) fibFrom: (a: Int,b: Int)Stream[Int]
This function is deceptively simple. The first element of the sequence is clearly a, and the rest of the sequence is the Fibonacci sequence starting with b followed by a + b. The tricky part is computing this sequence without causing an infinite recursion. If the function used :: instead of #::, then every call to the function would result in another call, thus causing an infinite recursion. Since it uses #::, though, the right-hand side is not evaluated until it is requested.
Here are the first few elements of the Fibonacci sequence starting with two ones:
scala> val fibs = fibFrom(1, 1).take(7) fibs: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> fibs.toList res22: List[Int] = List(1, 1, 2, 3, 5, 8, 13)
Lists are very efficient when the algorithm processing them is careful to only process their heads. Accessing, adding, and removing the head of a list takes only constant time, whereas accessing or modifying elements later in the list takes time linear in the depth into the list.
Vectors are a new collection type in Scala 2.8 that give efficient access to elements beyond the head. Access to any elements of a vector take only "effectively constant time," as defined below. It's a larger constant than for access to the head of a list or for reading an element of an array, but it's a constant nonetheless. As a result, algorithms using vectors do not have to be careful about accessing just the head of the sequence. They can access and modify elements at arbitrary locations, and thus they can be much more convenient to write.
Vectors are built and modified just like any other sequence:
scala> val vec = scala.collection.immutable.Vector.empty vec: scala.collection.immutable.Vector[Nothing] = Vector()
scala> val vec2 = vec :+ 1 :+ 2 vec2: scala.collection.immutable.Vector[Int] = Vector(1, 2)
scala> val vec3 = 100 +: vec2 vec3: scala.collection.immutable.Vector[Int] = Vector(100, 1, 2)
scala> vec3(0) res23: Int = 100
Vectors are represented as broad, shallow trees. Every tree node contains up to 32 elements of the vector or contains up to 32 other tree nodes. Vectors with up to 32 elements can be represented in a single node. Vectors with up to 32 * 32 = 1024 elements can be represented with a single indirection. Two hops from the root of the tree to the final element node are sufficient for vectors with up to 215 elements, three hops for vectors with 220, four hops for vectors with 225 elements and five hops for vectors with up to 230 elements. So for all vectors of reasonable size, an element selection involves up to five primitive array selections. This is what we meant when we wrote that element access is "effectively constant time."
Vectors are immutable, so you cannot change an element of a vector in place. However, with the updated method you can create a new vector that differs from a given vector only in a single element:
scala> val vec = Vector(1, 2, 3) vec: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala> vec updated (2, 4) res24: scala.collection.immutable.Vector[Int] = Vector(1, 2, 4)
scala> vec res25: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
As the last line above shows, a call to updated has no effect on the original vector vec. Like selection, functional vector updates are also "effectively constant time." Updating an element in the middle of a vector can be done by copying the node that contains the element, and every node that points to it, starting from the root of the tree. This means that a functional update creates between one and five nodes that each contain up to 32 elements or subtrees. This is certainly more expensive than an in-place update in a mutable array, but still a lot cheaper than copying the whole vector.
Because vectors strike a good balance between fast random selections and fast random functional updates, they are currently the default implementation of immutable indexed sequences:
scala> collection.immutable.IndexedSeq(1, 2, 3) res26: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)
If you need a last-in-first-out sequence, you can use a Stack. You push an element onto a stack with push, pop an element with pop, and peek at the top of the stack without removing it with top. All of these operations are constant time.
Here are some simple operations performed on a stack:
scala> val stack = scala.collection.immutable.Stack.empty stack: scala.collection.immutable.Stack[Nothing] = Stack()
scala> val hasOne = stack.push(1) hasOne: scala.collection.immutable.Stack[Int] = Stack(1)
scala> stack res27: scala.collection.immutable.Stack[Nothing] = Stack()
scala> hasOne.top res28: Int = 1
scala> hasOne.pop res29: scala.collection.immutable.Stack[Int] = Stack()
Immutable stacks are used rarely in Scala programs because their functionality is subsumed by lists: A push on an immutable stack is the same as a :: on a list, and a pop on a stack is the same a tail on a list.
A queue is just like a stack except that it is first-in-first-out rather than last-in-first-out. A simplified implementation of immutable queues was discussed in Chapter 19. Here's how you can create an empty immutable queue:
scala> val empty = scala.collection.immutable.Queue[Int]() empty: scala.collection.immutable.Queue[Int] = Queue()
You can append an element to an immutable queue with enqueue:
scala> val has1 = empty.enqueue(1) has1: scala.collection.immutable.Queue[Int] = Queue(1)
To append multiple elements to a queue, call enqueue with a collection as its argument:
scala> val has123 = has1.enqueue(List(2, 3)) has123: scala.collection.immutable.Queue[Int] = Queue(1, 2, 3)
To remove an element from the head of the queue, use dequeue:
scala> val (element, has23) = has123.dequeue element: Int = 1 has23: scala.collection.immutable.Queue[Int] = Queue(2, 3)
Note that dequeue returns a pair consisting of the element removed and the rest of the queue.
A range is an ordered sequence of integers that are equally spaced apart. For example, "1, 2, 3" is a range, as is "5, 8, 11, 14." To create a range in Scala, use the predefined methods to and by. Here are some examples:
scala> 1 to 3 res30: scala.collection.immutable.Range.Inclusive with scala.collection.immutable.Range.ByOne = Range(1, 2, 3)
scala> 5 to 14 by 3 res31: scala.collection.immutable.Range = Range(5, 8, 11, 14)
If you want to create a range that is exclusive of its upper limit, use the convenience method until instead of to:
scala> 1 until 3 res32: scala.collection.immutable.Range with scala.collection.immutable.Range.ByOne = Range(1, 2)
Ranges are represented in constant space, because they can be defined by just three numbers: their start, their end, and the stepping value. Because of this representation, most operations on ranges are extremely fast.
Hash tries[4] are a standard way to implement immutable sets and maps efficiently. Their representation is similar to vectors in that they are also trees where every node has 32 elements or 32 subtrees, but selection is done based on a hash code. For instance, to find a given key in a map, you use the lowest five bits of the hash code of the key to select the first subtree, the next five bits the next subtree, and so on. Selection stops once all elements stored in a node have hash codes that differ from each other in the bits that are selected so far. Thus, not all the bits of the hash code are necessarily used.
Hash tries strike a nice balance between reasonably fast lookups and reasonably efficient functional insertions (+) and deletions (-). That's why they underlie Scala's default implementations of immutable maps and sets. In fact, Scala has a further optimization for immutable sets and maps that contain less than five elements. Sets and maps with one to four elements are stored as single objects that just contain the elements (or key/value pairs in the case of a map) as fields. The empty immutable set and empty immutable map is in each case a singleton object—there's no need to duplicate storage for those because an empty immutable set or map will always stay empty.
Red-black trees are a form of balanced binary trees where some nodes are designated "red" and others "black." Like any balanced binary tree, operations on them reliably complete in time logarithmic to the size of the tree.
Scala provides implementations of sets and maps that use a red-black tree internally. You access them under the names TreeSet and TreeMap:
scala> val set = collection.immutable.TreeSet.empty[Int] set: scala.collection.immutable.TreeSet[Int] = TreeSet()
scala> set + 1 + 3 + 3 res33: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 3)
Red-black trees are also the standard implementation of SortedSet in Scala, because they provide an efficient iterator that returns all elements of the set in sorted order.
A bit set represents a collection of small integers as the bits of a larger integer. For example, the bit set containing 3, 2, and 0 would be represented as the integer 1101 in binary, which is 13 in decimal.
Internally, bit sets use an array of 64-bit Longs. The first Long in the array is for integers 0 through 63, the second is for 64 through 127, and so on. Thus, bit sets are very compact so long as the largest integer in the set is less than a few hundred or so.
Operations on bit sets are very fast. Testing for inclusion takes constant time. Adding an item to the set takes time proportional to the number of Longs in the bit set's array, which is typically a small number. Here are some simple examples of the use of a bit set:
scala> val bits = scala.collection.immutable.BitSet.empty bits: scala.collection.immutable.BitSet = BitSet()
scala> val moreBits = bits + 3 + 4 + 4 moreBits: scala.collection.immutable.BitSet = BitSet(3, 4)
scala> moreBits(3) res34: Boolean = true
scala> moreBits(0) res35: Boolean = false
A list map represents a map as a linked list of key-value pairs. In general, operations on a list map might have to iterate through the entire list. Thus, operations on a list map take time linear in the size of the map. In fact there is little usage for list maps in Scala because standard immutable maps are almost always faster. The only possible difference is if the map is for some reason constructed in such a way that the first elements in the list are selected much more often than the other elements.
scala> val map = collection.immutable.ListMap( 1 -> "one", 2 -> "two") map: scala.collection.immutable.ListMap[Int,java.lang.String] = Map((1,one), (2,two))
scala> map(2) res36: java.lang.String = two
Now that you've seen the most commonly used immutable collection classes that Scala provides in its standard library, take a look at the mutable collection classes.
You've already seen array buffers in Section 17.1. An array buffer holds an array and a size. Most operations on an array buffer have the same speed as an array, because the operations simply access and modify the underlying array. Additionally, array buffers can have data efficiently added to the end. Appending an item to an array buffer takes amortized constant time. Thus, array buffers are useful for efficiently building up a large collection whenever the new items are always added to the end. Here are some examples:
scala> val buf = collection.mutable.ArrayBuffer.empty[Int] buf: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer()
scala> buf += 1 res37: buf.type = ArrayBuffer(1)
scala> buf += 10 res38: buf.type = ArrayBuffer(1, 10)
scala> buf.toArray res39: Array[Int] = Array(1, 10)
You've also already seen list buffers in Section 17.1. A list buffer is like an array buffer except that it uses a linked list internally instead of an array. If you plan to convert the buffer to a list once it is built up, use a list buffer instead of an array buffer. Here's an example:[5]
scala> val buf = collection.mutable.ListBuffer.empty[Int] buf: scala.collection.mutable.ListBuffer[Int] = ListBuffer()
scala> buf += 1 res40: buf.type = ListBuffer(1)
scala> buf += 10 res41: buf.type = ListBuffer(1, 10)
scala> buf.toList res42: List[Int] = List(1, 10)
Just like an array buffer is useful for building arrays, and a list buffer is useful for building lists, a string builder is useful for building strings. String builders are so commonly used that they are already imported into the default namespace. Create them with a simple new StringBuilder, like this:
scala> val buf = new StringBuilder buf: StringBuilder = StringBuilder()
scala> buf += 'a' res43: buf.type = StringBuilder(a)
scala> buf ++= "bcdef" res44: buf.type = StringBuilder(a, b, c, d, e, f)
scala> buf.toString res45: String = abcdef
Linked lists are mutable sequences that consist of nodes that are linked with next pointers. In most languages null would be picked as the empty linked list. That does not work for Scala collections, because even empty sequences must support all sequence methods. LinkedList.empty.isEmpty, in particular, should return true and not throw a NullPointerException. Empty linked lists are encoded instead in a special way: Their next field points back to the node itself.
Like their immutable cousins, linked lists are best operated on sequentially. In addition, linked lists make it easy to insert an element or linked list into another linked list.
DoubleLinkedLists are like the single linked lists described in the previous subsection, except besides next, they have another mutable field, prev, that points to the element preceding the current node. The main benefit of that additional link is that it makes element removal very fast.
A MutableList consists of a single linked list together with a pointer that refers to the terminal empty node of that list. This makes list append a constant time operation because it avoids having to traverse the list in search for its terminal node. MutableList is currently the standard implementation of mutable.LinearSeq in Scala.
Scala provides mutable queues in addition to immutable ones. You use a mutable queue similarly to the way you use an immutable one, but instead of enqueue, you use the += and ++= operators to append. Also, on a mutable queue, the dequeue method will just remove the head element from the queue and return it. Here's an example:
scala> val queue = new scala.collection.mutable.Queue[String] queue: scala.collection.mutable.Queue[String] = Queue()
scala> queue += "a" res46: queue.type = Queue(a)
scala> queue ++= List("b", "c") res47: queue.type = Queue(a, b, c)
scala> queue res48: scala.collection.mutable.Queue[String] = Queue(a, b, c)
scala> queue.dequeue res49: String = a
scala> queue res50: scala.collection.mutable.Queue[String] = Queue(b, c)
Array sequences are mutable sequences of fixed size that store their elements internally in an Array[AnyRef]. They are implemented in Scala by class ArraySeq.
You would typically use an ArraySeq if you want an array for its performance characteristics, but you also want to create generic instances of the sequence where you do not know the type of the elements and do not have a ClassManifest to provide it at run-time. You will find out about these issues with arrays shortly, in Section 24.11.
You saw immutable stacks earlier. There is also a mutable version. It works exactly the same as the immutable version except that modifications happen in place. Here's an example:
scala> val stack = new scala.collection.mutable.Stack[Int] stack: scala.collection.mutable.Stack[Int] = Stack()
scala> stack.push(1) res51: stack.type = Stack(1)
scala> stack res52: scala.collection.mutable.Stack[Int] = Stack(1)
scala> stack.push(2) res53: stack.type = Stack(2, 1)
scala> stack res54: scala.collection.mutable.Stack[Int] = Stack(2, 1)
scala> stack.top res55: Int = 2
scala> stack res56: scala.collection.mutable.Stack[Int] = Stack(2, 1)
scala> stack.pop res57: Int = 2
scala> stack res58: scala.collection.mutable.Stack[Int] = Stack(1)
ArrayStack is an alternative implementation of a mutable stack, which is backed by an Array that gets resized as needed. It provides fast indexing and is generally slightly more efficient for most operations than a normal mutable stack.
A hash table stores its elements in an underlying array, placing each item at a position in the array determined by the hash code of that item. Adding an element to a hash table takes only constant time, so long as there isn't already another element in the array that has the same hash code. Hash tables are thus very fast so long as the objects placed in them have a good distribution of hash codes. As a result, the default mutable map and set types in Scala are based on hash tables.
Hash sets and maps are used just like any other set or map. Here are some simple examples:
scala> val map = collection.mutable.HashMap.empty[Int,String] map: scala.collection.mutable.HashMap[Int,String] = Map()
scala> map += (1 -> "make a web site") res59: map.type = Map((1,make a web site))
scala> map += (3 -> "profit!") res60: map.type = Map((1,make a web site), (3,profit!))
scala> map(1) res61: String = make a web site
scala> map contains 2 res62: Boolean = false
Iteration over a hash table is not guaranteed to occur in any particular order. Iteration simply proceeds through the underlying array in whichever order it happens to be. To get a guaranteed iteration order, use a linked hash map or set instead of a regular one. A linked hash map or set is just like a regular hash map or set except that it also includes a linked list of the elements in the order they were added. Iteration over such a collection is always in the same order that the elements were initially added.
A weak hash map is a special kind of hash map in which the garbage collector does not follow links from the map to the keys stored in it. This means that a key and its associated value will disappear from the map if there is no other reference to that key. Weak hash maps are useful for tasks such as caching, where you want to re-use an expensive function's result if the function is called again on the same key. If keys and function results are stored in a regular hash map, the map could grow without bounds, and no key would ever become garbage. Using a weak hash map avoids this problem. As soon as a key object becomes unreachable, it's entry is removed from the weak hash map. Weak hash maps in Scala are implemented as a wrapper of an underlying Java implementation, java.util.WeakHashMap.
A concurrent map can be accessed by several threads at once. In addition to the usual Map operations, it provides the following atomic operations:
ConcurrentMap is a trait in the Scala collections library. Currently, its only implementation is Java's java.util.concurrent.ConcurrentMap, which can be converted automatically into a Scala map using the standard Java/Scala collection conversions, which will be described in Section 24.18.
A mutable bit set is just like an immutable one, except that it can be modified in place. Mutable bit sets are slightly more efficient at updating than immutable ones, because they don't have to copy around Longs that haven't changed. Here is an example:
scala> val bits = scala.collection.mutable.BitSet.empty bits: scala.collection.mutable.BitSet = BitSet()
scala> bits += 1 res63: bits.type = BitSet(1)
scala> bits += 3 res64: bits.type = BitSet(1, 3)
scala> bits res65: scala.collection.mutable.BitSet = BitSet(1, 3)
Arrays are a special kind of collection in Scala. One the one hand, Scala arrays correspond one-to-one to Java arrays. That is, a Scala array Array[Int] is represented as a Java int[], an Array[Double] is represented as a Java double[] and an Array[String] is represented as a Java String[]. But at the same time, Scala arrays offer much more their Java analogues. First, Scala arrays can be generic. That is, you can have an Array[T], where T is a type parameter or abstract type. Second, Scala arrays are compatible with Scala sequences—you can pass an Array[T] where a Seq[T] is required. Finally, Scala arrays also support all sequence operations. Here's an example of this in action:
scala> val a1 = Array(1, 2, 3) a1: Array[Int] = Array(1, 2, 3)
scala> val a2 = a1 map (_ * 3) a2: Array[Int] = Array(3, 6, 9)
scala> val a3 = a2 filter (_ % 2 != 0) a3: Array[Int] = Array(3, 9)
scala> a3.reverse res1: Array[Int] = Array(9, 3)
Given that Scala arrays are represented just like Java arrays, how can these additional features be supported in Scala? In fact, the answer to this question differs between Scala 2.8 and earlier versions. Previously, the Scala compiler somewhat "magically" wrapped and unwrapped arrays to and from Seq objects, when required, in a process called boxing and unboxing. The details of this were quite complicated, in particular when you created a new array of generic type Array[T]. There were some puzzling corner cases and the performance of array operations was not all that predictable.
The Scala 2.8 design is much simpler. Almost all compiler magic is gone. Instead the Scala 2.8 array implementation makes systematic use of implicit conversions. In Scala 2.8 an array does not pretend to be a sequence. It can't really be that because the data type representation of a native array is not a subtype of Seq. Instead there is an implicit "wrapping" conversion between arrays and instances of class scala.collection.mutable.WrappedArray, which is a subclass of Seq. Here you see it in action:
scala> val seq: Seq[Int] = a1 seq: Seq[Int] = WrappedArray(1, 2, 3)
scala> val a4: Array[Int] = seq.toArray a4: Array[Int] = Array(1, 2, 3)
scala> a1 eq a4 res2: Boolean = true
This interaction demonstrates that arrays are compatible with sequences, because there's an implicit conversion from Array to WrappedArray. To go the other way, from a WrappedArray to an Array, you can use the toArray method defined in Traversable. The last interpreter line above shows that wrapping then unwrapping with toArray gives you back the same array you started with.
There is yet another implicit conversion that gets applied to arrays. This conversion simply "adds" all sequence methods to arrays but does not turn the array itself into a sequence. "Adding" means that the array is wrapped in another object of type ArrayOps, which supports all sequence methods. Typically, this ArrayOps object is short-lived; it will usually be inaccessible after the call to the sequence method and its storage can be recycled. Modern VMs often avoid creating this object entirely.
The difference between the two implicit conversions on arrays is demonstrated here:
scala> val seq: Seq[Int] = a1 seq: Seq[Int] = WrappedArray(1, 2, 3)
scala> seq.reverse res2: Seq[Int] = WrappedArray(3, 2, 1)
scala> val ops: collection.mutable.ArrayOps[Int] = a1 ops: scala.collection.mutable.ArrayOps[Int] = [I(1, 2, 3)
scala> ops.reverse res3: Array[Int] = Array(3, 2, 1)
You see that calling reverse on seq, which is a WrappedArray, will give again a WrappedArray. That's logical, because wrapped arrays are Seqs, and calling reverse on any Seq will give again a Seq. On the other hand, calling reverse on the ops value of class ArrayOps will result in an Array, not a Seq.
The ArrayOps example above was quite artificial, intended only to show the difference to WrappedArray. Normally, you'd never define a value of class ArrayOps. You'd just call a Seq method on an array:
scala> a1.reverse
res4: Array[Int] = Array(3, 2, 1)
The ArrayOps object gets inserted automatically by the implicit conversion. So the line above is equivalent to the following line, where intArrayOps was the conversion that was implicitly inserted previously:
scala> intArrayOps(a1).reverse
res5: Array[Int] = Array(3, 2, 1)
This raises the question how the compiler picked intArrayOps over the other implicit conversion to WrappedArray in the line above. After all, both conversions map an array to a type that supports a reverse method, which is what the input specified. The answer to that question is that the two implicit conversions are prioritized. The ArrayOps conversion has a higher priority than the WrappedArray conversion. The first is defined in the Predef object whereas the second is defined in a class scala.LowPriorityImplicits, which is a superclass of Predef. Implicits in subclasses and subobjects take precedence over implicits in base classes. So if both conversions are applicable, the one in Predef is chosen. A very similar scheme, which was described in Section 21.7, works for strings.
So now you know how arrays can be compatible with sequences and how they can support all sequence operations. What about genericity? In Java you cannot write a T[] where T is a type parameter. How then is Scala's Array[T] represented? In fact a generic array like Array[T] could be at run-time any of Java's eight primitive array types byte[], short[], char[], int[], long[], float[], double[], boolean[], or it could be an array of objects. The only common run-time type encompassing all of these types is AnyRef (or, equivalently java.lang.Object), so that's the type to which the Scala compiler maps Array[T]. At run-time, when an element of an array of type Array[T] is accessed or updated there is a sequence of type tests that determine the actual array type, followed by the correct array operation on the Java array. These type tests slow down array operations somewhat. You can expect accesses to generic arrays to be three to four times slower than accesses to primitive or object arrays. This means that if you need maximal performance, you should prefer concrete over generic arrays.
Representing the generic array type is not enough, however, there must also be a way to create generic arrays. This is an even harder problem, which requires a little bit of help from you. To illustrate the problem, consider the following attempt to write a generic method that creates an array:
// This is wrong! def evenElems[T](xs: Vector[T]): Array[T] = { val arr = new Array[T]((xs.length + 1) / 2) for (i <- 0 until xs.length by 2) arr(i / 2) = xs(i) arr }
The evenElems method returns a new array that consists of all elements of the argument vector xs that are at even positions in the vector. The first line of the body of evenElems creates the result array, which has the same element type as the argument. So depending on the actual type parameter for T, this could be an Array[Int], or an Array[Boolean], or an array of some of the other primitive types in Java, or an array of some reference type. But these types all have different runtime representations, so how is the Scala runtime going to pick the correct one? In fact, it can't do that based on the information it is given, because the actual type that corresponds to the type parameter T is erased at runtime. That's why you will get the following error message if you attempt to compile the code above:
error: cannot find class manifest for element type T val arr = new Array[T]((arr.length + 1) / 2) ^
What's required here is that you help the compiler by providing a runtime hint of what the actual type parameter of evenElems is. This runtime hint takes the form of a class manifest of type scala.reflect.ClassManifest. A class manifest is a type descriptor object that describes what the top-level class of a type is. Alternatively to class manifests there are also full manifests of type scala.reflect.Manifest, which describe all aspects of a type. But for array creation, only class manifests are needed.
The Scala compiler will generate code to construct and pass class manifests automatically if you instruct it to do so. "Instructing" means that you demand a class manifest as an implicit parameter, like this:
def evenElems[T](xs: Vector[T]) (implicit m: ClassManifest[T]): Array[T] = ...
Using an alternative and shorter syntax, you can also demand that the type comes with a class manifest by using a context bound. This means following the type with a colon and the class name ClassManifest, like this:
// This works def evenElems[T: ClassManifest](xs: Vector[T]): Array[T] = { val arr = new Array[T]((xs.length + 1) / 2) for (i <- 0 until xs.length by 2) arr(i / 2) = xs(i) arr }
The two revised versions of evenElems mean exactly the same. What happens in either case is that when the Array[T] is constructed, the compiler will look for a class manifest for the type parameter T, that is, it will look for an implicit value of type ClassManifest[T]. If such a value is found, the manifest is used to construct the right kind of array. Otherwise, you'll see an error message like the one shown previously.
Here is an interpreter interaction that uses the evenElems method:
scala> evenElems(Vector(1, 2, 3, 4, 5)) res6: Array[Int] = Array(1, 3, 5)
scala> evenElems(Vector("this", "is", "a", "test", "run")) res7: Array[java.lang.String] = Array(this, a, run)
In both cases, the Scala compiler automatically constructed a class manifest for the element type (first Int, then String) and passed it to the implicit parameter of the evenElems method. The compiler can do that for all concrete types, but not if the argument is itself another type parameter without its class manifest. For instance, the following fails:
scala> def wrap[U](xs: Vector[U]) = evenElems(xs) <console>:6: error: could not find implicit value for evidence parameter of type ClassManifest[U] def wrap[U](xs: Vector[U]) = evenElems(xs) ^
What happened here is that the evenElems demands a class manifest for the type parameter U, but none was found. The solution in this case is, of course, to demand another implicit class manifest for U. So the following works:
scala> def wrap[U: ClassManifest](xs: Vector[U]) = evenElems(xs) wrap: [U](xs: Vector[U])(implicit evidence$1: ClassManifest[U])Array[U]
This example also shows that the context bound in the definition of U is just a shorthand for an implicit parameter named here evidence$1 of type ClassManifest[U].
In summary, generic array creation demands class manifests. Whenever you create an array of a type parameter T, you also need to provide an implicit class manifest for T. The easiest way to do this is to declare the type parameter with a ClassManifest context bound, as in [T: ClassManifest].
Like arrays, strings are not directly sequences, but they can be converted to them, and they also support all sequence operations. Here are some examples of operations you can invoke on strings:
scala> val str = "hello" str: java.lang.String = hello
scala> str.reverse res6: String = olleh
scala> str.map(_.toUpper) res7: String = HELLO
scala> str drop 3 res8: String = lo
scala> str slice (1, 4) res9: String = ell
scala> val s: Seq[Char] = str s: Seq[Char] = WrappedString(h, e, l, l, o)
These operations are supported by two implicit conversions, which were explained in Section 21.7. The first, low-priority conversion maps a String to a WrappedString, which is a subclass of immutable.IndexedSeq. This conversion was applied in the last line of the previous example in which a string was converted into a Seq. The other, high-priority conversion maps a string to a StringOps object, which adds all methods on immutable sequences to strings. This conversion was implicitly inserted in the method calls of reverse, map, drop, and slice in the previous example.
As the previous explanations have shown, different collection types have different performance characteristics. That's often the primary reason for picking one collection type over another. You can see the performance characteristics of some common operations on collections summarized in two tables, Table 24.13 and Table 24.13.
The entries in these two tables are explained as follows:
C | The operation takes (fast) constant time. |
eC | The operation takes effectively constant time, but this might depend on some assumptions such as the maximum length of a vector or the distribution of hash keys. |
aC | The operation takes amortized constant time. Some invocations of the operation might take longer, but if many operations are performed on average only constant time per operation is taken. |
Log | The operation takes time proportional to the logarithm of the collection size. |
L | The operation is linear, that is it takes time proportional to the collection size. |
- | The operation is not supported. |
Table 24.13 treats sequence types—both immutable and mutable—with the following operations:
head | Selecting the first element of the sequence. |
tail | Producing a new sequence that consists of all elements except the first one. |
apply | Indexing. |
update | Functional update (with updated) for immutable sequences, side-effecting update (with update) for mutable sequences. |
prepend | Adding an element to the front of the sequence. For immutable sequences, this produces a new sequence. For mutable sequences it modifies the existing sequence. |
append | Adding an element at the end of the sequence. For immutable sequences, this produces a new sequence. For mutable sequences it modifies the existing sequence. |
insert | Inserting an element at an arbitrary position in the sequence. This is only supported directly for mutable sequences. |
Table 24.13 treats mutable and immutable sets and maps with the following operations:
lookup | Testing whether an element is contained in set, or selecting a value associated with a key. |
add | Adding a new element to a set or a new key/value pair to a map. |
remove | Removing an element from a set or a key from a map. |
min | The smallest element of the set, or the smallest key of a map. |
head | tail | apply | update | prepend | append | insert | |
immutable | |||||||
List | C | C | L | L | C | L | - |
Stream | C | C | L | L | C | L | - |
Vector | eC | eC | eC | eC | eC | eC | - |
Stack | C | C | L | L | C | L | - |
Queue | aC | aC | L | L | L | C | - |
Range | C | C | C | - | - | - | - |
String | C | L | C | L | L | L | - |
mutable | |||||||
ArrayBuffer | C | L | C | C | L | aC | L |
ListBuffer | C | L | L | L | C | C | L |
StringBuilder | C | L | C | C | L | aC | L |
MutableList | C | L | L | L | C | C | L |
Queue | C | L | L | L | C | C | L |
ArraySeq | C | L | C | C | - | - | - |
Stack | C | L | L | L | C | L | L |
ArrayStack | C | L | C | C | aC | L | L |
Array | C | L | C | C | - | - | - |
lookup | add | remove | min | |
immutable | ||||
HashSet/HashMap | eC | eC | eC | L |
TreeSet/TreeMap | Log | Log | Log | Log |
BitSet | C | L | L | eC[6] |
ListMap | L | L | L | L |
mutable | ||||
HashSet/HashMap | eC | eC | eC | L |
WeakHashMap | eC | eC | eC | L |
BitSet | C | aC | C | eC^a |
The collection libraries have a uniform approach to equality and hashing. The idea is, first, to divide collections into sets, maps, and sequences. Collections in different categories are always unequal. For instance, Set(1, 2, 3) is unequal to List(1, 2, 3) even though they contain the same elements. On the other hand, within the same category, collections are equal if and only if they have the same elements (for sequences: the same elements in the same order). For example, List(1, 2, 3) == Vector(1, 2, 3), and HashSet(1, 2) == TreeSet(2, 1).
It does not matter for the equality check whether a collection is mutable or immutable. For a mutable collection, equality simply depends on the current elements at the time the equality test is performed. This means that a mutable collection might be equal to different collections at different times, depending what elements are added or removed. This is a potential trap when using a mutable collection as a key in a hash map. For example:
scala> import collection.mutable.{HashMap, ArrayBuffer} import collection.mutable.{HashMap, ArrayBuffer}
scala> val buf = ArrayBuffer(1, 2, 3) buf: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)
scala> val map = HashMap(buf -> 3) map: scala.collection.mutable.HashMap[scala.collection. mutable.ArrayBuffer[Int],Int] = Map((ArrayBuffer(1, 2, 3),3))
scala> map(buf) res13: Int = 3
scala> buf(0) += 1
scala> map(buf) java.util.NoSuchElementException: key not found: ArrayBuffer(2, 2, 3)
In this example, the selection in the last line will most likely fail because the hash code of the array xs has changed in the second-to-last line. Therefore, the hash-code-based lookup will look at a different place than the one in which xs was stored.
Collections have quite a few methods that construct new collections. Some examples are map, filter, and ++. We call such methods transformers because they take at least one collection as their receiver object and produce another collection in their result.
Transformers can be implemented in two principal ways: strict and non-strict (or lazy). A strict transformer constructs a new collection with all of its elements. A non-strict, or lazy, transformer constructs only a proxy for the result collection, and its elements are constructed on demand.
As an example of a non-strict transformer, consider the following implementation of a lazy map operation:
def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U] { def iterator = coll.iterator map f }
Note that lazyMap constructs a new Iterable without stepping through all elements of the given collection coll. The given function f is instead applied to the elements of the new collection's iterator as they are demanded.
Scala collections are by default strict in all their transformers, except for Stream, which implements all its transformer methods lazily. However, there is a systematic way to turn every collection into a lazy one and vice versa, which is based on collection views. A view is a special kind of collection that represents some base collection, but implements all of its transformers lazily.
To go from a collection to its view, you can use the view method on the collection. If xs is some collection, then xs.view is the same collection, but with all transformers implemented lazily. To get back from a view to a strict collection, you can use the force method.
As an example, say you have a vector of Ints over which you want to map two functions in succession:
scala> val v = Vector(1 to 10: _*) v: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> v map (_ + 1) map (_ * 2) res5: scala.collection.immutable.Vector[Int] = Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22)
In the last statement, the expression v map (_ + 1) constructs a new vector that is then transformed into a third vector by the second call to map (_ * 2). In many situations, constructing the intermediate result from the first call to map is a bit wasteful. In the pseudo example, it would be faster to do a single map with the composition of the two functions (_ + 1) and (_ * 2). If you have the two functions available in the same place you can do this by hand. But quite often, successive transformations of a data structure are done in different program modules. Fusing those transformations would then undermine modularity. A more general way to avoid the intermediate results is by turning the vector first into a view, applying all transformations to the view, and finally forcing the view to a vector:
scala> (v.view map (_ + 1) map (_ * 2)).force res12: Seq[Int] = Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22)
We'll do this sequence of operations again, one by one:
scala> val vv = v.view vv: scala.collection.SeqView[Int,Vector[Int]] = SeqView(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
The application v.view gives you a SeqView, i.e., a lazily evaluated Seq. The type SeqView has two type parameters. The first, Int, shows the type of the view's elements. The second, Vector[Int], shows you the type constructor you get back when forcing the view.
Applying the first map to the view gives you:
scala> vv map (_ + 1) res13: scala.collection.SeqView[Int,Seq[_]] = SeqViewM(...)
The result of the map is a value that prints SeqViewM(...). This is in essence a wrapper that records the fact that a map with function (_ + 1) needs to be applied on the vector v. It does not apply that map until the view is forced, however. The "M" after SeqView is an indication that the view encapsulates a map operation. Other letters indicate other delayed operations. For instance "S" indicates a delayed slice operation, and "R" indicates a reverse. We'll now apply the second map to the last result.
scala> res13 map (_ * 2) res14: scala.collection.SeqView[Int,Seq[_]] = SeqViewMM(...)
You now get a SeqView that contains two map operations, so it prints with a double "M": SeqViewMM(...). Finally, forcing the last result gives:
scala> res14.force
res15: Seq[Int] = Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22)
Both stored functions get applied as part of the execution of the force operation and a new vector is constructed. That way, no intermediate data structure is needed.
One detail to note is that the static type of the final result is a Seq, not a Vector. Tracing the types back we see that as soon as the first delayed map was applied, the result had static type SeqViewM[Int, Seq[_]]. That is, the "knowledge" that the view was applied to the specific sequence type Vector got lost. The implementation of a view for any particular class requires quite a bit of code, so the Scala collection libraries provide views mostly only for general collection types, not for specific implementations.[7]
There are two reasons why you might want to consider using views. The first is performance. You have seen that by switching a collection to a view the construction of intermediate results can be avoided. These savings can be quite important. As another example, consider the problem of finding the first palindrome in a list of words. A palindrome is a word that reads backwards the same as forwards. Here are the necessary definitions:
def isPalindrome(x: String) = x == x.reverse def findPalindrome(s: Seq[String]) = s find isPalindrome
Now, assume you have a very long sequence words and you want to find a palindrome in the first million words of that sequence. Can you re-use the definition of findPalindrome? Of course, you could write:
findPalindrome(words take 1000000)
This nicely separates the two aspects of taking the first million words of a sequence and finding a palindrome in it. But the downside is that it always constructs an intermediary sequence consisting of one million words, even if the first word of that sequence is already a palindrome. So potentially, 999,999 words are copied into the intermediary result without being inspected at all afterwards. Many programmers would give up here and write their own specialized version of finding palindromes in some given prefix of an argument sequence. But with views, you don't have to. Simply write:
findPalindrome(words.view take 1000000)
This has the same nice separation of concerns, but instead of a sequence of a million elements it will only construct a single lightweight view object. This way, you do not need to choose between performance and modularity.
The second use case applies to views over mutable sequences. Many transformer functions on such views provide a window into the original sequence that can then be used to update selectively some elements of that sequence. To see this in an example, suppose you have an array arr:
scala> val arr = (0 to 9).toArray arr: Array[Int] = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
You can create a subwindow into that array by creating a slice of a view of the array:
scala> val subarr = arr.view.slice(3, 6) subarr: scala.collection.mutable.IndexedSeqView[ Int,Array[Int]] = IndexedSeqViewS(...)
This gives a view, subarr, which refers to the elements at positions 3 through 5 of the array arr. The view does not copy these elements, it just provides a reference to them. Now, assume you have a method that modifies some elements of a sequence. For instance, the following negate method would negate all elements of the sequence of integers it's given:
scala> def negate(xs: collection.mutable.Seq[Int]) = for (i <- 0 until xs.length) xs(i) = -xs(i) negate: (xs: scala.collection.mutable.Seq[Int])Unit
Assume now you want to negate elements at positions three through five of the array arr. Can you use negate for this? Using a view, this is simple:
scala> negate(subarr)
scala> arr res4: Array[Int] = Array(0, 1, 2, -3, -4, -5, 6, 7, 8, 9)
What happened here is that negate changed all elements of subarr, which were a slice of the elements of arr. Again, you see that views help in keeping things modular. The code above nicely separated the question of what index range to apply a method to from the question what method to apply.
After having seen all these nifty uses of views you might wonder why have strict collections at all? One reason is that performance comparisons do not always favor lazy over strict collections. For smaller collection sizes the added overhead of forming and applying closures in views is often greater than the gain from avoiding the intermediary data structures. A possibly more important reason is that evaluation in views can be very confusing if the delayed operations have side effects.
Here's an example that bit a few users of versions of Scala before 2.8. In these versions the Range type was lazy, so it behaved in effect like a view. People were trying to create a number of actors[8] like this:
val actors = for (i <- 1 to 10) yield actor { ... }
They were surprised that none of the actors were executing afterwards, even though the actor method should create and start an actor from the code that's enclosed in the braces following it. To explain why nothing happened, remember that the for expression above is equivalent to an application of the map method:
val actors = (1 to 10) map (i => actor { ... })
Since previously the range produced by (1 to 10) behaved like a view, the result of the map was again a view. That is, no element was computed, and, consequently, no actor was created! Actors would have been created by forcing the range of the whole expression, but it's far from obvious that this is what was required to make the actors do their work.
To avoid surprises like this, the Scala 2.8 collections library has more regular rules. All collections except streams and views are strict. The only way to go from a strict to a lazy collection is via the view method. The only way to go back is via force. So the actors definition above would behave as expected in Scala 2.8 in that it would create and start ten actors. To get back the surprising previous behavior, you'd have to add an explicit view method call:
val actors = for (i <- (1 to 10).view) yield actor { ... }
In summary, views are a powerful tool to reconcile concerns of efficiency with concerns of modularity. But in order not to be entangled in aspects of delayed evaluation, you should restrict views to two scenarios. Either you apply views in purely functional code where collection transformations do not have side effects. Or you apply them over mutable collections where all modifications are done explicitly. What's best avoided is a mixture of views and operations that create new collections while also having side effects.
An iterator is not a collection, but rather a way to access the elements of a collection one by one. The two basic operations on an iterator it are next and hasNext. A call to it.next() will return the next element of the iterator and advance the state of the iterator. Calling next again on the same iterator will then yield the element one beyond the one returned previously. If there are no more elements to return, a call to next will throw a NoSuchElementException. You can find out whether there are more elements to return using Iterator's hasNext method.
The most straightforward way to "step through" all the elements returned by an iterator is to use a while loop:
while (it.hasNext)
println(it.next())
Iterators in Scala also provide analogues of most of the methods that you find in the Traversable, Iterable, and Seq traits. For instance, they provide a foreach method that executes a given procedure on each element returned by an iterator. Using foreach, the loop above could be abbreviated to:
it foreach println
As always, for expressions can be used as an alternate syntax for expressions involving foreach, map, filter, and flatMap, so yet another way to print all elements returned by an iterator would be:
for (elem <- it) println(elem)
There's an important difference between the foreach method on iterators and the same method on traversable collections: When called on an iterator, foreach will leave the iterator at its end when it is done. So calling next again on the same iterator will fail with a NoSuchElementException. By contrast, when called on a collection, foreach leaves the number of elements in the collection unchanged (unless the passed function adds or removes elements, but this is discouraged, because it can easily lead to surprising results).
The other operations that Iterator has in common with Traversable have the same property of leaving the iterator at its end when done. For instance, iterators provide a map method, which returns a new iterator:
scala> val it = Iterator("a", "number", "of", "words") it: Iterator[java.lang.String] = non-empty iterator
scala> it.map(_.length) res1: Iterator[Int] = non-empty iterator
scala> res1 foreach println 1 6 2 5
scala> it.next() java.util.NoSuchElementException: next on empty iterator
As you can see, after the call to map, the it iterator has advanced to its end.
Another example is the dropWhile method, which can be used to find the first element of an iterator that has a certain property. For instance, to find the first word in the iterator shown previously that has at least two characters, you could write:
scala> val it = Iterator("a", "number", "of", "words") it: Iterator[java.lang.String] = non-empty iterator
scala> it dropWhile (_.length < 2) res4: Iterator[java.lang.String] = non-empty iterator
scala> it.next() res5: java.lang.String = number
Note again that it has changed by the call to dropWhile: it now points to the second word "number" in the list. In fact, it and the result res4 returned by dropWhile will return exactly the same sequence of elements.
There is only one standard operation, duplicate, which allows you to re-use the same iterator:
val (it1, it2) = it.duplicate
The call to duplicate gives you two iterators, which each return exactly the same elements as the iterator it. The two iterators work independently; advancing one does not affect the other. By contrast the original iterator, it, is advanced to its end by duplicate and is thus rendered unusable.
In summary, iterators behave like collections if you never access an iterator again after invoking a method on it. The Scala collection libraries make this explicit with an abstraction called TraversableOnce, which is a common supertrait of Traversable and Iterator. As the name implies, TraversableOnce objects can be traversed using foreach, but the state of that object after the traversal is not specified. If the TraversableOnce object is in fact an Iterator, it will be at its end after the traversal, but if it is a Traversable, it will still exist as before. A common use case of TraversableOnce is as an argument type for methods that can take either an iterator or traversable as argument. An example is the appending method ++ in trait Traversable. It takes a TraversableOnce parameter, so you can append elements coming from either an iterator or a traversable collection.
All operations on iterators are summarized in Table 24.12:
Sometimes you want an iterator that can "look ahead" so that you can inspect the next element to be returned without advancing past that element. Consider, for instance, the task to skip leading empty strings from an iterator that returns a sequence of strings. You might be tempted to write something like the following method:
// This won't work def skipEmptyWordsNOT(it: Iterator[String]) { while (it.next().isEmpty) {} }
But looking at this code more closely, it's clear that this is wrong: the code will indeed skip leading empty strings, but it will also advance it past the first non-empty string!
The solution to this problem is to use a buffered iterator, an instance of trait BufferedIterator. BufferedIterator is a subtrait of Iterator, which provides one extra method, head. Calling head on a buffered iterator will return its first element, but will not advance the iterator. Using a buffered iterator, skipping empty words can be written like this:
def skipEmptyWords(it: BufferedIterator[String]) = while (it.head.isEmpty) { it.next() }
Every iterator can be converted to a buffered iterator by calling its buffered method. Here's an example:
scala> val it = Iterator(1, 2, 3, 4) it: Iterator[Int] = non-empty iterator
scala> val bit = it.buffered bit: java.lang.Object with scala.collection. BufferedIterator[Int] = non-empty iterator
scala> bit.head res10: Int = 1
scala> bit.next() res11: Int = 1
scala> bit.next() res11: Int = 2
Note that calling head on the buffered iterator, bit, did not advance it. Therefore, the subsequent call, bit.next(), returned again the same value as bit.head.
You have already seen syntax like List(1, 2, 3), which creates a list of three integers, and Map('A' -> 1, 'C' -> 2), which creates a map with two bindings. This is actually a universal feature of Scala collections. You can take any collection name and follow it by a list of elements in parentheses. The result will be a new collection with the given elements. Here are some more examples:
Traversable() // An empty traversable object List() // The empty list List(1.0, 2.0) // A list with elements 1.0, 2.0 Vector(1.0, 2.0) // A vector with elements 1.0, 2.0 Iterator(1, 2, 3) // An iterator returning three integers. Set(dog, cat, bird) // A set of three animals HashSet(dog, cat, bird) // A hash set of the same animals Map('a' -> 7, 'b' -> 0) // A map from characters to integers
"Under the covers" each of the above lines is a call to the apply method of some object. For instance, the third line above expands to:
List.apply(1.0, 2.0)
So this is a call to the apply method of the companion object of the List class. That method takes an arbitrary number of arguments and constructs a list from them. Every collection class in the Scala library has a companion object with such an apply method. It does not matter whether the collection class represents a concrete implementation, like List, Stream, or Vector, or whether it is an trait such as Seq, Set, or Traversable. In the latter case, calling apply will produce some default implementation of the trait. Here are some examples:
scala> List(1, 2, 3) res17: List[Int] = List(1, 2, 3)
scala> Traversable(1, 2, 3) res18: Traversable[Int] = List(1, 2, 3)
scala> mutable.Traversable(1, 2, 3) res19: scala.collection.mutable.Traversable[Int] = ArrayBuffer(1, 2, 3)
Besides apply, every collection companion object also defines a member empty, which returns an empty collection. So instead of List() you could write List.empty, instead of Map(), Map.empty, and so on.
Descendants of Seq traits also provide other factory operations in their companion objects. These are summarized in Table 24.13. In short, there's:
Like Scala, Java has a rich collections library. There are many similarities between the two. For instance, both libraries know iterators, iterables, sets, maps, and sequences. But there are also important differences. In particular, the Scala libraries put much more emphasis on immutable collections, and provide many more operations that transform a collection into a new one.
Sometimes you might need to convert from one collection framework to the other. For instance, you might want to access to an existing Java collection, as if it were a Scala collection. Or you might want to pass one of Scala's collections to a Java method that expects the Java counterpart. It is quite easy to do this, because Scala offers implicit conversions between all the major collection types in the JavaConversions object. In particular, you will find bidirectional conversions between the following types:
Iterator Leftrightarrow java.util.Iterator Iterator Leftrightarrow java.util.Enumeration Iterable Leftrightarrow java.lang.Iterable Iterable Leftrightarrow java.util.Collection mutable.Buffer Leftrightarrow java.util.List mutable.Set Leftrightarrow java.util.Set mutable.Map Leftrightarrow java.util.Map
To enable these conversions, simply import like this:
scala> import collection.JavaConversions._ import collection.JavaConversions._
You have now automatic conversions between Scala collections and their corresponding Java collections.
scala> import collection.mutable._ import collection.mutable._
scala> val jul: java.util.List[Int] = ArrayBuffer(1, 2, 3) jul: java.util.List[Int] = [1, 2, 3]
scala> val buf: Seq[Int] = jul buf: scala.collection.mutable.Seq[Int] = ArrayBuffer(1, 2, 3)
scala> val m: java.util.Map[String, Int] = HashMap("abc" -> 1, "hello" -> 2) m: java.util.Map[String,Int] = {hello=2, abc=1}
Internally, these conversion work by setting up a "wrapper" object that forwards all operations to the underlying collection object. So collections are never copied when converting between Java and Scala. An interesting property is that if you do a round-trip conversion from, say, a Java type to its corresponding Scala type, and back to the same Java type, you end up with the identical collection object you started with.
Some other common Scala collections exist that can also be converted to Java types, but for which no corresponding conversion exists in the other direction. These are:
Seq Rightarrow java.util.List mutable.Seq Rightarrow java.util.List Set Rightarrow java.util.Set Map Rightarrow java.util.Map
Because Java does not distinguish between mutable and immutable collections in their type, a conversion from, say, collection.immutable.List will yield a java.util.List, on which all attempted mutation operations will throw an UnsupportedOperationException. Here's an example:
scala> val jul: java.util.List[Int] = List(1, 2, 3) jul: java.util.List[Int] = [1, 2, 3]
scala> jul.add(7) java.lang.UnsupportedOperationException at java.util.AbstractList.add(AbstractList.java:131)
If you have existing applications written in Scala 2.7, porting them to use the new collections should be almost automatic. There are only a couple of possible issues to take care of.
Generally, the old functionality of Scala 2.7 collections has been left in place. Some features have been deprecated, which means they will removed in some future release. You will get a deprecation warning when you compile code that makes use of these features in Scala 2.8. In a few places deprecation was unfeasible, because the operation in question was retained in 2.8, but changed in meaning or performance characteristics. These cases will be flagged with migration warnings when compiled under 2.8. To get full deprecation and migration warnings with suggestions how to change your code, pass the -deprecation and -Xmigration flags to scalac.[9] You can also pass the same options to the scala interpreter to get the warnings in an interactive session. Example:
>scala -deprecation -Xmigration Welcome to Scala version 2.8.1. Type in expressions to have them evaluated. Type :help for more information.
scala> val xs = List((1, 2), (3, 4)) xs: List[(Int, Int)] = List((1,2), (3,4))
scala> List.unzip(xs) <console>:7: warning: method unzip in object List is deprecated: use xs.unzip instead of List.unzip(xs) List.unzip(xs) ^ res0: (List[Int], List[Int]) = (List(1, 3),List(2, 4))
scala> xs.unzip res1: (List[Int], List[Int]) = (List(1, 3),List(2, 4))
scala> val m = xs.toMap m: scala.collection.immutable.Map[Int,Int] = Map((1,2), (3,4))
scala> m.keys <console>:8: warning: method keys in trait MapLike has changed semantics: As of 2.8, keys returns Iterable[A] rather than Iterator[A]. m.keys ^ res2: Iterable[Int] = Set(1, 3)
Two parts of the old libraries were replaced wholesale. For these deprecation warnings were not feasible.
So, if your code uses either jcl or projections there might be some minor rewriting to do.
You've now seen how to use Scala's collection in great detail. Scala's collections take the approach of giving you powerful building blocks rather than a number of ad hoc utility methods. Putting together two or three such building blocks allows you to express an enormous number of useful computations. This style of library is especially effective due to Scala having a light syntax for function literals, and due to it providing many collection types that are persistent and immutable.
This chapter has shown collections from the point of view of a programmer using the collection library. The next chapter will show you how the collections are built and how you can add your own collection types.
[1] Partial functions were described in Section 15.7.
[2] Hash tries are described in Section 24.9.
[3] Immutable bit sets of elements in the range of 0 to 127 optimize the array away and store the bits directly in a one or two Long fields.
[4] "Trie" comes from the word "retrieval" and is pronounced tree or try.
[5] The "buf.type" that appears in the interpreter responses in this and several other examples in this section is a singleton type. As will be explained in Section 29.6, buf.type means the variable holds exactly the object referred to by buf.
[6] Assuming bits are densely packed.
[7] An exception to this is arrays: applying delayed operations on arrays will again give results with static type Array.
[8] An actor is a thread that can communicate with message passing; see Chapter 32.
[9] Note that -Xmigration is an extended option, so it starts with an X.