The JDBC
ResultSet
object plays very badly with Scala collections. The only real way of doing anything useful with it is to loop through it directly with a while
loop. For instance, to get a list of the names of physicists in our database, we could write the following code:
// WARNING: poor Scala code import Implicits._ // import implicit conversions SqlUtils.usingConnection("test") { connection => connection.withQuery("SELECT * FROM physicists") { resultSet => var names = List.empty[String] while(resultSet.next) { val name = resultSet.getString("name") names = name :: names } names } } //=> List[String] = List(Paul Dirac, Albert Einstein, Marie Curie, Richard Feynman, Isaac Newton)
The ResultSet
interface feels unnatural because it behaves very differently from Scala collections. In particular, it does not support the higher-order functions that we take for granted in Scala: no map
, filter
, fold
, or for
comprehensions. Thankfully, writing a stream that wraps ResultSet
is quite straightforward. A Scala stream is a lazily evaluated list: it evaluates the next element in the collection when it is needed and forgets previous elements when they are no longer used.
We can define a stream
method that wraps ResultSet
as follows:
// SqlUtils.scala object SqlUtils { ... def stream(results:ResultSet):Stream[ResultSet] = if (results.next) { results #:: stream(results) } else { Stream.empty[ResultSet] } }
This might look quite confusing, so let's take it slowly. We define a stream
method that wraps ResultSet
, returning a Stream[ResultSet]
. When the client calls stream
on an empty result set, this just returns an empty stream. When the client calls stream
on a non-empty ResultSet
, the ResultSet
instance is advanced by one row, and the client gets back results #:: stream(results)
. The #::
operator on a stream is similar to the cons operator, ::
, on a list: it prepends results
to an existing Stream
. The critical difference is that, unlike a list, stream(results)
does not get evaluated until necessary. This, therefore, avoids duplicating the entire ResultSet
in memory.
Let's use our brand new stream
function to get the name of all the physicists in our database:
import Implicits._
SqlUtils.usingConnection("test") { connection =>
connection.withQuery("SELECT * FROM physicists") { results =>
val resultsStream = SqlUtils.stream(results)
resultsStream.map { _.getString("name") }.toVector
}
}
//=> Vector(Richard Feynman, Albert Einstein, Marie Curie, Paul Dirac)
Streaming the results, rather than using the result set directly, lets us interact with the data much more naturally as we are now dealing with just a Scala collection.
When you use stream
in a withQuery
block (or, generally, in a block that automatically closes the result set), you must always materialize the stream within the function, hence the call to toVector
. Otherwise, the stream will wait until its elements are needed to materialize them, and by then, the ResultSet
instance will be closed.