Exploring Streams in Scala
In this blog post I'm going to explore Streams in Scala's Collection API. So first of all what is a Stream? A Stream
is a lazy evaluated list. This means that elements in a Stream
gets only evaluated when they are needed. Therefore Streams
can be infinite while strict collections cannot. A Stream
can also be seen as an immutable Iterator
.
Creating a Stream
To create a Stream
at least two values must be given:
- an initial value
- a function to compute the next value
For demonstration of an infinite Stream
I'll choose a Stream
of natural numbers (0, 1, 2, ...). The set of natural numbers starts with zero what is the initial value of our Stream
. The successor of the initial value can easily computed by n+1
. Then the successor of n+1
is n+1+1
and so forth. To build such a Stream
we could write the following function:
def from(start: Int): Stream[Int] = Stream.cons(start, from(start + 1))
The Stream
is created by calling the cons
function. It takes two parameters. The first parameter is the initial value. The second parameter takes a function which returns another Stream
. The cons call returns a new Stream
. The Stream
consists of the initial value and the “rest“. The “rest” will only be evaluated when needed.
Ok, I lied a bit. cons
feels like a function but in reality it is a object with an apply
method. When we call Stream.cons(...)
we call the apply
method. Written out in full it would be Stream.cons.apply(...)
but we don‘t really want to write it like that.
The same function can also be written in a shorter notion using the #:: operator:
def from(start: Int): Stream[Int] = start #:: from(start + 1)
Such common builder functions like from
, range
, continually
and so on for the creation of Stream
s can be found in the Stream object.
Now consider the following example:
val nn = from(0) println(nn.take(10).mkString(","))
A variable nn
is created and an infinite Stream
of natural numbers is assigned. At this point the from
function is only called once. Recursive calls of from(start + 1)
doesn't happen at this point. The next line takes the first 10 elements of the Stream
and prints them to the console. Now guess how many times the from function will be called? Right 9 times, because the initial value (zero) has already been initialised.
0|1|2|3|4|5|6|7|8|9|...
(Just call nn.toString
on the REPL to verify)
Filtering/ Mapping Stream
s
A Stream
can be used like every other collection because it is of type LinearSeq
. Operations like filter and map are lazy evalutated. They simply return another Stream
and perform the operation when required. I think we can say that all operations that return another Stream can be considered as lazy operations (like filter, map). Of course, operations like foldLeft, foldRight, find and exists are not lazy.
For example to filter even values:
val even = nn filter (_ % 2 == 0) println(even.take(10).mkString(","))
The 1st line calls filter on the Stream
of natual numbers. Calling the filter method will return a new Stream
. The new Stream
uses the original Stream
as its input and only returns elements for which the filter conditiion holds true. In short: filter is lazy evaluated. The second line prints the first 10 even elements to the console. This should be: 0,2,4,6,8,10,12,14,16,18
Now the first 20 elements of the natural number‘s Stream
has been initialized:
0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|...
Streams can be used in pattern matching
A Stream
can be used in pattern matching. To do so you can use the #:: extractor. The following example prints out the string "matched" on the console if the first two elements of our nn Stream
is 0 and 1. And yes, it does :-)
nn match { case 0 #:: 1 #:: _ => println("matched") }
Conclusion
Stream
is the immutable equivalent to Iterator
. While an Iterator
doesn't keep computated values Stream
does. So as with any immutable data structure you don't have to care about state with Streams
. This is useful when you pass a Stream
around to other functions. If you do so with a Iterator
you have to care about state what needs much more caution and can lead to logic errors. The immutability of Streams
makes it also quite easy to compose them. Keep in mind that a Stream
has a higher memory footprint than an Iterator
because it keeps the computed elements.