In my programming languages class, I have come to the point where I talk about language and library support for concurrent programming. As I revise my slides about the fork-join framework, I need to do something about the bullet that Java 7 will have parallel arrays. Sadly, it won't, but Scala 2.9 does, and they are really easy to use. Read on if you want to see how to keep your cores busy with just a few keystrokes.
In my programming languages course, we have come to the one subject that students fear more than continuations—concurrency. My inbox fills with desperate queries about programs that yield mysterious results, simply hang, or in the worst case, cause the computer to overheat and reset. Here in Vietnam, that's surprisingly easy with a multi-core laptop whose fan was calibrated for an air-conditioned room. When all those cores are busy and the room temperature is 30° C, the fan can't keep up.
I am revising the lecture about the fork-join framework. A couple of years I
told my students that the framework was going to be included in Java 7, and
that ParallelArray
would make it easy to parallelize common array
algorithms. Except that's not how it happened—ParallelArray
needs closures, and Java 7 isn't getting closures, so that part got dropped.
Scala is coming to the rescue—Scala 2.9 has parallel collections, and they are easy to use. Try it out:
val a = new Array[BigInt](1000000) for (i <- 0 until a.size) a(i) = i val sum = a.par.sum
The .par
method turns a regular collection into one whose
operations are parallelized.
Is it any faster? Sure it is. Use this function for timing a block of code:
def time(f: => Unit) = { val start = System.currentTimeMillis f System.currentTimeMillis - start }
Then run
time { a.par.sum } // I got 48 on a dual-core machine time { a.sum } // I got 85
Scala has closures, of course, so we can easily make queries:
val evens = a.par.count(_ % 2 == 0)
Parallelization works nicely with for
comprehensions. For
example,
for (i <- (1 to 10).par) { print(i + " ") } // Prints something like 3 6 7 8 1 5 9 4 10 2
Of course, that's the same as (1 to 10).par.foreach(print(i + "
"))
.
As you can see, multiple threads execute the block. In fact, that's what I should have done to keep my cores busy when initializing the array:
for (i <- (0 until a.size).par) a(i) = i
What's not to like? As always, Scala lets us do things that can't be right, such as
var count = 0 for (i <- (1 to 1000000).par) { if (i % 2 == 0) count += 1 } count // I got 373812, not 500000
Here, multiple threads try to increment the counter—a classic race
conditon. Scala programmers have to know not to do this, but to call
the count
method instead. It knows how to safely combine the
results.
In Clojure, this wouldn't be an issue. That will be the next lecture.