April 16, 2007
I stumbled across this presentation today. It was presented by Simon Peyton-Jones and talks about a new way of doing parallelism under Haskell. Much cooler than that STM nonsense.
The presentation explains things much better than I could, but in contract to STM, basically it’s moving the parallelism annotations away from the code and into the data types. So now we have the concept of a “parallel array” of floats. The key to the coolness is that parallelism can be nested, as is shown quite well in his parallel quicksort example. How does it work? Easy as pie is how it works!
Actually its operation is pretty straightforward, as long as you’re not the one implementing it. Nested parallelisms conceptually get “flattened” into one level of parallelisms. After that it’s all book-keeping to know when one operating should be dispatched to some other thread.
The other key to making it fast is that composed functions—and after the parallel code gets transformed, there are a lot of composed functions—get “fused” together. “Fusion” seems to be the new buzzword in Haskell. I swear, it must have shown up in just about every ICFP paper in the past few years. Here’s an example of fusion advocacy. Fusion basically seems to be the new wave of deforestation, which used to be the new wave of listlessness. In any case, it’s a fancy program transformation that allows you to specialize/inline one function according to another, thus eliminating intermediate products. It’s cool and only works when you don’t have side-effects.
In contrast to STM, I have to say, Nested Data Parallelism does look vaguely practical, even with a straightforward implementation. There’s even the obligatory “grain of salt” benchmark slide about a third of the way through demonstrating how awesome it is. In one of his first slides, too, he makes reference to running Haskell code on a GPU. I wonder what he actually said about that in his talk.
Anyway, if you skip down to the last slide, you see the typical Simon Peyton-Jones conclusion. Basically “there are programming problems that are too complex for a programmer to deal with himself. But there’s a solution! But this solution only works with pure functional languages! That’s why Haskell is the future! All we have left to do is implement it efficiently”. He may have his own mini-Jobs reality distortion field, but I actually believe him.
April 16, 2007 at 8:52 pm
One thing I think I should point out is that nested data parallelism and STM aren’t really competing. STM is about making concurrency easier, while NDP is about making pure parallelism easier.
All the computations which NDP applies to are computations which are fundamentally deterministic — the result doesn’t depend in any way on the order in which the parallel processes get executed.
STM is working with the somewhat more difficult assumption that threads are going to need to communicate with one another, and the order in which these communications occur can affect results. (It mitigates that effect by chunking things into transactions which occur as-if-atomically, but at the larger scale, that nondeterminism is still present, or else it just wouldn’t be concurrency.)
Interestingly, both techniques rely on the ability to control the scale and kind of side-effects present. With NDP, you essentially have no side effects, and with STM, you have restricted side effects inside transactions (basically only reading and writing of transactional variables), and unrestricted side effects in the thread as a whole.