Even though Geoff tipped me off to it weeks ago, I just now got around to reading about it. It is Software Transactional Memory, and it is Haskell‘s way of dealing with shared memory in concurrent applications.

The motivation is right on the mark: shared memory currently is typically done through locks and condition variables. These are error-prone for a whole host of reasons, as everyone knows already: you might not lock enough locks; you might lock too many locks; you might lock locks in the wrong order; the list goes on. None of these problems will the compiler help you with and, sadly, even testing will rarely catch many of these errors. The bug might only arise years after the software has been released.

Simon Peyton-Jones’ solution is Software Transactional Memory (STM hereafter). It should not be a surprise, as it comes from Peyton-Jones, that STM is:

  1. extremely elegant;
  2. offers guaranteed semantics by the static typing system;
  3. uses monads to compose primitives together; and
  4. offers little in the way of an intuitive efficient implementation.

It’s unfortunate that point #4 so often is a consequence of points #1 through #3. But, I suppose my research is meant to address that. I’ve said it in other points, but it bears repeating: monads are awesome. Still today a lot of people have the impression that monads are only useful for I/O. This is really not true—in my own Haskell coding, I/O work is in the minority of work I do with monads. In a hand-waving way, monads provide a good way for describing how functions should be composed together. They’re an excellent structure for all sorts of things—and I’m convinced shared memory is one of those things—in enforcing some sort of semantics.

In the case of STM, the semantics we’re enforcing are what Peyton-Jones describes as atomicity and isolation. The first is self-explanatory: STM “actions” are atomic. The second is just supposed to mean that the actions of one thread don’t affect the actions of another. Again, it stands repetition: these semantics don’t follow from ideal programming practices or anything like that; they are strictly enforced by the typing system. It is not possible to not have these properties.

We have a couple of primitive STM actions: reading a “transactional variable” (shared data structure); writing to a transactional variable; creating a new transactional variable. We have some fancy functions to compose actions together as well, beyond those offered by the monad: we can block on some condition, and we can have an alternative action in case the first one is blocking. Composing many STM actions together gives you a bigger STM action.

Finally, we have a function to bring an action from the STM monad into the IO monad. Or, stated another way, we have a function to execute an STM action.

This is the composition model, and it provides the four properties I mentioned earlier. One property I didn’t mention is that this strictly prevents transactional variables from being accessed from outside an STM action. Hooray.

This is actually extremely cool. It gives a high-level view of shared memory which is good, and has some nice guaranteed semantics. The problem is with in the implementation. There are two naïve implementations of this:

  1. every time we execute an STM action, lock some big giant global lock. When we’re done, unlock that big giant global lock. This is slow and stupid and prevents any sort of interesting concurrency; or
  2. don’t lock until a transaction is complete. When executing an STM action, write its actions to a transaction log. When you’re done the log, lock, ensure the memory you were using was still consistent, commit the log, and unlock. The problem is that if the memory isn’t consistent, you have to re-execute the action. Not good.

What we’re missing is a way to express the fact that one STM action has no possible effect on another STM action. If we had that information, we could use smaller locks.

Overall the STM model is a good one. It takes a philosophy that I like: allow the programmer to express what’s going on at a very high level, and let the compiler worry about making sense of it. But it’s going to take a whole lot of inference for the compiler to separate these STM actions to avoid making global locks. Transaction variables can be passed around as parameters, and one transaction variable can contain another transaction variable—keep in mind these are data structures just like any other data structures—and except in the most obvious cases, it becomes too easy for the inference code to throw up its hands and say “you know what, this is just turning into a complete graph. I’m just going to have to use one lock for everything”. Maybe there is room for doing this stuff partially at run-time, via tagging, but that’s messy as well.

There actually may be some research potential in this for me down the road. STM is new and on its baby legs right now. Sorting out a good implementation for STM would have consequences for other issues in functional programming, predominantly with aliasing. And who knows, maybe one day extremely high-level languages could find a place in high-performance computing.