Googling around for memoization and Haskell brings up many posts about how easy it is in theory, with some example code on how to do it in a specific instance. However, I couldn’t find a really general solution to the problem.

Template Haskell to the rescue. Today I coded up a macro to do this in a fairly general way. This way you can write code in the most natural way possible:

$(memoize [d|   fib 0 = 1
                fib 1 = 1
                fib n = fib (n - 2) + fib (n - 1) |])

And the macro transforms it into memoizing code that will store previous results in a linked list of arrays. The linked list was required because arrays in Haskell are of finite size, whereas I wanted arrays of potentially infinite size.

The good news is that it works very well. Without memoization, evaluating a recursive fibonacci really strains once you hit an n of about 30. With memoization, my 1.6GHz G5 can do an n of 20000 in one second—for comparison, a natural iterative C version will do about 20000000 in one second, a thousand times as much, though to be fair the C version is not using bignums. Also, the code is general enough that it works over any parameters which are of the Enum and Ix classes—basically any data type which provides a function to map it to ranges of integers. Mutually recursive memoized functions work fine: once a function has been memoized, it behaves and can be called exactly like it was in its original representation.

The bad news is it’s a huge memory leak. The memoizing arrays are always reachable, so they never get garbage collected. In the case of fibonacci, once I tried an n of about 50000 or so on my machine it was game over: memory usage was such that it was hitting swap, and there was no hope of any memory ever being reclaimed. The other bad news is that I’ve only written it to memoize the first parameter. This limits its utility in many dynamic programming problems which operate over 2 parameters.

Once I clean it up a bit I think I’ll post it on the old webpage. I’ll add multiple parameter support, and maybe flip the structure of the memoizing array list so the most recently calculated values are closest in memory: hopefully once things go to swap they stay there, even if they never get reclaimed.

There may be some hope for allowing values to be forgotten about and reclaimed by the garbage collector. I’ll have to do some reading to see what other people have done here. Unfortunately a purely functional language like Haskell relies heavily on not forgetting things so that it can reuse effort—convincing it’s semantics that you want to forget about something and then recalculate it later is ironically quite difficult in this case.

Overall it’s an okay tool, I think. The state of things right now is that, in the world of Haskell, dynamic programming Sucks. This should help things out.

Advertisements