Progress on the Ca compiler is going well. The lambda lifter is done. I really couldn’t be much more pleased about how things are going.

I decided now would be a good time to clean up and document the code. I’m a big fan of literate programming and so have been investigating literate programming tools. No matter how long your multi-line comments are, I find documentation inside code to be completely useless for getting the “big picture” of a program across. Literate programming seems to be a good solution.

The idea behind literate programming is that you have one source file, which is both a typeset document source file and a programming language source file. For example, with CWEB, by Donald Knuth and Silvio Levy, you write a foo.w file, from which cweave produces a foo.tex file and ctangle produces a foo.c file.

My experience with literate programming up until this weekend has been a bit of a lie. It was with Literate Haskell, which Wikipedia—I now realize correctly—says is semi-literate, not literate. In Literate Haskell, basically you interleave LaTeX code and Haskell code, and the Haskell compiler is clever enough to only compile the Haskell code. It’s cute and it worked well enough to do my thesis, but it’s not really literate programming.

If you’ve never sat down and played with Knuth’s WEB or CWEB, I would highly recommend you do so. Knuth’s vision of literate programming is not just to interleave documentation and code but, to borrow the names of Knuth’s own tools, to weave and tangle them together. Documentation and code are not interleaved; they’re almost indistinguishable. CWEB gives you the freedom to refactor and reorder your code in ways that make it look totally unlike anything intelligible to a C compiler, to make it read like literature (quite a feat for C), and still gives you a valid program at the end.

In a sense it’s not really fair to compare Literate Haskell with CWEB, as they’re working in different domains. For example, one of the nice things about CWEB is that it lets you reorder code. This doesn’t exist in the world of Haskell, since Haskell never really had any substantial restrictions on the reordering of code in the first place.

Anyway, my Ca compiler cac is currently made out of two languages: Haskell and C. Eventually I’ll have to find a way to deal with its grammar as well, but I’ll cross that bridge later. Literate Haskell works well enough for the Haskell side of things. It’s quite inoffensive; you can structure your LaTeX code however you like.

For the C side of things, I’ve spent most of today installing tools, reading their documentation, playing with them, trying to get them to work. I played with noweb and nuweb, but neither seemed too impressive. Once I installed CWEB, things improved a bit.

CWEB really is just a flat-out brilliantly written piece of software. It does exactly what you want it to do, and a whole lot more, all simply, logically, and robustly. It weaves together documentation and code beautifully. There is only one problem with it: it produces Plain TeX code. I don’t know Plain TeX.

No problem; there is a package out there called latex-CWEB, which is a LaTeX class that will render CWEB documentation. Note, I said class. So far as I can tell, this means any document you create with latex-CWEB documentation cannot be an article; it cannot be a report; it cannot be a book; it cannot be any type of document other than a “cweb” document. This is a bit restrictive if you consider that you might want a document which contains things other than CWEB documentation. Such as Literate Haskell documentation.

Since, as I said before, Literate Haskell is so inoffensive, latex-CWEB will probably be workable, and it might turn out to be the best option.

The other competing option is to try and learn Plain TeX. I’d never considered it before; I was one of those “why use TeX when we have LaTeX?” kind of people. But my opinion is changing a bit after seeing how CWEB renders my humble C code: it’s gorgeous.

One final note before I head off to bed. Throughout today it has become frustratingly clear to me just how marginalized literate programming is. I did some Google searches for a “literate IDE”. Needless to say, I didn’t find anything. It really is a shame. Literate programming is a fantastic idea, but all we have now is a smattering of simple, mostly one-off tools, almost none of them maintained since the mid-1990s, all incompatible, all structuring things in different ways. The state of the art right now is that it really is a big hassle to write literate code unless you either resign yourself to second-rate documentation or else commit yourself to exactly one language with exactly one tool.
