I’m migrating my research blogging over to my new domain and consequently this blog won’t be used any more. It’s served me well and I’ll leave it be for posterity.


A few weeks ago I finally got IPv6 working on my router via 6to4. Of course this has no real use beyond just playing around, but it’s still neat that now my computer is globally addressable. There are a few IPv6 hosts around, too; today I took a look at the output of netstat on my machine on totally unrelated business and noticed that I was connected over IPv6 to some web page I was browsing at the time. After I got 6to4 hooked up, incidentally, I emailed my ISP to see if they had any IPv6 plans. Much to my delight, they replied saying they already have their IPv6 address block allocated and will be rolling it out to customers at some point in the near future. No timeline, of course, but still very very cool.

While teaching my networking course this term, I took a lecture to talk about SCTP. It’s a beautifully designed transport layer, designed to replace—maybe not entirely, but mostly—TCP and UDP. Like IPv6 it has the problem that it was designed after “mostly good enough solutions” had already been established. Unlike IPv6 it doesn’t need any help from ISP or backbone infrastructure, but it has in some ways a much bigger problem: it needs application-level protocols to use it if it’s to gain any support.

I felt a little guilty teaching SCTP when I’d never actually played with it seriously myself, so I gave it a try today. Unfortunately most consumer operating systems don’t support SCTP: OS X requires a kernel extension.

One of the neatest things about SCTP is that a single SCTP connection—”association” in SCTP lingo—can be multiplexed into any number of streams and these streams really do operate in parallel. There’s no head-of-line contention like in TCP where the whole connection gets plugged up if a packet arrives out of order. In SCTP, if one stream gets plugged up, all the other streams are entirely unaffected. This also has the benefit, of course, that you can have a multiple connections without the overhead of creating multiple connections; research evidently like to see how this will apply to HTTP. For me, I would have guessed the nicest application of SCTP would be SSH tunnelling and port forwarding and whatnot and I was a little disappointed to see no one’s taken a look at SSH over SCTP in a serious way.

One of the more troublesome aspects of SCTP is that the number of streams your “association” contains is negotiated at initialization time; by my reading of the RFC, there’s no way to add or remove streams dynamically.

So this morning I set up a little experiment to see how SCTP—or Michael Tuexen’s kernel extension for OS X anyway—would fare with absurd numbers of streams. I figured if you could create an association with a tonne of streams without any noticeable performance degradation, the limitation of having to specify the number of streams up front wouldn’t be so bad.

The bad news is that even though the RFC does give you theoretically 65000 streams, I started getting “Invalid argument” errors as soon as I started getting up to about 2000 streams. The good news is there was no change at all in performance or server/client behaviour whether I had/used 1 stream or 2000 streams. I suspect there are some possibly large memory structures built inside the kernel when you establish an association with 2000 streams, but that’s hard to see from userspace. If I get time I’ll poke through the source code to see how things are structured.

The other bad news is that SCTP is slow. Sending data as fast as I could to ::1 (my local machine, not over the network) was about a quarter as fast as TCP. I don’t think there’s much about SCTP that would require this slowness; it’s almost certainly due just to the immaturity of the implementation. This implementation, if I remember right, was taken from FreeBSD. I’ve heard stories that Linux’ SCTP implementation is the worst of all, much slower even that FreeBSD’s, which is sad, but maybe things have improved recently.

There’s another big advantage of SCTP—multi-homing—which I haven’t had time to play with. With multi-homing I could, theoretically, establish an association/connection over wired Ethernet, then transition to WiFi, with the connection totally unbroken and, what’s more, the application-layer totally unaware that anything had happened. I’m thinking with a bit of work one could even migrate SSH sessions from one host to another. It could be cool beans.

I’m sure there’s some irony in me putting effort into playing around with these networking protocols just so I can talk to myself. Now I know how Esperanto speakers must feel. It’s fun, though.

When I get the opportunity to do research these days, it’s mostly focused on trying to formalize the abstract syntax, typing and operational semantics from the programming language’s point-of-view, not the logic’s point-of-view, once and for all. Once of the major problems is that the typing for Pola is complicated, much more complicated than typing systems I’ve done in the past. On the one hand this makes me very pleased with myself that I was able to get type inference going as well as I have, but on the other hand it’s a huge obstacle to getting other people interested in what we’re doing.

I apologize for the sloppy prose, but this is from a rough draft of notes I’m keeping for myself. The notation used in this typing rule is actually much simpler than what I’d had previously.

It’s horrendously confusing and yet, through many revisions, it’s the simplest and most straight-forward I’ve been able to come up with to describe what’s happening at the type level. Sometimes I think it would be easier to just show people the code instead of the sequents, ha!

Anyway, I shall keep at trying to simplify this stuff. It’s further emboldened the motivation I have for prettying of the “face” of this language for mortal users.

The paper submitted to Typed Lambda Calculi and Applications has been rejected. Again, all of the commentary was spot on. Our notation was not introduced well; the notation we used was odd; things were not explained properly. There is a consistent problem in our write-ups of trying to fit too much into a paper. We do have a lot to present: the syntax, the logic of safety, the type checking, the type inference, the operational semantics, the soundness proof, the completeness proof, the bounding argument, the relationship to previous works. It would be flat-out impossible to fit even a quarter of these into one conference paper, I would wager, but nonetheless it shouldn’t be too much trouble to write up a proper exposition that we can properly explain to people what we’re up to.

Two of our referees gave us a weak accept and honestly I felt bad for them since both of them obviously they put a lot of effort into it. Both of them agreed something worthwhile was being explained, but it was being explained too poorly.

In the meanwhile, I’m making slow and steady progress. Things will pick up once the term is over and I can forget about teaching for a while. I’m determined this won’t happen again.

I always feel bad about not updating here, but I never have much to say. Or, perhaps equivalently, I have too much to say. The nature of Pola, the polynomial-time programming language I’m working on, is constantly changing and I wouldn’t really want to explain all the details here anyway. Suffice it to say that the spirit of the language remains essentially the same and the details of the language change day-to-day, a slight exaggeration.

I’m newly committed to better balancing my research and my teaching. Teaching had taken over my time completely this term and research had fallen by the way-side, which is really a shame. I’ve changed that this week, though, and have got a lot of good work done.

I gave a short talk about my research at the Western Research Forum, an annual mini-conference here at UWO for graduate students of all disciplines. I’m pleased to announce that I won second place for the natural sciences session. First place went to someone dealing with meteorites: it was also my choice for first place, so I have no problems losing to that. After my presentation I got a lot of good question which, if nothing else, shows I could keep people’s interest. The questions didn’t come exclusively from computer scientists, either, but from other physical scientists curious about how this might relate to all the FORTRAN code they have to work through. It was good to see.

The highlight of the conference was seeing Jorge Cham, author of Piled Higher and Deeper. I very highly recommend going to see him talk if you get the chance. It’s a funny talk, of course, but carries a lot of insight with it, too. I’ve uploaded one picture from the talk which made a big impression on me.

Talking with people about my research, I always go back and forth between my motivations and where I want to focus my research. When talking with people in theory or in programming languages, they seem to be most interested in proving theoretical properties of the language and making sure that’s all very solid. When talking with everyone else, the prevailing opinion is “yes, but no one would ever use it”, reinforcing that the “face” of the language, the syntax and the typing system, is most important. I need to get a good balance between the two.

The paper we submitted to PLDI (Programming Language Design and Implementation), sadly, was not accepted. It’s not a huge surprise: our paper wasn’t a perfect fit for PLDI; further, our work was in a mild state of flux and the paper reflected that. We’ve submitted a paper to TLCA (Typed Lambda Calculi and Applications) and are anxious to get feedback there.

In just over two hours, I’ll be on a plane to Calgary to do more research there. I’m going to be doing some work at the airport and on the plane trying to get the implementation into a good state so we can start doing research right away. Particularly I want to seriously talk about inferring bounds.

Naïve bounds inference is theoretically done, though not implemented yet. Naïve bounds inference is too, well, naïve, however. We need to tighten up the bounds a lot for it to be practical. I’m hoping we can get some time to discuss how bounds inference affects the memory model as well.

So, what we have now is: a functional language allowing inductive and coinductive types—and now even mutually recursive types!—such that every well-typed program halts in polynomial time; an implementation, complete with type inference; inference of time and space bounds.

At the end of this week things should be in really good shape! I’ll post an update, as I’ve been neglecting this blog for too long.

This image is stolen from a reddit link:

It shows ideas in computer science moving from academic research to industrial research to product to successful product. Speech recognition I find pretty amusing.

Being me, I’m disappointed no programming language concepts made it to the chart, not even objects! I can’t think off the top of my head what the first commercial object-oriented development system was: I suppose in those days there wasn’t a clear divide between selling hardware and selling software.

Anyway this is relevant to me since, when contemplating my future after my Ph.D., I often fantasize about bringing my own research to market. The first major fear I have, of course, is that no one would want to spend much money on a compiler that guarantees resource bounds; I hope I’m wrong there. The second fear, though, is that it would take far longer than I could ever expect to turn a proof-of-concept product into a marketable product. This chart confirms that, historically, there is a gap of about five years at least to turn a research idea into a consumer idea.

The chart also shows the importance of industrial research and development, the blue lines. With some notable exceptions, like Microsoft and IBM, I fear industrial research has been getting squeezed out in the past few years, which is a shame.

We need more blue!