OS X


I just fixed a rather stupid bug that had been bugging me for hours. This is rather Cocoa and Objective C specific, so bear with me.

My application has an NSTableView—a widget to display a table of data. An NSTableView doesn’t store its own data; rather, it has a delegate object called a data source that supplies data to it. In order for the NSTableView to get its data in a well defined manner, it requires that its data source implement two methods: numberOfRowsInTableView: and tableView:objectValueForTableColumn:row:. These two methods return the number of rows in the table and the value at a particular column/row, respectively. Side note: why does the NSTableView need to query for the number of rows, but not the number of columns? Because it’s extremely poorly designed, that’s why.

Anyway, normally the way one would require that an object implement some method is via a “protocol.” A protocol in Objective C is what Java calls an interface, basically just some static guarantee that an object implements something. Using a protocol allows fun stuff, like the ability to verify at compile-time that an object will implement some method.

However, for reasons that are not clear to me, NeXT/Apple did not make the data source into a protocol. They made it what they call an “informal protocol,” which basically means that things will be checked at run-time, not at compile-time or link-time. Another side note: being C based, you might expect that Objective C would be a static language. Quite the opposite. It’s very common to, for example, add methods to an object or class at run-time.

Anyway, I got my program all working perfectly. Until I tried to build a “release” version of it. I narrowed it down to: if I compile it with -O0, it works perfectly; if I compile it with -O1 or higher, suddenly NSTableView doesn’t think that my data source responds to the appropriate methods. Since NSTableView doesn’t trust my object, it doesn’t display any data.

After an hour or so of trying to narrow down the bug, I finally found and fixed it. It cropped up out of my stupidity more than anything else—of course—but some stronger typing would have helped things out a little more.

The object model of Objective C, as I said before, is entirely dynamic, modelled on Smalltalk. A side-effect of this is that the object model of Objective C is not statically typed. I’m lying a little bit, but only a little bit. So the “C” in Objective C is statically weakly typed, and the “Object” part in Objective C is dynamically weakly typed.

The problem came when I wrote the “init” method for my data source. In NeXT/Apple’s Objective C framework, creation of an object happens in two stages. The “alloc” and friends methods deal with memory allocation, and the “init” and friends methods deal with object initialization. This is kind of handy because it allows memory allocation and object initialization to be independent and orthagonal. You can do stuff like: [[Foo allocWithZone: [self zone]] initWithSize: 5], using a combination of allocator and initializer that the class creator might not have thought of.

Anyway, because I’m an idiot, for some reason when I was writing my class’ initializer, I thought that “init” didn’t have a return value. So I went on my merry way and wrote:

- (void)init
{
  [super init];
  size = 0;
  /* other initializations ... */
}

That (void) you see before “init” is the return value and comes into play only now and then and even then only sort of. Its intention is to deal with C being statically typed. Another diversion: in Objective C, every message of the same name has the same type signature. So init in one method has the same type signature as init in another. Thus, any code actually calling my init method is none the wiser that I’ve erroneously assumed it has no return value.

So calling code calls init and looks in register eax (on Intel, or r3 on PowerPC) and assumes that that’s the return value. By coincidence, it happened to work out perfectly for -O0 but not for -O1 or higher. So, after changing the method to:

- init
{
  self = [super init]; /* yes, I know this is weird. it works in Objective C */
  if (self) {
    size = 0;
    /* ... other initializations */
  }
  return self;
}

Everything worked perfectly. If only there had been some sort of type error to clue me in!

I suppose it’s further proof that Haskell has corrupted me forever. For my hobby projects, I’m typically doing Objective C Cocoa these days. Usually when I find myself working in non-Haskell languages, I get frustrated at the lack of laziness. Laziness is a wonderful thing. Treating a computation as an infinite data structure just makes your life a lot easier in a number of ways, most notably being able to reuse components that were designed to work on finite data structures. From hereon, every time I say “data structure”, pretend I mean “collection” or whatever they’re called these days: abstract data types that are meant to hold a bunch of things.

The more I work, though, the more I’ve noticed that I’ve fallen into a strange design pattern: I practically never pass around actual data structures (unless they’re mutable and I’m adding something to them, for example); I practically always pass around an enumerator—NSEnumerator in Cocoa.

It’s really quite wonderful. The most immediate properties of this are:

  1. I don’t have to waste time wondering if I’m operating on an NSSet or NSArray or whatever, or if they have a common superclass;
  2. I’d have to create an NSEnumerator eventually anyway to do anything useful, so I’m not adding any new overhead;
  3. the big one, if you haven’t guessed already: it’s trivial to introduce infinite data structures, i.e., data structures which aren’t “data” at all, but rather code.

I whipped up what I call CallbackEnumerator and its friend, the Enumeratable protocol (protocols are what Java calls interfaces. The world of OO wouldn’t be complete without 3 contradictory nomenclatures). The Enumeratable protocol contains just the -enumerateNext message (and variants to allow for arguments), basically as a means to pretend that a certain object is actually a closure. The CallbackEnumerator class does nothing but transform NSEnumerator messages into Enumeratable messages.

So far it hasn’t been terribly useful, as I’m only using it in one spot in my program, in my tokenizer, the idea being that you create a tokenizer with a string and, as if by magic, it instantly gives you an NSEnumerator with all the tokens. Arguably I could have just made a special TokenizerEnumerator and skipped the whole CallbackEnumerator nonsense, but I like the potential for reuse.

Where the enumerator fails is providing structure. The enumerator does a wonderful job at simulating a lazy list. But what about the lazy tree? It seems rather pointless to create an enumerator-a-like for each abstract data type.

In general, I think the role of the enumerator in object-oriented languages like Objective C is backwards. What would make more sense is, for each abstract data type, have an “enumerated” subtype. For example, for NSArray, there could be an NSEnumeratedArray, such that NSEnumeratedArray is a subclass of NSArray. NSEnumeratedArray operates exactly like NSArray, and has the same methods, but rather than containing a giant chunk of memory from which to return elements, it executes code. Similarly for NSDictionary or any other abstract data type.

Consider then the -objectEnumerator method that NSSet implements. What this is really saying to me, semantically, is a transformation of the elements of NSSet, so that they are no longer represented as a set, but rather are represented as a lazy list/array. So, replace that method with -enumeratedArray and you have the same thing. I should mention that, in Objective C, there is a sharp divide between immutable objects and mutable objects. NSArray is necessarily immutable, and thus it’s easy to say that an NSEnumeratableArray would respond to the same messages as an NSArray. As to what an NSMutableEnumeratableArray would look like, that’s another matter.

The benefit is that you are no longer fudging two ideas together. An enumerator is two things: it is a way of representing laziness; and it is a flat, ordered view of elements. There can be a separation between the two. So you can have your NSEnumeratedArray, which is exactly as an NSEnumerator is now: it provides you a way to get the elements out of a list, one at a time, sequentially. Or you could have an NSEnumeratedDictionary or NSEnumeratedTree or what have you.

I suppose it’s somewhat telling that I had to invent types to prove the point. Unfortunately, AppKit—the “foundations” part of Cocoa that provides abstract data types—doesn’t offer much beyond arrays, sets, and dictionaries, so maybe there isn’t much of a need. Still, it’s a bit telling that while all of these classes have initializers like -initWithObjects: or -initWithSet:, there is no -initWithEnumerator: to be found. The concept of chaining lazy data types together doesn’t really exist in the API, which is unfortunate.