Wednesday, January 16, 2008

Towards the best collection API... in C#. And some partial applications too.

The venerable Oleg Kiselyov once posted about the "best" collection traversal API. Let's call this ideal iterator a "SuperFold". LTU also covered his article. Essentially, a SuperFold is a left fold with early termination support. Any cursor can then be automatically derived from the SuperFold. The converse is not true.

Additional arguments are made in the above paper and in the LTU thread, so I won't repeat them here. Without further ado, I present the SuperFold for my purely functional list in C#:
//OCaml signature: ((T → B → bool * B) → B → B) → (T → B → bool * B) → B → B
B SuperFold<B>(
Fun<Fun<T, B, Pair<bool, B>>, B, B> self,
Fun<T, B, Pair<bool, B>> proc,
B seed)
bool cont;
proc(head, seed).Bind(out cont, out seed);
return cont ? self(proc, seed) : seed;

While quite simple, it's not as efficient as it should be since C#/.NET doesn't support proper tail calls. You can see in that source file that I derived FoldLeftDerived from SuperFold. Deriving FoldRight is a bit trickier, so I have to think about it. The simple, inefficient, answer is to simply reverse the list.

I've also enhanced the FP# library with:
  1. A number of tuple types (up to 10),
  2. Tuple inference,just call: Tuple._(a,b,c,...)
  3. streamed versions of Map/Filter/FoldRight over IEnumerable which don't build intermediate lists,
  4. FoldLeft over IEnumerable including support for early termination,
  5. FoldLeft over my purely functional list including support for early termination,
  6. Option type now looks more like System.Nullable, with an overload for the | operator to choose between an empty option, or a default value [1],
  7. Partial application for almost all defined function types
I'll probably remove SuperFold from my purely functional list, since I don't think it will end up being useful in C#. The iterated FoldLeft I defined is more efficient due to the lack of tail recursion, and the iterated version also supports early termination.

[1] It's very frustrating to see how close MS gets to a truly general and useful abstraction, only to lock it down for no apparent reason. What good reason is there for Nullable to be restricted to struct types? If you give it more than a second's thought, I think you'll realize that there is no good reason. Nullable is the option type if it weren't for this restriction!


leppie said...

As C# does not support tail calls (the CLR does however support them just fine), you should not use a recursive construct. Rewriting the recursion to use iteration is fairly simply in the case, and should be employed.

Sandro Magi said...

1. The CLR unfortunately does not support tail calls just fine, it supports them poorly, despite work demonstrating that tail calls are not in conflict with stack inspection.

2. I think there is no iterative equivalent of SuperFold, due to the mutual recursion between self and proc.