Skip to main content

ML Modules in C#

I've written about the limitations of C#'s equational constraints before. Truth is, I now believe that any such limits can be circumvented by a relatively simple translation. The result is less "object-oriented", as it requires a set of cooperating objects instead of being encapsulated in a single object.

Let's take a simple, unsafe list flattening operation described in Generalized Algebraic Data Types and Object-Oriented Programming (GADTOOP). This can be expressed in OCaml as:
let List = struct 
type 'a t = Nil | Cons of 'a * 'a t
let append l a = Cons(a, l)
let flatten la =
Cons(a, l) -> append a (flatten l)
| Nil -> Nil
end

The argument to flatten, la, is a list of lists of type 'a. However, there is no way to express this in C# without unrestricted equational constraints as I described earlier. Here is the translation to C# from GADTOOP:
public abstract class List<T> {...
public abstract List<T> Append(List<T> that);
public abstract List<U> Flatten<U>();
}
public class Nil<A> : List<A> {...
public override List<U> Flatten<U>()
{ return new Nil<U>(); }
}
public class Cons<A> : List<A> {...
A head; List<A> tail;
public override List<U> Flatten<U>()
{ Cons<List<U>> This = (Cons<List<U>>) (object) this;
return This.head.Append(This.tail.Flatten<U>()); }
}

Note how Flatten requires an unsafe, ugly cast to and from object to get around C#'s type limitations. However, there is a safe way. Here is the translation:
static class List
{
public abstract class t<A> {}
public sealed class Nil<A> : t<A> {}
public sealed class Cons<A> : t<A> {
internal A head; internal t<A> tail;
}
public static t<A> Append(t<A> l, A a)
{ return new Cons<A>(a, l); }
public static t<A> Flatten<A>(t<List<A>> l)
{ return l.head is Nil ? new Nil<A>() : Append<A>(Flatten<A>(l.tail), l.head); }
}

Basically, the step-wise translation I propose:
  1. move the generic class I into a nested generic class O.I
    class I<T> {}
    becomes
    class O {
    class I<T> {}
    }

  2. move the instance methods of that generic class into generic methods of the enclosing class, I.m -> O.m
    class I<T> {
    I<T> m<U>(U u) { ... }
    }
    becomes
    class O {
    class I<T> {}
    I<T> m<T, U>(I<T> i, U u) { ... }
    }

  3. mark all relevant state of I "internal", so the outer class O can access it.
    class I<T> {
    private T value;
    public T get() { return value; }
    }
    becomes
    class O {
    class I<T> { internal T value; }
    public T get(I<T> i) { return i.value; }
    }

  4. now you can express arbitrary constraints on the type structure in the methods of O.
    class O {
    class I<T> {}
    I<T> m2<T, U>(I<I<T>> i, I<I<U>> u) { ... }
    ...
    }

Because the external methods do not depend on an implicit 'this' argument, we no longer need equational constraint refinements to express the required type structure. Ordinary C# types suffice!

If you squint a little more closely, you'll also note a much closer symmetry in this version of the code and the OCaml code. In fact, this translation essentially builds ML modules in C#!

Like all widely-used module systems, it's pretty clear that these modules are "second-class". Only rebinding the name is possible, via C#'s "using" directive, and everything is resolved and fixed at compile-time. By this, I mean that you can import different list implementations and it will compile cleanly as long as the other list implementation defines the same operations with the same type signatures:
using L = List;
...
L.Flatten(l);
...
Or:
using L = AnotherList;
...
L.Flatten(l);
...

The compile-time restriction is primarily due to the declaration as a static class. Making the class non-static permits runtime binding of concrete implementations to signatures, so it's a little more flexible, and just as safe. Loosening this restriction may also make runtime "functors" possible.

I've used this pattern to complete Orc.NET, because the 'where' combinator required an inexpressible dependency. You can see my use in the Orc interpreter. The "Orc" object in Interp.cs is essentially an "expression builder", and I suspect that all such "builder" implementations are really ML modules at their core.

An open question is the interaction of inheritance with such modules. Seems like inheritance is a particular type of functor from structure S to S.

In any case, if you need type constraints which are inexpressible in C#, then make them ML modules using the above translation, and add object-orientedness back in incrementally. On a final note, I find it amusing that OO languages must resort to functional techniques to resolve fundamental OO limitations. I'd much prefer if we could just use functional languages instead and forgo all the hassle. ;-)

Comments

Edward Kmett said…
F# uses a similar translation.

http://web.mit.edu/fsharp_v1.1.12/FSharp-1.1.12.3/manual/export-interop.html
Sandro Magi said…
Thanks for the pointer. I looked at some F# a little while ago, so perhaps some of what I read percolated unconsciously into this translation.

Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre

Building a Query DSL in C#

I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would filter out the results where either Bar is less than or equal to 99, or Foo is not "some string". This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a limited filtering API : It does not support disjunctions, ie. "OR" clauses. It does not support filtering using inequalities on more than one property. It does not support a not-equal operation. So in this post, I will describe the design which achieves the following goals: A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR"). Implemen

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD) , and decided to take a crack at distilling its essence in C#. Note that automatic differentiation (AD) is different than numerical differentiation . Math.NET already provides excellent support for numerical differentiation . C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries . Neither of these are suitable for learning how AD works. So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique. What is Automatic Differentiation? Simply put, automatic differentiation is a technique for calcu