Sunday, September 25, 2011

Idioms in C# with LINQ

There's a great post on implementing idioms (applicative functors) with LINQ, whose example application is implementing formlets, as WebSharper does for F#. Tomas's post is well written, so if you're unclear on the above concepts I recommend reading it before proceeding with this article.

The claim in that post is that idioms can only be encoded via LINQ's 'join' operators. That is strictly true if you stick to all the LINQ rules, but because LINQ queries are just naive syntactic transforms, you don't have to follow the rules. You can exploit this by hijacking the signatures of the SelectMany overloads to yield idiom signatures. It's not all sunshine and roses though, as there are consequences.

Overview

LINQ is a standard set of methods one can implement that the C# compiler can use to provide "query patterns". This query:

var foo = from x in SomeFoo
          from y in x.Values
          select y;

is translated by the C# compiler to:

var foo = SomeFoo.SelectMany(x => x.Values, (x, y) => y);

This is a purely syntactic transformation: the C# compiler simply takes the query text above and naively translates each 'from', 'where', 'select', etc. into calls to the instance or extension methods SelectMany, Where, Select, etc. Type inference must then be able to infer the types used in your query, and everything must type check.
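To see that the two forms really are interchangeable, here is a small sketch against ordinary LINQ to Objects. The Foo class, its Values property, and the sample data are hypothetical stand-ins for the SomeFoo used above, not anything from the original post.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type: anything exposing a sequence called Values.
public sealed class Foo
{
    public List<int> Values { get; } = new List<int>();
}

public static class TranslationDemo
{
    // Returns true when the query expression and the hand-written
    // SelectMany call produce the same sequence.
    public static bool FormsAgree()
    {
        var someFoo = new List<Foo>
        {
            new Foo { Values = { 1, 2 } },
            new Foo { Values = { 3 } }
        };

        // Query syntax, as written in the article.
        var viaQuery = from x in someFoo
                       from y in x.Values
                       select y;

        // The method calls the compiler rewrites that query into.
        var viaMethods = someFoo.SelectMany(x => x.Values, (x, y) => y);

        return viaQuery.SequenceEqual(viaMethods);
    }
}
```

Nothing here is specific to IEnumerable<T>: the compiler only needs some method named SelectMany whose signature makes the rewritten call type check.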

The fact that we're dealing with a purely syntactic transform means that we can be sneaky and alter the signatures of these LINQ functions, and the C# compiler would be none the wiser. The resulting calls to the LINQ methods would still need to compile, but we can ensure that they only compile when they follow the rules we want, in this case the rules of idioms.

LINQ Methods

The core LINQ methods are as follows, using Formlet<T> as the LINQ type:

Formlet<R> Select<T, R>(this Formlet<T> f, Func<T, R> selector);
Formlet<R> SelectMany<T, R>(this Formlet<T> f,
                            Func<T, Formlet<R>> collector);
Formlet<R> SelectMany<T, U, R>(this Formlet<T> f,
                               Func<T, Formlet<U>> collector,
                               Func<T, U, R> selector);

The problematic methods for idioms are the two SelectMany calls, specifically the parameter I've called 'collector'. You can see that the LINQ type is unwrapped and the value extracted on each SelectMany, and passed to the rest of the query. Accessing the previous values like this is forbidden in idioms.
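A minimal sketch of what that unwrapping looks like in practice, assuming a deliberately simplified Formlet<T> that is nothing more than a function from submitted form values to a result. The representation, the Input helper, and the field names are all hypothetical; real formlet libraries carry much more state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a formlet: a function that reads submitted
// form values and produces a T.
public sealed class Formlet<T>
{
    private readonly Func<IReadOnlyDictionary<string, string>, T> collect;

    public Formlet(Func<IReadOnlyDictionary<string, string>, T> collect)
    {
        this.collect = collect;
    }

    public T Run(IReadOnlyDictionary<string, string> submitted)
    {
        return collect(submitted);
    }
}

public static class MonadicFormlet
{
    // Hypothetical helper: a formlet that reads one named field.
    public static Formlet<string> Input(string field)
    {
        return new Formlet<string>(env => env[field]);
    }

    // Standard (monadic) shape: 'collector' receives the unwrapped T, so
    // the next formlet cannot be built until the previous value is known.
    public static Formlet<R> SelectMany<T, U, R>(
        this Formlet<T> f,
        Func<T, Formlet<U>> collector,
        Func<T, U, R> selector)
    {
        return new Formlet<R>(env =>
        {
            T value = f.Run(env);               // unwrap the previous value
            Formlet<U> next = collector(value); // later clause depends on it
            return selector(value, next.Run(env));
        });
    }
}

public static class MonadicDemo
{
    public static string Run()
    {
        // 'kind' is a plain string here, so the second clause may inspect
        // it. This is exactly the dependency that idioms forbid.
        var form = from kind in MonadicFormlet.Input("kind")
                   from detail in MonadicFormlet.Input(kind + "-detail")
                   select kind + ": " + detail;

        return form.Run(new Dictionary<string, string>
        {
            { "kind", "email" },
            { "email-detail", "a@example.com" }
        });
    }
}
```

Because the second field's name is computed from the first field's value, the set of fields cannot be known before the form is submitted.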

Fortunately, the SelectMany overloads don't need these exact signatures; they only need a similar structure. You must have two SelectMany overloads with one and two delegate parameters, and the first delegate parameter must return your LINQ type, in this case Formlet<T>, as this allows you to chain query clauses one after another. You can also modify the second delegate parameter in various ways, but I haven't found much use for that myself.

To implement idioms, we will simply alter the first delegate parameter so instead of unwrapping the value encapsulated by the Formlet<T>, we simply pass the Formlet<T> itself:

Formlet<R> Select<T, R>(this Formlet<T> f, Func<T, R> selector);
Formlet<R> SelectMany<T, R>(this Formlet<T> f,
                            Func<Formlet<T>, Formlet<R>> collector);
Formlet<R> SelectMany<T, U, R>(this Formlet<T> f,
                               Func<Formlet<T>, Formlet<U>> collector,
                               Func<T, U, R> selector);

Our query above:

var foo = from x in SomeFoo
          from y in x.Values
          select y;

would then no longer compile, because 'x' is no longer a Foo but a Formlet<Foo>, and the formlet type does not have a "Values" property. Of course, you shouldn't provide a property to extract the encapsulated value, or this is all for naught.
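As a rough sketch of how the hijacked signatures might be implemented, assume a toy Formlet<T> that pairs a statically known list of field names with a function that collects the result. The representation, the Fields property, and the Input helper are hypothetical illustrations rather than any real library's API. Only the two-delegate SelectMany overload is shown, since that is the one this example query exercises.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy formlet: the fields it renders plus a result collector.
public sealed class Formlet<T>
{
    private readonly Func<IReadOnlyDictionary<string, string>, T> collect;

    public Formlet(IEnumerable<string> fields,
                   Func<IReadOnlyDictionary<string, string>, T> collect)
    {
        Fields = fields.ToList();
        this.collect = collect;
    }

    // Known up front, before any input has been submitted.
    public IReadOnlyList<string> Fields { get; }

    public T Run(IReadOnlyDictionary<string, string> submitted)
    {
        return collect(submitted);
    }
}

public static class IdiomFormlet
{
    public static Formlet<string> Input(string field)
    {
        return new Formlet<string>(new[] { field }, env => env[field]);
    }

    public static Formlet<R> Select<T, R>(this Formlet<T> f, Func<T, R> selector)
    {
        return new Formlet<R>(f.Fields, env => selector(f.Run(env)));
    }

    // Hijacked shape: 'collector' sees only the Formlet<T>, never a T, so
    // the next formlet is built immediately and cannot depend on input.
    public static Formlet<R> SelectMany<T, U, R>(
        this Formlet<T> f,
        Func<Formlet<T>, Formlet<U>> collector,
        Func<T, U, R> selector)
    {
        Formlet<U> next = collector(f);
        return new Formlet<R>(
            f.Fields.Concat(next.Fields),
            env => selector(f.Run(env), next.Run(env)));
    }
}

public static class IdiomDemo
{
    public static Formlet<string> Build()
    {
        // Clauses are independent: 'name' is not usable as a string in the
        // second 'from', only in the final 'select'.
        return from name in IdiomFormlet.Input("name")
               from age in IdiomFormlet.Input("age")
               select name + " is " + age;
    }
}
```

With this shape the combined formlet knows its full field list ("name", then "age") before anything is submitted, which is the static structure that idioms buy you.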

The Downsides

Simple queries work great, but longer queries may run into some problems if you alter the LINQ signatures. In this case, if you try to access previous values by mistake, as in our example query above, you will get a complicated error message:

Error 1       Could not find an implementation of the query pattern for source type
 'Formlet.Formlet<AnonymousType#1>'. '<>h__TransparentIdentifier0' not found.

Basically, your incorrect program was naively translated to use the LINQ methods, but because it does not match the type signatures you've hijacked, type inference fails. So users can't break your idioms when you hijack the query pattern this way, but depending on your target audience, the error messages may render your library unusable.
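The transparent identifier in that message comes from how queries with three or more clauses are rewritten. The sketch below writes the rewrite out by hand for a query that does follow the idiom rules, using the same kind of hypothetical toy Formlet<T> and Input helper as an assumption; the lambda parameter 't' plays the role of the compiler-generated transparent identifier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy formlet: just a result collector.
public sealed class Formlet<T>
{
    private readonly Func<IReadOnlyDictionary<string, string>, T> collect;

    public Formlet(Func<IReadOnlyDictionary<string, string>, T> collect)
    {
        this.collect = collect;
    }

    public T Run(IReadOnlyDictionary<string, string> submitted)
    {
        return collect(submitted);
    }
}

public static class IdiomFormlet
{
    public static Formlet<string> Input(string field)
    {
        return new Formlet<string>(env => env[field]);
    }

    // Hijacked SelectMany: the first delegate receives the formlet itself.
    public static Formlet<R> SelectMany<T, U, R>(
        this Formlet<T> f,
        Func<Formlet<T>, Formlet<U>> collector,
        Func<T, U, R> selector)
    {
        Formlet<U> next = collector(f);
        return new Formlet<R>(env => selector(f.Run(env), next.Run(env)));
    }
}

public static class TransparentIdentifierDemo
{
    // Three independent clauses, so this compiles.
    public static Formlet<string> ViaQuery()
    {
        return from a in IdiomFormlet.Input("a")
               from b in IdiomFormlet.Input("b")
               from c in IdiomFormlet.Input("c")
               select a + b + c;
    }

    // Roughly what the compiler produces: 'a' and 'b' are bundled into an
    // anonymous type so that later clauses can still reach them.
    public static Formlet<string> ByHand()
    {
        return IdiomFormlet.Input("a")
            .SelectMany(a => IdiomFormlet.Input("b"), (a, b) => new { a, b })
            .SelectMany(t => IdiomFormlet.Input("c"), (t, c) => t.a + t.b + c);
    }

    // If the third clause were 'from c in IdiomFormlet.Input(a)', the second
    // call's first lambda would become 't => IdiomFormlet.Input(t.a)'. Under
    // the hijacked signature 't' is a Formlet of the anonymous pair, which
    // has no member 'a', so the query is rejected at compile time.
}
```

The anonymous pair is what shows up as AnonymousType#1 in the error text: the failure is reported against compiler-generated plumbing rather than against the clause the user actually wrote.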

Still, it's a neat trick that should be in every type wizard's toolbox.

4 comments:

Mauricio Scheffer said...

Nice! Thanks for this, I somehow suspected this was possible but never really checked it. I just applied it in CsFormlets

Another downside is that you can't use it if you already have a proper SelectMany (i.e. a monadic bind), as for example in the case of Either/Validation (see FSharpx)

Sandro Magi said...

Yes, I expect you can't simultaneously implement an idiom and monad interface for the same type. You'd have to implement two types and convert between them. You can do this cheaply enough with a struct wrapper around one, or perhaps by implementing LINQ on an interface and then doing some explicit coercions.

Are you using CsFormlets for anything in particular? I'm developing my own portable UI library, so I'm curious what else is out there.

I'm taking the route of bidirectional data binding instead of the purely functional formlets approach. It seems to fit better with the imperative C# model that you'd find, for instance, with LINQ to SQL. You can see what I've committed publicly under Sasa.Reactive, but I have a lot more that I'm still playing with behind the scenes.

Finding a good balance that works with C#'s limitations is tough though.

Mauricio Scheffer said...

Well, technically, you can implement an applicative and monad, but of course using Join for applicative and SelectMany for the monad. Still, I'm considering using a separate type for validation (different from Either) just to keep things cleaner.

CsFormlets is a web formlets library, so if you were looking to use it in desktop apps, this is not it :) . IIRC the concept of formlets has been applied to console and desktop apps in a generic way but I doubt this is viable in .net, plus formlets seem to be only widely used for web applications (in Haskell, Racket, etc)

Formlets are in a way bidirectional (not sure in the same way you mean it): a single formlet is used to both project data to the UI and then collect input from the UI.
For a more general bidirectional tool, lenses are interesting: composable and purely functional... still trying to see if they would be truly useful in .net though.

Sandro Magi said...

Re: desktop formlets, it's certainly possible, you just have to define an abstract set of portable UI controls, and define a mapping to HTML+CSS. Formlets are generally only used in web apps because the structure of a web program is so much more difficult than the desktop where you don't have to handle partial failure, latency, disconnected operation, backtracking, etc.

Try thinking how you'd design a distributed GUI framework with all the properties we get from HTTP+HTML. You can have stateful designs like X11, or stateless designs like HTTP. If you go stateless, something like formlets is a natural development I think. A stateless GUI framework that can map to HTTP is strictly more general than a purely desktop framework.

Re: bidirectional data binding, you have it exactly right. The UI is a data structure defined in terms of the data model, and the data model itself is defined in terms of the UI. Updates to one should be reflected in the other and vv, data flow in either direction should be a well-defined operation. This well-defined mutually recursive relationship is exactly a lens.

Two possible approaches here again: a functional one, of which formlets is one design; the other is to take the imperative route based on events and reactive expressions, which is what I'm trying: use .NET events and IObservables to keep data model and UI in sync without any manual intervention aside from the initial data binding code.

So yes, lenses would be useful on .NET, and we're both trying to achieve it by the sounds of it. A lot of the convenience you get from .NET, with its easy integration between data grids, SQL sources, etc. could be done much more simply if they had a universal framework for bidirectional data binding, instead of all this custom code for a thousand APIs.