Tuesday, November 1, 2011

Ad-hoc Extensions in .NET

A recent discussion on LtU brought up a common limitation of modern languages and runtimes. Consider some set of abstractions A, B, ..., Z that we wish supported some custom operation Foo(). Languages like C# give only two straightforward possibilities: inheritance and runtime type tests and casts.

Inheritance is straightforward: we simply inherit from each A, B, ..., Z to make A_Foo, B_Foo, ..., Z_Foo, with a custom Foo() method.

Unfortunately, inheritance is sometimes forbidden in C#, and furthermore, it sometimes prevents us from integrating with existing code. For instance, say we want to integrate with callback-style code, where the existing code hands our program an object of type A. We can't call Foo() on this A, because it's not an A_Foo, which is what we really wanted.

This leaves us with the undesirable option of using a large set of runtime type tests. Now type tests and casts are actually pretty fast, faster than virtual dispatch in fact, but the danger is in missing a case and thus triggering a runtime exception, a danger inheritance does not have.

Furthermore, we'd have to repeat the large set of runtime type tests for each operation we want to add to A, B, ... Z, which means every time we want to add an operation we have to do a large copy-paste, and every time we add a new class we have to update every place we test types, and the compiler doesn't warn us when we're missing a case.

A Solution

Fortunately, there's a straightforward way to address at least part of this problem. We can collect all the type tests we have to do in one place, and we can ensure that we have properly handled all cases that we are aware of. To do this, we combine the standard visitor pattern with the CLR's open instance delegates and its unique design for generics.

Here are the types:

// the interface encapsulating the operation to implement, ie. Foo()
public interface IOperation
{
  void A(A obj);
  void B(B obj);
  ...
  void Z(Z obj);
  void Unknown(object obj);
}
// the static dispatcher where runtime tests occur
public static class Dispatcher<T>
{
  static Action<IOperation, T> cached;
  public static Action<IOperation, T> Dispatch
  {
    get { return cached ?? (cached = Load()); }
  }
  static Action<IOperation, T> Load()
  {
    var type = typeof(T);
    var mName = type == typeof(A) ? "A":
                type == typeof(B) ? "B":
                        ...
                                  ? "Unknown";
    var method = typeof(IOperation).GetMethod(method);
    return Delegate.CreateDelegate(typeof(Action<IOperation, T>), null, method)
             as Action<IOperation, T>;
  }
}

If you want to extend A, B, ... Z, with any operation, you simply create a class that implements IOperation and the compiler will ensure you handle all known cases. If you want to add a class to the set of handled cases, you simply extend IOperation and add a case to the runtime type tests (or if you use a naming convention, you can just use the class name as the method name).

The dynamic type test runs once, and then a delegate that directly calls into the IOperation is cached, so the type tests are not run again.

If you try to dispatch on an unknown type, IOperation.Unknown is invoked. I could have made this a generic method, but the CLR currently has a bug creating open instance delegates to generic interface methods, so that would require some code generation to do properly.

There is a caveat though: if any of the cases are subtypes of each other, and you dispatch on the static base type, it will dispatch to the base type and not the super type handler. For instance, if A:B, and you call Dispatcher<B>.Dispatch(new A()), IOperation.B is called, not IOperation.A. This can be handled in various ways, but it's beyond the scope of this article. I may post a follow-up discussing the various options.

Extensions

Type Constraints on T

If all types to handle are subtypes of type X, then it's simple to constrain the type parameter T on Dispatcher<T>:

 public static class Dispatcher<T>
  where T : X
{
  ...
}

Operation Return Type

You could extend IOperation with a return type, but this requires propagating the return type parameter into Dispatcher and the cached delegate type, and thus requires more runtime state for more static fields. This is because the CLR doesn't support GADTs, so following this paper, I prefer to keep the return type encapsulated in IOperation and expose it as a public field for the caller to extract the value:

public sealed class Foo : IOperation
{
  // the return value
  public Bar ReturnValue { get; set; }
  public void A(A obj) { ... }
  ...
}

This requires fewer type parameters, and keeps the implementation simple.

Per-Class Overrides

Suppose some of the classes you are handling are your own, or are already aware of your code such that they have already implemented your custom operations. In that case, you can specify a companion interface to indicate this, and modify the dispatch code to invoke the interface instead:

public interface ICompanion<T>
{
  // your custom operations that are implemented via subclassing
  void Foo(IOperation operation, T self);
  ...
}
The dispatch code is then modified like so:
static Action<IOperation, T> Load()
{
  var type = typeof(T);
  // if companion interface is implemented, then dispatch directly into it
  if (typeof(ICompanion<T>).IsAssignableFrom(type))
  {
    var method = type.GetMethod("Foo");
    return Delegate.CreateDelegate(typeof(Action<IOperation, T>), null, method)
             as Action<IOperation, T>;
  }
  // the usual dispatch code
  ...

Useful Applications

This pattern simulates the usefulness of type classes in Haskell, because these are essentially ad-hoc extensions added after the fact. I use this exact pattern in Sasa.Dynamics to implement safe type case and reflection patterns over the CLR primitive types, ie. ints, strings, delegates, etc., with the per-class override extension described above via the IReflected interface.

6 comments:

Jules said...

That is very useful :) And your blog is a goldmine of information in general.

Sandro Magi said...

Thanks! Hopefully that clarifies what I tried to clumsily convey on LtU. I had known that the pattern I used for safe reflection was fairly general, but your question really made me think about how to convey it.

You can even eliminate the untyped reflection using LINQ expression trees to obtain the MethodInfo, if you want ultimate safety.

If only MS would fix that delegate bug, the solution would be elegant for all cases...

Qwertie said...

Okay, my brain's a bit hazy and I don't quite understand the abstract description here. I would be like to see a real-world example of a problem that this solves, that is not easily solved any other way. Can it help solve the Expression Problem?

Sandro Magi said...

@Qwertie, this solves a part of the expression problem because it does allow you to extend existing classes with new operations. I don't consider it a "real" solution because it uses dynamic type tests.

Consider a class library Foo, which provides classes B and C that inherit from A, and there's an event raised of type Action<A>, ie. A.SomeEvent += new Action<A>(x => ...).

In your event handler, you want to perform operations on 'x', but you need to know whether x is a B or a C, and A doesn't implement the visitor pattern. You would normally do this with type tests, but now you have to repeat all the type tests every single time you need to discriminate on the type.

Then, if library Foo adds a new type, you don't get any warnings or errors that you're no longer handling all of the cases.

The solution presented in this post is the best answer I've found to handle this scenario. You specify all the cases to handle in IOperation, and specify the list of dynamic type tests once, and the compiler ensures you handle all the cases everywhere else. It's far less error prone.

Sandro Magi said...

@Qwertie, I forgot to mention, that repeating the dynamic type tests everywhere is also problematic if there are many classes you want to discriminate against, for instance like LINQ expressions.

Sandro Magi said...

@Jules, you might be interested in the fully reusable dispatcher I just posted about.

It doesn't really use any unsafe constructs like reflection, as used in this post, and it matches based on the dynamic type, not the static type, so it's a little more flexible.