Saturday, November 12, 2011

Type Unification Forbidden - More C#/CLR Irritations

I've written about quite a few irritations of C#/CLR, including asymmetries in CIL, oddly unverifiable CIL instructions, certain type constraints are forbidden for no reason, delegate creation bugs, the lack of higher-kinded types, equality asymmetries between events and IObservable, generics/type parameter problems, lack of usable control over object layouts, and just overall limitations of the CLR VM.

I've just smack into yet another annoying problem: type parameter unification is forbidden. The Microsoft Connect bug filed in 2004 was closed as By Design.

Here's a simple code fragment demonstrating the problem:

class ObserveTwo<T0, T1> : IObservable<T0>, IObservable<T1>
{
}
This will fail with the error:
'ObserveTwo<T0,T1>' cannot implement both 'System.IObservable<T0>' and 'System.IObservable<T1>' because they may unify for some type parameter substitutions
This is frankly nonsense. What's the problem if T0 and T1 unify? If T0=T1, ObserveTwo just implements IObservable<T0>, and the methods are all unified as well [2]. If T0!=T1, then the usual semantics apply.

The MS Connect bug implies that a type safety issue, but there's no problem that I can see [1].

This is incredibly frustrating, because interfaces are the only way to design certain extension methods (mixin-style). For instance, suppose we have a set of simple interfaces:

// marks a class as containing a parseable parameter of type T
public interface IParseable<T>
{
   HttpRequest Request { get; }
}
// continuation parameters are parseable
public interface IContinuation<T> : IParseable<T> { }
public interface IContinuation<T0, T1> : IParseable<T0>, IParseable<T1> { }
...
This is a perfectly sensible set of type definitions, and there is no ambiguity or type safety problem here, even if a class were to implement IContinuation<int, int>.

If anyone has any suggestions for workarounds, I'm all ears. Options that don't work:

  1. Can't turn IParseable into a struct/class with an implicit conversion, because implicit conversions don't work on interfaces.
  2. Can't define MxN overloads of the extension methods defined on IParseable, where N=number of IContinuation type definitions, and M=number of type parameters (although this actually has a shot of working, the code duplication is just ludicrous).

A Partial Solution

The best solution I have involves N overloads of the extension methods by implementing interfaces that designate type parameter positions:

// type param at index 0
public interface IParam0<T> { }
// type param at index 1
public interface IParam1<T> { }
// type param at index 2
public interface IParam2<T> { }
...
// continuations  have indexed type parameters
public interface IContinuation<T> : IParam0<T> { }
public interface IContinuation<T0, T1> : IParam0<T0>, IParam1<T1> { }
...
Now instead of defining extension methods on IParseable, I define an overload per IParamX interface (N overloads, 1 per type parameter).

This necessarily causes an ambiguity when two type parameters unify, since the compiler won't know whether you wanted to call the extension on IParam0<int> or IParam2<int>, but this ambiguity happens at the call site/at type instantiation, instead of type definition. This is a much better default [1], because you can actually do something about it by disambiguating manually. I've eliminated ambiguity entirely in my library by appending the type parameter index to the extension method name.

Of course, none of this boilerplate would have been necessary if type unification were supported.

[1] If the CLR supported type inequalities, instead of just type equalities, then we could just forbid this at the point ObserveTwo is created.

[2] EDIT 2011-11-13T11:45 AM: there's been some confusion about this statement. Doug McClean rightly points out that there's no guarantee that the two implementations are confluent, so unifying is not an "automatic" thing, so please don't take this to mean that I'm saying the compiler should be able to automatically "merge" methods somehow, or magically know which implementation to dispatch to based on context. This is undecidable in general.

Also, even if the interfaces require no implementation, the error still occurs, despite reduction being Church-Rosser. Finally, plenty of nice properties are violated on the CLR, so I'm not sure why also sacrificing confluence is such a big deal here. A sensible dynamic semantics, like always dispatching to the implementation for T0 in case of type parameter unification, restores most of the desirable properties without sacrificing expressiveness, and without going whole-hog and ensuring confluence via type inequalities [1].

9 comments:

Doug McClean said...

The problem is that your definitions for IObservable T0 and IObservable T1 might not be confluent, and if T0 = T1 the compiler has no basis upon which to choose between them.

Sandro Magi said...

Consider if IParseable were simply a marker interface containing no implementation. Reduction is Church-Rosser, but this case is still forbidden.

Confluence and expressiveness would be ideal I agree, but this requires type inequalities, which are a pipe dream for the CLR.

I would rather have a type safety + predictable dynamic semantics, even if non-confluent, than not be able to express certain programs which I can audit for confluence, or for which confluence does not matter.

The CLR already has plenty of other features that defeat parametricity and such, so why pick on this one?

Sandro Magi said...

I meant to say, by "predictable dynamic semantics", it could just dispatch to the first type parameter implementation upon unification.

Andrey said...

I really don't see any real need for this feature. First you don't need marker interfaces in C#, because you have attributes. Second, it feels slightly indetermenistic when you have two different methods for same call and CLR have to choose. If you have real need for it somewhere I guess it can be refactored to more elegant solution.

Sandro Magi said...

A dynamic type test is 1000x faster than an attribute lookup.

The "indeterministic" property you want is called "confluence". Doug McClean already raised this point, so check the comments for how this is addressed.

Qwertie said...

It does seem a little silly not to allow type unification for interfaces. For classes that implement interfaces, the restriction only makes sense when unification is actually a problem--so the error should occur when you try to define two methods that may unify, not merely when you declare two interfaces that may unify. I'm guessing this one is a CLR limitation rather than a C# limitation?

Sandro Magi said...

@Qwertie, good question. I expect it's a CLR restriction, but I haven't actually tested it.

Ottokrat said...

To some extent, this can be done! I use a differentiating method.

It does not unify, in fact it might be better that way because you can tease the separate interfaces apart.

See my post here:
http://stackoverflow.com/a/12361409/471129

Sandro Magi said...

Doug, you may be interested to know that the CLR does actually permit implementing the same interface twice for two different type parameters, and allows them to be unified, as long as the interfaces are implemented at two different levels in the inheritance hierarchy.

Confluence is retained because the most recent implementation is always invoked. See the article I just posted on the topic if you're interested.