Skip to main content

Degenerate Implicits in C#

Scala and Haskell both have very useful type-directed concepts for resolving implicit function parameters, called "implicits" and "type classes" respectively. To illustrate the impact of this, consider a generic sorting function like Enumerable.OrderBy. It must define two overloads, one that takes a custom comparer, and one that does not and so should use a default comparison for type TKey:

public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer
)
public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector
)

While it's simple to define the second overload as calling the first with Comparer<TKey>.Default for the custom comparer, in general, for N overridable parameters, you would need to define something like factorial(N) overloads for all possible permutations of defaults and overridable parameters. This can quickly become unwieldy. Ideally, we could define only a single function with optional parameters and sane default values.

C# 4.0 introduced a much more limited form of implicits called "default values" to address part of this problem. These implicit values are limited to only primitive types like Int32 and String, so they aren't useful for general resolution of implicit values, but they were a good start.

Fortunately, C# has a few other features that, when combined with default values, allow us to add something like the more flexible implicit values as found in Scala: reified generics and value types. Basically, we define the implicit parameter to be some sort of value type, whose default value is always defined, and this value type resolves to either the default instance, or to a custom instance if one is provided. For instance, we can define a single OrderBy overload like so:

public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    CompareImplicit<IComparer<TKey>> comparer =
      default(CompareImplicit<IComparer<TKey>>)
)

Where CompareImplicit can be defined something like this:

public struct CompareImplicit<T>
{
    IComparer<T> cmp;
    static IComparer<T> def = Comparer<T>.Default;
    public CompareImplicit(IComparer<T> cmp) { this.cmp = cmp; }
    public IComparer<T> Resolve()
    {
        return cmp ?? def;
    }
    public int Compare(T left, T right)
    {
        return Resolve().Compare(left, right);
    }
}

You can then write any sort of instance resolution for the IComparer<T> value you like, and still allow callers to specify optional overrides. For instance, here's an example of using the built-in comparison and overriding it with a "reverse" comparer:

public static class Example
{
    sealed class Reverse : IComparer<int>
    {
        public int Compare(int left, int right)
        {
            return right.CompareTo(left);
        }
    }
    public static IEnumerable<T> OrderBy<T>(
      IEnumerable<T> source,
      CompareImplicit<T> cmp = default(CompareImplicit<T>))
    {
        return source.OrderBy(x => x, cmp.Resolve());
    }
    public static void Main(string[] args)
    {
        var orig = new[] { 1, 66, 4, 32 };
        var inorder = OrderBy(orig);
        var revorder = OrderBy(orig, new CompareImplicit<int>(new Reverse()));
        Console.WriteLine("in order: {0}", inorder.Aggregate("", (seed, x) => seed + x + ", "));
        Console.WriteLine("rev order: {0}", revorder.Aggregate("", (seed, x) => seed + x + ", "));
        Console.ReadLine();
        // prints:
        // in order: 1, 4, 32, 66,
        // rev order: 66, 32, 4, 1
    }
}

I reused the Enumerable.OrderBy overload for clarity, but in principle, the OrderBy function would normally just call Resolve() to access the IComparer instance to use locally. We can define similar implicit arguments to use for resolving instances of IEqualityComparer, and so on, but there's perhaps a more uniform, reusable approach using only a single struct for all such implicits:

public struct Implicit<T>
{
    public readonly T Instance;
    public Implicit(T instance)
    {
        this.Instance = instance;
    }
    public T Resolve(T defaultInstance)
    {
        return Instance != null ? Instance : defaultInstance;
    }
    public static implicit operator Implicit<T>(T instance)
    {
        return new Implicit<T>(instance);
    }
}

With a little more work for the library writer, this allows us to use a single struct for all implicit parameters, and the receiving function will simply need to provide the default instance if no override is specified. For instance, here's OrderBy rewritten using this type:

public static class Example
{
    public static IEnumerable<T> OrderBy<T>(
      IEnumerable<T> source,
      Implicit<IComparer<T>> cmp = default(Implicit<IComparer<T>>))
    {
        return source.OrderBy(x => x, cmp.Instance ?? Comparer<T>.Default);
    }
}

In order to avoid hard-coding the source of the default implicit in every function, it's better to use dependency injection. This will will decouple defining the default value for implicits from the uses, thus allowing more flexibility:

public static class Example
{
    public static IEnumerable<T> OrderBy<T>(
      IEnumerable<T> source,
      Implicit<IComparer<T>> cmp = default(Implicit<IComparer<T>>))
    {
        return source.OrderBy(x => x, cmp.Instance ?? IoC.Resolve<IComparer<T>>());
    }
}

Technically, you don't even need the Implicit type in parameter list, as all reference types can be null be default. However, the presence of Implicit is good documentation implying that some sort of implicit instance resolution is taking place, and that this process can be overridden.

Finally, there are still some unfortunate C# warts in this approach. For instance, C# does not allow implicit conversions to or from interface types, so you will always have to explicitly construct the CompareImplicit or Implicit wrapper around a custom instance, even though Implicit defines an implicit conversion. This is annoying, but hardly the only time C# has restricted expressiveness for flimsy aesthetic reasons. I would also define an extension method to simplify lifting a custom value into an Implicit wrapper:

public static class Implicits
{
    public static Implicit<T> AsImplicit<T>(this T instance)
    {
        return instance;
    }
}

The previous code example will now look like this:

public static class Example
{
    ...

    public static void Main(string[] args)
    {
        var orig = new[] { 1, 66, 4, 32 };
        var inorder = OrderBy(orig);
        var revorder = OrderBy(orig, new Reverse().AsImplicit());
        Console.WriteLine("in order: {0}", inorder.Aggregate("", (seed, x) => seed + x + ", "));
        Console.WriteLine("rev order: {0}", revorder.Aggregate("", (seed, x) => seed + x + ", "));
        Console.ReadLine();
        // prints:
        // in order: 1, 4, 32, 66,
        // rev order: 66, 32, 4, 1
    }
}

In my opinion, not too bad given C#'s inherent limitations.

Comments

svick said…
This looks like too much work just to avoid using null. The following code will work just as well, and doesn't need any custom structs:

public static IEnumerable<T> OrderBy<T>(
IEnumerable<T> source,
IComparer<T> cmp = null)
{
return source.OrderBy(x => x, cmp ?? Comparer<T>.Default);
}
Sandro Magi said…
I noted that in my post, and I addressed why the extra notation is useful: because it's good documentation to show that a parameter is implicitly resolved to some default that isn't visible. If I just provided you with the following function:

public void DoSomething(Foo x = null);

What is the intent and meaning of x = null above?

Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre...

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD) , and decided to take a crack at distilling its essence in C#. Note that automatic differentiation (AD) is different than numerical differentiation . Math.NET already provides excellent support for numerical differentiation . C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries . Neither of these are suitable for learning how AD works. So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique. What is Automatic Differentiation? Simply put, automatic differentiation is a technique for calcu...

Building a Query DSL in C#

I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would filter out the results where either Bar is less than or equal to 99, or Foo is not "some string". This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a limited filtering API : It does not support disjunctions, ie. "OR" clauses. It does not support filtering using inequalities on more than one property. It does not support a not-equal operation. So in this post, I will describe the design which achieves the following goals: A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR"). Implemen...