Skip to main content

First-Class Slots for .NET

Yesterday, I posted about the new extension of IRef<T> to arbitrary reference semantics for the CLR, including referencing inner fields, properties, and array slots. First-class references make it simple to operate on specific mutable data without caring about the underlying type of that data.

I just pushed another abstraction that handles a related, but different case: first-class slots.

ISlot<T>

An object "slot" is a value that designates a mutable location of a specific class of values, not a mutable location of a specific instance like first-class references. Where first-class references hide the underlying object type, slots expose the object type and allow you to mutate the slots of multiple objects at once, as long as they are subtypes of the slot's object type. Here's the declaration of ISlot<T>:

public interface ISlot<TObj, T>
{
    T Get(TObj obj);
    T Set(TObj obj, T value);
}

As you can see, the object we're manipulating is passed in while operating on it, so an ISlot is actually a direct descriptor that can access members of any instance of a subtype of TObj. If you're wondering when you would ever need this, rest assured that there are applications that need to write algorithms of this type. For instance, consider an object-relational mapper (ORM), which uses reflection to extract the members that need to be get/set on object flush/load from a database. Essentially, the ORM is reflecting over all slots of an object being flushed or loaded from a database, but it does so in a manner that isn't very reusable, and the object hydration code is coupled tightly to the database access code as a result.

Reifying slots as a distinct, first-class abstraction makes them independently testable, and the reflection code and database access code is now very loosely coupled. An ORM is but one example of an application that makes use of generic object slots.

Similar to first-class references, the Slot class exposes some static constructors for creating slots:

public abstract class Slot
{
  public static Member<TObj, T> Create<TObj, T>(Func<TObj, T> get, Action<TObj, T> set);
  public static Array<T> Create<T>(int index);
  public static Member<TObj, T> Create<TObj, T>(MemberInfo member)
  public static Member<TObj, T> Create<TObj, T>(Expression<Func<TObj, T>> member)

  public sealed class Array<T> : Slot, ISlot<T[], T> { ... }
  public sealed class Member<TObj, T> : Slot, ISlot<TObj, T> { ... }
}

These operations are exactly what you saw in the last article, where you can create slots from object members and array indices.

It might seem at first that slots are strict generalizations of first-class references, but this is deceptive. It's true that any algorithm you'd could write using references could be rewritten to use slots, but the number of type parameters and value parameters could increase non-linearly to the point where it's unwieldy and could easily obfuscate the underlying algorithm.

Limitations

There is one limitation at the moment. Specifically, value types need to be passed by reference during update in order for assignment to be visible to callers of ISlot.Set, but this isn't currently possible given the interface. As a result, there's currently a type constraint on TObj to restrict it to reference types.

A simple solution would simply be to return TObj from ISlot.Set, so the calling context can simply overwrite its own local value with the one modified by the slot. Another possibility is to make the TObj parameter to ISlot.Set a by-ref parameter. I'm considering these and a few other options, and Sasa's v0.12.0 release will probably contain the final solution.

Comments

I've used a similar abstraction which I called "mutable lenses". Initially I tried to use real immutable lenses, but the boilerplate required to define lenses in C# is astounding. So I went for a mutable variant, but that of course requires a mutable object, and it lacks the composability of immutable lenses. Still a useful abstraction.
I also experimented briefly with an ORM prototype based on this, but I quickly ran into the problem of having to pack all the lenses for a class together (e.g. in a list), something that's seemingly impossible to do in C# as it would require an heterogeneous list. You could pack it as existentials, but then you'd lose type information. You could pack it in an untyped list, but that's of course not type-safe. How did you solve this?
Sandro Magi said…
"So I went for a mutable variant, but that of course requires a mutable object, and it lacks the composability of immutable lenses. Still a useful abstraction."

I think it depends on how you exploit mutability. If you haven't already, read up on immediate mode UIs. They build a UI representation from a model in a continuous, imperative loop.

You could adapt this to the lens concept by simply constructing an immediate mode counterpart of your lens algorithm. Lenses are algebraic representations of updateable views, and immediate mode is a coalgebraic representation.

Coalgebraic representations are much more natural in OO languages like C#, and thus they compose well, but it takes awhile to start thinking coalgebraically because algebras are so ingrained.

"You could pack it as existentials, but then you'd lose type information."

I think any solution for .NET will have to hide some type information in some way, simply because C# types aren't expressive enough. As to how to do this, you can hide the types behind a well defined interface. ORM hydrate is essentially:

Func<Source, T>

ORM flush is essentially:

Action<T, Sink>

Then a Foo:

class Foo
{
public int Bar{get; set;}
public string Something;
}

has hydrate/flush ops:

static Func<Source, Foo> Hydrator()
{
return source =>
{
var x = new Foo();
x.Bar = source.Int32();
x.Something = source.String();
return x;
};
}

static Action<Foo, Sink> Flusher()
{
return (x, sink) =>
{
sink.Int32(x.Bar);
sink.String(x.Something);
};
}

You can make a combinator library to build up the hydrate/flush functions incrementally too (there are functional pearls covering this), because delegates can be combined on .NET:

static Action<T, Sink> FlushInt32<T>(Func<T, int> getter)
{
return (x, sink) => sink.Int32(getter(x));
}

static Action<T, Sink> FlushString<T>(Func<T, string> getter)
{
return (x, sink) => sink.String(getter(x));
}

static Action<Foo, Source> Flusher()
{
return Func.Combine(FlushIn32(x => x.Bar), FlushString(x => x.Something));
}

But really all the above is doing is performing a coalgebraic transform on an object graph into some serialized form. We can make this a fully generic transform, and in fact, I already did which is my type-safe generic reflection abstractions under Sasa.Dynamics.

Start with a low-level reflection interfaces, one for folding over an object:

interface IFold
{
void Field<T>(T value, FieldInfo info);
void Property<T>(T value, PropertyInfo info);
}

And one for unfolding:

interface IUnfold
{
T Field<T>(FieldInfo info);
T Property<T>(PropertyInfo info);
}

Then you need to perform a type case on the basic primitive types:

interface IReduce
{
void Bool(bool x);
void Int32(int x);
void Type(Type x);
...
Object<>(T x);
}
interface IBuild
{
bool Bool();
int Int32();
Type Type();
...
T Object<T>();
}

Then you can write a single piece of code that can fold any object graph into a serialized format, and perform the corresponding unfold from the serialized format back into an object graph.

Foo, from above, would serialize like so:

typeof(Foo)
+-Property(Foo.Bar)
+--typeof(Int32);
+--Int32(Foo.Bar);
+-Field(Foo.Something);
+--typeof(String);
+--String(Foo.Something);

(+- is stack depth)

You can see that all the relevant type information is present, so you can unfold the structure safely as well. Obviously you also need to handle recursive structures and other special cases like arrays and Nullable structs, but it's totally doable.

I recently realized that Sasa.Dynamics is essentially scrap-your-boilerplate generics as found in Haskell. They take the same approach using polytypic folds/unfolds as the foundation of reflection.
Thanks for the detailed answer! You just gave me a lot to think about here :)

Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre...

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD) , and decided to take a crack at distilling its essence in C#. Note that automatic differentiation (AD) is different than numerical differentiation . Math.NET already provides excellent support for numerical differentiation . C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries . Neither of these are suitable for learning how AD works. So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique. What is Automatic Differentiation? Simply put, automatic differentiation is a technique for calcu...

Building a Query DSL in C#

I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would filter out the results where either Bar is less than or equal to 99, or Foo is not "some string". This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a limited filtering API : It does not support disjunctions, ie. "OR" clauses. It does not support filtering using inequalities on more than one property. It does not support a not-equal operation. So in this post, I will describe the design which achieves the following goals: A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR"). Implemen...