Thursday, March 14, 2013

Sasa-v0.9.4-RC2 Released - A Sasa.Dynamics Overview

I've just uploaded the next release candidate for Sasa-v0.9.4. This features a few changes due to problems I ran into while porting old code to the new codebase. Not bugs per se, but unfortunate limitations of certain Microsoft libraries. There are also a few new extensions written to showcase Sasa.Dynamics, which are detailed below.

Sasa.Dynamics

The experimental type safe reflection facilities of Sasa v0.9.3 were completely revamped since they weren't general enough to handle all of the CLR's and C#'s constructs. The main problem was that the ref-based callback interface wouldn't integrate well with readonly fields. The only time you can create a reference to a readonly field is in the constructor, so clients couldn't implement the old IReflected interface on any immutable objects.

The new approach in Sasa.Dynamics separates two orthogonal concepts: 1. typecase, and 2. folding over an object's fields.

Typecase is an operation from the research literature which takes an abstract type parameter T, and dispatches to the concrete type handler to which it corresponds. You can see this plainly in the IReduce interface under Sasa.Dynamics:

public interface IReduce
{
    void Boolean(bool value);
    void Int16(short value);
    void UInt16(ushort value);
    ...
    void Array<TArray, TElement>(TArray value)
        where TArray : TypeConstraint<Array>;
    void Object<T>(T value);
}

This interface defines a callback interface that the static class Type<T> uses to process values of the type parameter T:

public static class Type<T>
{
    public static Action<IReduce, T> Reduce { get; }
}

If T is a sealed type, the delegate obtained from Type<T>.Reduce is an open instance delegate, effectively a function pointer that dispatches directly into IReduce, which is as efficient as you can get on the CLR. If T is not a sealed type, then GetType is invoked on the value and a short sequence of efficient, dynamic type tests is performed in order to dispatch to the correct handler. This is done via a small polymorphic inline cache, similar to what is done in the DLR and with C#'s new "dynamic" type.
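
To illustrate the general idea of a polymorphic inline cache, here is a minimal sketch. It is not Sasa's implementation: the names TinyInlineCache, resolve and Misses are hypothetical, the cache is unbounded and not thread safe, and it dispatches on boxed objects rather than on a type parameter. It only shows the shape of the technique: a short run of exact type tests on the fast path, with a slow lookup that runs once per runtime type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a polymorphic inline cache; not Sasa's actual code.
public sealed class TinyInlineCache
{
    // Slow path: finds a handler for a runtime type.
    private readonly Func<Type, Action<object>> resolve;

    // Cached (type, handler) pairs, scanned in order.
    private readonly List<KeyValuePair<Type, Action<object>>> entries =
        new List<KeyValuePair<Type, Action<object>>>();

    public TinyInlineCache(Func<Type, Action<object>> resolve)
    {
        this.resolve = resolve;
    }

    // Number of times the slow path ran.
    public int Misses { get; private set; }

    public void Dispatch(object value)
    {
        Type type = value.GetType();

        // Fast path: a short linear run of exact type tests.
        foreach (KeyValuePair<Type, Action<object>> entry in entries)
        {
            if (entry.Key == type)
            {
                entry.Value(value);
                return;
            }
        }

        // Slow path: resolve once, remember the handler, then call it.
        Misses++;
        Action<object> handler = resolve(type);
        entries.Add(new KeyValuePair<Type, Action<object>>(type, handler));
        handler(value);
    }
}

public static class TinyInlineCacheExample
{
    // Dispatches four values of two runtime types, so only two lookups miss.
    public static int Run()
    {
        var seen = new List<string>();
        var cache = new TinyInlineCache(t => v => seen.Add(t.Name + ":" + v));
        cache.Dispatch(1);
        cache.Dispatch(2);
        cache.Dispatch("a");
        cache.Dispatch(3);
        return cache.Misses;
    }
}
```

A production cache would normally cap the number of entries and fall back to a dictionary or a virtual call once a call site sees too many types.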

Note that Reduce accepts a value and calls back into an interface to process it. The handlers return nothing, so the results of the computation stay internal to the IReduce implementation and can be extracted afterwards. This is required due to certain CLR limitations, like the lack of first-class polymorphism on delegates and higher-rank types.
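
Here's a minimal sketch of what "internal to that operation" means in practice. Since the real IReduce has many more cases than I've listed, this uses a hypothetical cut-down interface, IMiniReduce, and a naive MiniDispatch helper that stands in for Type<T>.Reduce using plain type tests. Neither name exists in Sasa.

```csharp
using System;
using System.Text;

// Hypothetical, cut-down stand-in for IReduce with only three cases.
public interface IMiniReduce
{
    void Boolean(bool value);
    void Int32(int value);
    void Object<T>(T value);
}

// The handlers return void, so the reducer keeps its result in a field
// and exposes it once the traversal is finished.
public sealed class DescribeReducer : IMiniReduce
{
    private readonly StringBuilder output = new StringBuilder();

    public void Boolean(bool value)
    {
        output.Append("bool:").Append(value ? "true" : "false").Append(';');
    }

    public void Int32(int value)
    {
        output.Append("int:").Append(value).Append(';');
    }

    public void Object<T>(T value)
    {
        output.Append("object:").Append(typeof(T).Name).Append(';');
    }

    // Extract the accumulated result after reducing.
    public string Result
    {
        get { return output.ToString(); }
    }
}

// Naive stand-in for Type<T>.Reduce: boxes the value and tests its type.
public static class MiniDispatch
{
    public static void Reduce<T>(IMiniReduce reducer, T value)
    {
        object boxed = value;
        if (boxed is bool) reducer.Boolean((bool)boxed);
        else if (boxed is int) reducer.Int32((int)boxed);
        else reducer.Object(value);
    }
}

public static class DescribeReducerExample
{
    // Reduces three values, then reads the result out of the reducer.
    public static string Run()
    {
        var reducer = new DescribeReducer();
        MiniDispatch.Reduce(reducer, true);
        MiniDispatch.Reduce(reducer, 42);
        MiniDispatch.Reduce(reducer, "hi");
        return reducer.Result; // "bool:true;int:42;object:String;"
    }
}
```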

As I've argued previously, any abstraction you add should be accompanied by its dual. In the typecase pattern, I've called this Build, whose operations are implemented via the IBuild interface. It's identical to the IReduce interface with an important difference: instead of accepting a value as a parameter, it returns that value:

public interface IBuild
{
    bool Boolean();
    short Int16();
    ushort UInt16();
    ...
    TArray Array<TArray, TElement>()
        where TArray : TypeConstraint<Array>;
    T Object<T>();
}

The Type<T> interface for this dispatch is similarly inverted:

public static class Type<T>
{
    public static Func<IBuild, Type, T> Build { get; }
}

Similar to the case with non-sealed reduce, this operation accepts a dynamic type parameter and performs a short series of dynamic tests via an inline cache to dispatch into IBuild. In effect, the build/reduce pattern allows you to write algorithms to transform between type variables and runtime type values. IBuild and IReduce are defined with a case for each primitive type on the CLR, like numbers, strings, delegates, arrays, nullable structs, and encapsulated objects.
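
The following sketch shows the build direction under the same caveats: IMiniBuild, QueueBuilder and MiniBuildDispatch are hypothetical names, the interface is cut down to three cases, and the dispatch is a plain chain of type comparisons rather than an inline cache. The point is only that a runtime Type value selects which typed case of the builder gets called.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical, cut-down stand-in for IBuild: each case returns a value.
public interface IMiniBuild
{
    bool Boolean();
    int Int32();
    T Object<T>();
}

// A builder that produces values by parsing tokens from a queue.
public sealed class QueueBuilder : IMiniBuild
{
    private readonly Queue<string> tokens;

    public QueueBuilder(IEnumerable<string> tokens)
    {
        this.tokens = new Queue<string>(tokens);
    }

    public bool Boolean()
    {
        return bool.Parse(tokens.Dequeue());
    }

    public int Int32()
    {
        return int.Parse(tokens.Dequeue(), CultureInfo.InvariantCulture);
    }

    public T Object<T>()
    {
        // Building whole objects needs an unfold step, which this sketch omits.
        throw new NotSupportedException("Object<T> is not part of this sketch.");
    }
}

// Naive stand-in for Type<T>.Build: picks a case from the runtime type.
public static class MiniBuildDispatch
{
    public static T Build<T>(IMiniBuild builder, Type runtimeType)
    {
        if (runtimeType == typeof(bool)) return (T)(object)builder.Boolean();
        if (runtimeType == typeof(int)) return (T)(object)builder.Int32();
        return builder.Object<T>();
    }
}

public static class QueueBuilderExample
{
    // The static type is object, but the runtime type selects the Int32 case.
    public static object Run()
    {
        var builder = new QueueBuilder(new[] { "true", "42" });
        bool flag = MiniBuildDispatch.Build<bool>(builder, typeof(bool));
        object number = MiniBuildDispatch.Build<object>(builder, typeof(int));
        return flag ? number : null;
    }
}
```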

Clearly the last case is important, since we need some way to unpack an encapsulated object into its constituent fields in order to process a whole object graph. We achieve this via fold/unfold, which has the same callback structure as build/reduce, but is only a few small operations. Here's fold:

public interface IFold
{
    void Field<T>(T value, FieldInfo info);
}
public static class Reflect<T>
{
    public static Action<T, IFold> Fold { get; }
}

For each field in an object of type T, Reflect<T>.Fold calls into IFold.Field<TField> to process the field values. TField may not be a sealed type, but at this point you can just call Type<TField>.Reduce to jump to the appropriate type.
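
As a rough sketch of what a fold does, the code below walks an object's instance fields with plain System.Reflection and calls Field<TField> for each one. The IFold interface is repeated from above so the example is self-contained. NaiveReflect, FieldLister and Point are hypothetical names, and this is far slower than a precompiled delegate like Reflect<T>.Fold. It also only sees fields declared on T itself plus inherited non-private ones.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Same shape as the IFold interface shown in the post.
public interface IFold
{
    void Field<T>(T value, FieldInfo info);
}

// Records one line per field; the result lives inside the folder.
public sealed class FieldLister : IFold
{
    public readonly List<string> Lines = new List<string>();

    public void Field<T>(T value, FieldInfo info)
    {
        Lines.Add(info.Name + ":" + typeof(T).Name + "=" + value);
    }
}

// Naive, reflection-only stand-in for Reflect<T>.Fold.
public static class NaiveReflect
{
    public static void Fold<T>(T value, IFold fold)
    {
        MethodInfo open = typeof(IFold).GetMethod("Field");
        FieldInfo[] fields = typeof(T).GetFields(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

        foreach (FieldInfo field in fields)
        {
            // Close Field<T> over the declared field type, then invoke it.
            MethodInfo closed = open.MakeGenericMethod(field.FieldType);
            closed.Invoke(fold, new object[] { field.GetValue(value), field });
        }
    }
}

// An immutable type: readonly fields are no obstacle to folding.
public sealed class Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class FieldListerExample
{
    public static List<string> Run()
    {
        var lister = new FieldLister();
        NaiveReflect.Fold(new Point(3, 4), lister);
        return lister.Lines;
    }
}
```

Because the folder only receives values, it never needs a reference to a readonly field, which is exactly the limitation the old ref-based interface ran into.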

Just as build/reduce are duals, so are fold/unfold. IUnfold doesn't accept a parameter of type TField; it returns a value of type TField. Therefore the majority of operations you'd use for reflection can be implemented as some interleaving of reduce/fold for reducing an object graph to a value (like serialization), or build/unfold for unpacking a value into an object graph (like deserialization).

Sasa.Serialization is a proof-of-concept binary serialization assembly built on top of Sasa.Dynamics. I haven't benchmarked this latest iteration, but previous iterations have been dramatically faster, and produced significantly more compact binaries, than the framework's binary serialization.

Additionally, there are a few example operations built into Sasa.Dynamics that showcase the power of this approach. Type<T> exposes a few additional operations, all of which are implemented in terms of build/reduce and fold/unfold:

public static class Type<T>
{
    /// <summary>
    /// A function that can efficiently dispatch to a type-specific
    /// handler based on type <typeparamref name="T"/>.
    /// </summary>
    /// <returns>
    /// A function that can efficiently dispatch to a type-specific
    /// handler based on type <typeparamref name="T"/>.
    /// </returns>
    public static Func<IBuild, Type, T> Build { get; }

    /// <summary>
    /// Return a function that can efficiently inspect the internals
    /// of any object of type <typeparamref name="T"/>.
    /// </summary>
    /// <returns>A function that can inspect a
    /// <typeparamref name="T"/>'s internals.</returns>
    public static Action<IReduce, T> Reduce { get; }

    /// <summary>
    /// A delegate that checks whether a given instance is mutable.
    /// </summary>
    /// <remarks>
    /// Unlike Type<<typeparamref name="T"/>>.MaybeMutable,
    /// this checks whether the specific instance
    /// given is mutable, rather than just whether the type allows mutable
    /// extensions.
    /// </remarks>
    public static bool IsMutable(T value);

    /// <summary>
    /// Performs a deep copy of an object.
    /// </summary>
    /// <param name="value">The value to copy.</param>
    /// <returns>A new instance whose whole object graph has
    /// been replicated.</returns>
    public static T Copy(T value);

    /// <summary>
    /// True if any instances of a <typeparamref name="T"/>
    /// may be mutable.
    /// </summary>
    /// <remarks>
    /// This will return true for any non-sealed types, since a subtype
    /// of <typeparamref name="T"/> may add a mutable
    /// field at any time. Sealed types with even one
    /// non-readonly field are also considered mutable.
    /// </remarks>
    public static bool MaybeMutable { get; }
}

IsMutable is an operation that checks whether an instance of type T is transitively mutable via any of its instance fields. MaybeMutable is a more efficient but conservative operation: it returns true even if T appears immutable but is non-sealed, since a subtype of T could easily add a mutable field at some point.
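
To make the conservative check concrete, here is a shallow sketch written against plain System.Reflection rather than build/reduce and fold/unfold. MutabilitySketch and the three sample types are hypothetical. Treating primitives, enums and strings as immutable is my assumption, and this version only looks at a type's own field declarations, not at the types those fields refer to.

```csharp
using System;
using System.Reflection;

public static class MutabilitySketch
{
    // Conservative: returns true whenever an instance might be mutable.
    public static bool MaybeMutable(Type type)
    {
        // Assumption: primitives, enums and strings count as immutable values.
        if (type.IsPrimitive || type.IsEnum || type == typeof(string)) return false;

        // Array elements can always be overwritten.
        if (type.IsArray) return true;

        // A subtype could add a mutable field, so answer conservatively.
        if (!type.IsSealed) return true;

        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public
            | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        // Walk up the hierarchy so private fields of base classes are seen too.
        for (Type t = type; t != null && t != typeof(object); t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(flags))
            {
                // Even one non-readonly field makes the type mutable.
                if (!field.IsInitOnly) return true;
            }
        }
        return false;
    }
}

// Sealed with only readonly fields: reported as immutable.
public sealed class Frozen
{
    public readonly int X;
    public Frozen(int x) { X = x; }
}

// Sealed with a writable field: reported as possibly mutable.
public sealed class Thawed
{
    public int X;
}

// Not sealed: reported as possibly mutable despite its readonly field.
public class Open
{
    public readonly int X = 1;
}
```

With these definitions, MutabilitySketch.MaybeMutable returns false for typeof(Frozen) and true for both typeof(Thawed) and typeof(Open).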

Copy simply performs an efficient deep copy of an object graph. Both the mutability checks and Copy require very little code to implement, are inherently type safe, and use relatively efficient dispatching code for little overhead. All the unsafe code is behind Reflect<T> and Type<T>.

Documentation

You can find the newly generated documentation here. The new docs are generated with Sandcastle, so they're a bit heftier than the old documentation.

I will also be writing more blog posts to describe the goodness available in v0.9.4. I've addressed a lot of C# and CLR shortcomings in type safe ways which are sure to find some good use. If you're impatient, read up on some of my past posts tagged with "sasa" for some of the highlights.
