Friday, May 31, 2013

Sasa.Atomics - Simpler, More Scalable Atomic Operations

This is the nineteenth post in my ongoing series covering the abstractions in Sasa. Previous posts:

The System.Threading namespace contains a number of useful abstractions for concurrent programming. The Interlocked static class in particular ought to be in every programmer's toolbox. However, the CompareExchange API doesn't always lend itself to the cleanest algorithms. Furthermore, under medium to high contention, even atomic compare-and-swap operations don't scale well.

Enter Sasa.Atomics, which provides a simpler API for atomic operations and implements the lightweight contention management scheme from the paper described above. Constant backoff contention management requires no state, and incurs virtually no overhead in low contention scenarios, and it scales quite well under medium to high contention.

Sasa.Atomics.Set

Sasa.Atomics.Set is a set of extension methods which perform a compare-and-exchange operation, and return true if the operation succeeds:

string o = "hello";
...
if (Atomics.Set(ref o, o + " world!", o))
    Console.WriteLine(o);
// output:
// hello world!

Sasa.Atomics.SetFailed

Sasa.Atomics.Set is a set of extension methods which perform a compare-and-exchange operation, and return true if the operation fails:

string o = "hello";
...
while (Atomics.SetFailed(ref o, o + " world!", o)) { }
// output:
// hello world!

Tuesday, May 28, 2013

Sasa.Either - Simple Sums for .NET

This is the eighteenth post in my ongoing series covering the abstractions in Sasa. Previous posts:

Tuples are a pretty common necessity while programming, and every programmer has run into the need to return multiple values from a method. The algebraic description of a tuple is what's known as a "product", and the logical analogue is the "conjunction", ie. a product/tuple is type T0 AND T1 AND T2 AND ...

Given any abstraction, it's dual will often also be necessary. Imagine programming only with the logical AND operator without OR! Analogously, we also need the algebraic analogue of products which are known as "sums", ie. type T0 OR T1 OR T2 OR ...:

Either<int, string> foo = "hello world!";
Console.WriteLine(foo);
// output:
// hello world!

The need for this may not be so obvious to CLR programmers because inheritance provides a kind of sum. For instance, the above code snippet could be modelled with an explicit class hierarchy:

abstract class IntOrString
{
    public static implicit operator IntOrString(int i)
    {
        return new IntCase { value = i };
    }
    public static implicit operator IntOrString(string s)
    {
        return new StringCase { value = s };
    }
}
sealed class IntCase : IntOrString
{
    int value;
    public override string ToString() { return value.ToString(); }
}
sealed class StringCase : IntOrString
{
    string value;
    public override string ToString() { return value; }
}
...
IntOrString foo = "hello world!";
Console.WriteLine(foo);
// output:
// hello world!

However, just like with tuples, when you're only dealing with a small number of possibilities you don't want to go through the hassle of creating a whole new set of types.

Alternately, you can just use the implicit supertype of all CLR types, System.Object instead of creating a class hierarchy, but in a very real sense this is TOO permissive, since it allows more than just System.Int32 and System.String:

object foo = new Action(() => throw new Exception("Fooled you!"));
Console.WriteLine(foo);
// output:
// System.Action ...

Sasa.Either<T0,T1>, Sasa.Either<T0,T1,T2> and Sasa.Either<T0,T1,T2,T3> are generic sum types that provide this functionality for you, along with a few convenient operations to make programming with sums easier. For instance, ToString is overloaded to return the string representation of the underlying value, there are implicit conversions into the corresponding sum type, equality is structural over the various cases, and value extraction from sums is done via a simple and familiar deconstruction scheme:

Either<int, string> foo = "hello world!";
int icase;
if (foo.TryFirst(out icase))
{
    Console.WriteLine("int: " + icase);
    return;
}
string scase;
if (foo.TrySecond(out scase))
{
    Console.WriteLine("string: " + scase);
    return;
}
// output:
// string: hello world!

Sasa.Either.Coalesce

The ?? null coalescing operator only works on values of the same type, where sums can work on values of disparate types. Thus Either.Coalesce is a set of overloads that null coalesce the parameters and returns an appropriate sum encapsulating the first non-null value:

var sum = Either.Coalesce<string, float>(null, 3.0F);
Console.WriteLine(sum);
// output:
// 3.0

Sasa.Either.First

Either.First is a simple set of sum constructors for the first case of a sum:

var x = Either.First<int, string, decimal>(3);
Console.WriteLine(x);
// output:
// 3

Sasa.Either.Second

Either.Second is a simple set of sum constructors for the second case of a sum:

var x = Either.Second<int, string, decimal>("hello!");
Console.WriteLine(x);
// output:
// hello!

Sasa.Either.Third

Either.Third is a simple set of sum constructors for the third case of a sum:

var x = Either.Third<int, string, decimal>(99M);
Console.WriteLine(x);
// output:
// 99.0

Sasa.Either.Fourth

Either.Fourth is a simple set of sum constructors for the fourth case of a sum:

var x = Either.Fourth<int, string, float, DateTime>(DateTime.MinValue);
Console.WriteLine(x);
// output:
// DateTime.MinValue

Sasa.Either<*>.First

Sasa.Either<*>.First (2-param, 3-param, 4-param) attempts to extract the encapsulated value of type T0. An Option<T> is returned indicating success or failure, which means you can use all the usual coalescing operators on options with the results:

Either<int, string> foo = "hello world!";
Console.WriteLine(foo.First || 123);
// output:
// 123

Sasa.Either<*>.Second

Sasa.Either<*>.Second (2-param, 3-param, 4-param) attempts to extract the encapsulated value of type T1. An Option<T> is returned indicating success or failure, which means you can use all the usual coalescing operators on options with the results:

Either<int, string> foo = "hello world!";
Console.WriteLine(foo.Second || "Impossible!");
// output:
// hello world!

Sasa.Either<*>.Third

Sasa.Either<*>.Third (3-param, 4-param) attempts to extract the encapsulated value of type T2. An Option<T> is returned indicating success or failure, which means you can use all the usual coalescing operators on options with the results:

Either<int, string, decimal> foo = "hello world!";
Console.WriteLine(foo.Third || 99M);
// output:
// 99.0

Sasa.Either<T0, T1, T2, T3>.Fourth

Sasa.Either<T0, T1, T2, T3>.Fourth attempts to extract the encapsulated value of type T3. An Option<T> is returned indicating success or failure, which means you can use all the usual coalescing operators on options with the results:

Either<int, string, decimal, DateTime> foo = "hello world!";
Console.WriteLine(foo.Third || DateTime.MinValue);
// output:
// DateTime.MinValue

Sasa.Either<*>.IsFirst

Sasa.Either<*>.IsFirst (2-param, 3-param, 4-param) checks whether the sum is the first case of type T0:

Either<int, string, decimal, DateTime> x = 3;
Console.WriteLine(x.IsFirst);
Console.WriteLine(x.IsSecond);
// output:
// true
// false

Sasa.Either<*>.IsSecond

Sasa.Either<*>.IsSecond (2-param, 3-param, 4-param) checks whether the sum is the first case of type T1:

Either<int, string, decimal, DateTime> x = "hello!";
Console.WriteLine(x.IsFirst);
Console.WriteLine(x.IsSecond);
// output:
// false
// true

Sasa.Either<*>.IsThird

Sasa.Either<*>.IsThird (3-param, 4-param) checks whether the sum is the first case of type T2:

Either<int, string, decimal, DateTime> x = 99M;
Console.WriteLine(x.IsFirst);
Console.WriteLine(x.IsThird);
// output:
// false
// true

Sasa.Either<T0, T1, T2, T3>.IsFourth

Sasa.Either<T0,T1,T2,T3>.IsFourth checks whether the sum is the first case of type T3:

Either<int, string, decimal, DateTime> x = DateTime.Now;
Console.WriteLine(x.IsFirst);
Console.WriteLine(x.IsFourth);
// output:
// false
// true

Sasa.Either<*>.TryFirst

Sasa.Either<*>.TryFirst (2-param, 3-param, 4-param) method extracts the encapsulated value if the sum is of type T0:

Either<int, string> foo = "hello world!";
int icase;
if (foo.TryFirst(out icase))
{
    Console.WriteLine("int: " + icase);
    return;
}
string scase;
if (foo.TrySecond(out scase))
{
    Console.WriteLine("string: " + scase);
    return;
}
// output:
// string: hello world!

Sasa.Either<*>.TrySecond

Sasa.Either<*>.TrySecond (2-param, 3-param, 4-param) method extracts the encapsulated value if the sum is of type T1:

Either<int, string> foo = "hello world!";
int icase;
if (foo.TryFirst(out icase))
{
    Console.WriteLine("int: " + icase);
    return;
}
string scase;
if (foo.TrySecond(out scase))
{
    Console.WriteLine("string: " + scase);
    return;
}
// output:
// string: hello world!

Sasa.Either<*>.TryThird

Sasa.Either<*>.TrySecond (3-param, 4-param) method extracts the encapsulated value if the sum is of type T2:

Either<int, string, decimal> foo = "hello world!";
int icase;
if (foo.TryFirst(out icase))
{
    Console.WriteLine("int: " + icase);
    return;
}
string scase;
if (foo.TrySecond(out scase))
{
    Console.WriteLine("string: " + scase);
    return;
}
string dcase;
if (foo.TryThird(out dcase))
{
    Console.WriteLine("decimal: " + dcase);
    return;
}
// output:
// string: hello world!

Sasa.Either<T0,T1,T2,T3>.TryFourth

Sasa.Either<*>.TryFourth method extracts the encapsulated value if the sum is of type T3:

Either<int, string, decimal, DateTime> foo = "hello world!";
int icase;
if (foo.TryFirst(out icase))
{
    Console.WriteLine("int: " + icase);
    return;
}
string scase;
if (foo.TrySecond(out scase))
{
    Console.WriteLine("string: " + scase);
    return;
}
string dcase;
if (foo.TryThird(out dcase))
{
    Console.WriteLine("decimal: " + dcase);
    return;
}
string tcase;
if (foo.TryFourth(out tcase))
{
    Console.WriteLine("DateTime: " + tcase);
    return;
}
// output:
// string: hello world!

As you can see from the above code samples, there are also implicit conversions for easy construction of sums given a value. Furthermore, equality is overridden and defined on the encapsulated values.

Sunday, May 5, 2013

Sasa.Operators<T> Overhaul - Now With More Generic Operator Goodness

Sasa.Operators<T> was covered in a previous post, and was useful in its own right, but was still somewhat limited in the operators it could expose. Since it abstracted only over a single type parameter T, it exposed those operators defined only on T. For instance, addition has signature "T add(T, T)", negation is "T negate(T)", and so on.

But not all operators are defined on only a single type. For instance, System.Decimal provides addition operators that work on integers, both signed and unsigned. Sasa.Operators<T> couldn't handle that. It can now.

Sasa.Operators is now a fully generic operator framework, exposing static generic class types Operators<T> as before, Operators<T0, T1> which exposes operators defined over two type parameters, like equals whose signature is "bool equals(T0, T1)", and finally, Operators<T0,T1,T2> for operators defined over three possible types, like addition "T2 add(T0, T1)". Furthermore, all overloadable operators are now accessible, including True/False and explicit and implicit conversions.

The delegates accessible via the above Operators classes are also now more efficient, and don't rely on LINQ expression compilation. This was all possible due to a new function I introduced under Sasa.Func called "Operator". It takes a delegate signature as a type parameter, and a Sasa.Operator value designating the operator type, and it searches for that static operator method that matches the delegate signature needed. Then it creates a direct delegate to that method of the given signature, so invocation via Operators is simply a direct virtual dispatch into a static method.

The only exception is for primitive types, like Int32 and Int64, because they don't provide static method operators. When the arguments are all primitives and no operator method is available, a small stub function is dynamically generated that implements the operation in efficient bytecode. Can't get much faster than this.

All of this was precipitated by my implementation of a LINQ expression interpreter in the Sasa.Linq assembly. You can now easily evaluate almost any LINQ expression with a single call:

var z = CLR.Eval(() => 3 * 2.0 + 1); // z = 7.0
var x = CLR.Eval(() => new Func<int, int>(z => z + 1)(3)); // x=4
...

This is in alpha status obviously, but it passes a few tests, and more will come. Obviously LINQ expressions already have compilation built-in, but sometimes dynamic compilation either isn't available, or is too costly to perform. For instance, consider the case of compiling a LINQ to SQL query. You don't want to dynamically generate code every time you want to simplify a LINQ expression. An interpreter is the right choice here due to its much smaller overhead.

Saturday, May 4, 2013

Circumventing C#'s Type Constraint Limitations

It's well known that C# doesn't permit certain types to appear as constraints, like System.Enum or System.Delegate, and that C# further prevents sealed classes from appearing as constraints. However, the example in the above article serves to point to an interesting circumvention of C#'s limitations on constraints.

Suppose we wish to provide a statically typed enum interface, similar to what I provide in my Sasa class library, but without having to resort to IL rewriting as I do. We can implement this by exploiting the fact that type constraints are inherited. Consider the following general base class:

public abstract class Constrained<TArg0, TReturn>
{
    public abstract T1 Apply<T0, T1>(T0 arg0)
        where T0 : TArg0
        where T1 : TReturn;
}

Notice how all the type parameters at the class level are fully abstract, so the C# compiler can't complain about using Enum or Delegate at this level. The class then constrains the type parameters at the abstract method level based on the class-level type constraints. An statically type-safe enum parser is then simply:

public class EnumParser : Constrained<string, Enum>
{
    public override T1 Apply<T0, T1>(T0 arg0)
    {
        return (T1)Enum.Parse(typeof(T1), arg0);
    }
}
...
var uint16 = new EnumParser().Apply<TypeCode, string>("UInt16");

Despite the fact that the method type parameters T0 and T1 look fully abstract, they inherit the constraints of T0 : string and T1 : Enum from the base class, and so the C# compiler knows they have the correct types. Of course, Enum.Parse returns System.Object, so we have to cast to return the correct enum type.

Since EnumParser has no state, you can cache it in a static field:

public static class Enums
{
  public static readonly EnumParser Parse = new EnumParser();
}
...
var uint16 = Enums.Parse.Do<TypeCode, string>("UInt16");

You can also define more specialized base classes other than Constrained to eliminate the need for two type arguments on Apply:

public abstract class ParseConstrained<TReturn>
{
    public abstract T Apply<T>(string arg0)
        where T : TReturn;
}
public class EnumParser : ParseConstrained<Enum>
{
    public override T Apply<T>(string arg0)
    {
        return (T)Enum.Parse(typeof(T), arg0);
    }
}
...
var uint16 = Enums.Parse.Do<TypeCode>("UInt16");

I'm currently exploring whether it's possible to define a truly reusable set of base classes like Constrained. There are two general problems here:

  1. A general set of base classes won't have type constraints between method parameters, ie. like T0 : T1, which are sometimes needed.
  2. Covariant type safety really requires that the constraint on the return type be TReturn : T1, not the contravariant constraint I specified in Constrained. You could do this with type constraints extension methods over Constrained, but you have to specify a lot of type parameters.

Ultimately, covariance/contravariance will probably partition the set of constrained function abstractions. For instance, a set of reusable function abstractions with type constraints would look something like this:

// covariant function types
public abstract class Covariant<TReturn>
{
    public abstract TReturn Apply();
}
public abstract class Covariant<TArg0, TReturn>
{
    public abstract TReturn Apply<T0>(T0 arg0)
        where T0 : TArg0;
}
public abstract class Covariant<TArg0, TArg1, TReturn>
{
    public abstract TReturn Apply<T0, T1>(T0 arg0, T1 arg1)
        where T0 : TArg0
        where T1 : TArg1;
}

// contravariant function types
public abstract class Contravariant<TReturn>
{
    public abstract T Apply<T>()
        where T : TReturn;
}
public abstract class Contravariant<TArg0, TReturn>
{
    public abstract T1 Apply<T0, T1>(T0 arg0)
        where T0 : TArg0
        where T1 :TReturn;
}
public abstract class Contravariant<TArg0, TArg1, TReturn>
{
    public abstract T2 Apply<T0, T1, T2>(T0 arg0, T1 arg1)
        where T0 : TArg0
        where T1 : TArg1
        where T2 : TReturn;
}

But even this isn't quite enough to properly type System.Enum.GetValues. GetValues is covariant but its method type argument has class-level type parameter dependency:

public abstract class TypeIndexedCovariant<TIndex, TReturn>
{
    public abstract TReturn Apply<T0>()
        where T0 : TIndex
        where TReturn : IEnumerable<T0>;
}

Of course, this isn't legal C# because you can't specify a class-type constraint on a method type parameter. I'm afraid these scenarios will require custom base class implementations for the near future. So we can implement a special base class to type System.Enum.GetValues like so:

public abstract class IndexedCovariant<TIndex>
{
    public abstract IEnumerable<T0> Apply<T0>()
        where T0 : TIndex;
}
class EnumValues : IndexedCovariant<Enum>
{
    public override IEnumerable<T0> Apply<T0>()
    {
        return Enum.GetValues(typeof(T0)) as T0[];
    }
}

In summary, a rewrite of my Sasa.Enums library in pure C# with Enum type constraints and no IL rewriting would look something like the following, assuming the above implementations for values and parsing:

public class EnumNames : IndexedCovariant<Enum, IEnumerable<string>>
{
    public override IEnumerable<string> Apply<T0>()
    {
        return Enum.GetNames(typeof(T0));
    }
}
// type-safe enum operations
public static class Enums
{
    public static readonly EnumParser Parse;
    public static readonly EnumValues Values;
    public static readonly EnumNames Names;
}
...
var names = Enums.Names.Apply<TypeCode>();
var values = Enums.Values.Apply<TypeCode>();
TypeCode parsed = Enums.Parse.Apply<TypeCode>("UInt16");

Not quite as pretty as the current approach in Sasa, but I could see a language on top of the CLR exploiting these sorts of base classes to support first-class functions with first-class polymorphism, ie. where the function argument constraints are just System.Object.

Edit: it's come to my attention that my explanations sometimes suck, so here's a TLDR version I posted to reddit:

The post was about exploring some options to enforce type constraints that C# normally forbids you from specifying, like constraints on Enum and Delegate in a way the C# compiler accepts and checks for you. This is done by encoding functions as objects and lifting the type constraints to class level type parameters, and enforcing method parameters to be subtypes of those parameters. Subclassing such an abstract type lets you specify Enum and Delegate constraints at the class level, and the method parameter constraints are inherited from the base class. Later in the post was a brief discussion of a few reusable abstractions for covariant and contravariant "function objects" of this form.
Hopefully that clarifies a little!