tag:blogger.com,1999:blog-27440728654915167202024-03-19T06:03:45.449-04:00Higher LogicsSandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.comBlogger155125tag:blogger.com,1999:blog-2744072865491516720.post-89124373685106442112022-03-18T10:03:00.005-04:002022-03-18T10:06:38.121-04:00The Fun of Floating Point Numbers in one Image<p>Programming with floating point is always fun. Here's a nice little screen capture summarizing the insanity that sometimes arises:</p><p></p><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEiQaVI-Kn6Asmj5qLcQD3jG0Xn1J8ljkrt2HUtHuloj-xSQkVrcSBvhTdQS-_LcyJE1mIrTILkl3aHbdry6goXNf_h_lyUJretnk2NsyLxTjmAEp7vbRp0O_PHcWbJ112fWNlN8H2_9M-Oiu3PdOPlm0Jmzq2l5TsgP6z4H-0OJAocFFIn1YQ4d0zj0DA" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="118" data-original-width="357" height="155" src="https://blogger.googleusercontent.com/img/a/AVvXsEiQaVI-Kn6Asmj5qLcQD3jG0Xn1J8ljkrt2HUtHuloj-xSQkVrcSBvhTdQS-_LcyJE1mIrTILkl3aHbdry6goXNf_h_lyUJretnk2NsyLxTjmAEp7vbRp0O_PHcWbJ112fWNlN8H2_9M-Oiu3PdOPlm0Jmzq2l5TsgP6z4H-0OJAocFFIn1YQ4d0zj0DA=w468-h155" width="468" /></a></div></div><br /><div style="text-align: left;">.NET keeps 9 digits of precision internally, but typically only displays 7 digits of precision, so I had a hell of a time figuring out why a value from what's effectively a no-op was exceeding the 0.33F threshold I was looking for.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Losing equational reasoning is always fun, but this is even more bizarre than usual. 
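<br /><div style="text-align: left;">A minimal repro of the same trap (display behaviour assumes .NET Core 3.0+'s default shortest round-trippable formatting):</div><div style="text-align: left;"><pre><code class="language-csharp">float x = 0.33f;
Console.WriteLine(x);                // "0.33"
Console.WriteLine(x.ToString("G9")); // "0.330000013": the stored value exceeds 0.33
Console.WriteLine(x > 0.33);         // True: x promoted to double exceeds the double literal 0.33
</code></pre></div>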
Yay floating point!</div></div></div><p></p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com2tag:blogger.com,1999:blog-2744072865491516720.post-57418185424460041032021-02-16T10:24:00.004-05:002021-08-10T15:52:19.570-04:00Why we are likely NOT living in a simulationI typically keep this blog about computer science, but I also dabble in a bit of philosophy. I was initially struck by <a href="https://www.simulation-argument.com/">Bostrom's simulation argument</a> when I first read it years ago. Over the years, I've cycled a few times between disputing what I believed were some of its assumptions, and cautious acceptance after realizing my mistake.<div><br /></div><div>The simulation argument in its simplest form is that one of the following <i>must</i> be true:</div><div><ol style="text-align: left;"><li>simulations of humans <i>can't</i> be built, or</li><li>simulations of humans <i>won't</i> be built, or</li><li>we are almost certainly living in a simulation</li></ol>I think this argument is absolutely valid, so one of those outcomes is true.</div><div><br /></div><div>Claiming #3 is most likely is what's known as the <i>simulation hypothesis</i>, and has such proponents as Elon Musk. <a href="https://backreaction.blogspot.com/2021/02/the-simulation-hypothesis-is.html">Sabine Hossenfelder</a> recently argued against the simulation hypothesis by basically asserting that #1 above is plausible, but I actually think #2 is the most likely case. 
For reference, Bostrom calls future civilizations capable of running ancestor simulations "post-humans", so that's what I'll use.</div><div><br /></div><div>I don't think <a href="https://arxiv.org/pdf/1210.1847v2.pdf">the paper Sabine references</a> demonstrates anything conclusive about the plausibility of the simulation hypothesis, because a) the model studied in the paper may bear no resemblance to how such a simulation might actually operate, and b) Bostrom's argument doesn't really depend on simulating physics.</div><div><br /></div><div>I think (a) is self-explanatory, but for (b), the argument is as follows:</div><div><ol style="text-align: left;"><li>post-humans must understand how the human mind works up to and including consciousness in order to simulate a civilization of minds,</li><li>since the mind is understood, then the <i>contents</i> of a human mind are available to the simulation, including all senses, beliefs, etc.</li><li>since simulating physics is not the point of an ancestor simulation (just run a physics simulation in that case), an ancestor simulation will be about <i>simulating human minds and their aggregate behaviour in response to simulated sensory inputs</i></li></ol></div><div>Thus, a simulation merely needs to insert the relevant sensory facts into our minds, e.g. that I sensed I just typed a key, stopped and took a bite of an apple; it doesn't actually have to physically simulate every step, including the apple with a bite taken out of it.</div><div><br /></div><div>Also, our senses needn't even be consistent with the senses of other minds in the simulation. Eyewitness testimony is unreliable, for example, which demonstrates that we have a high tolerance for assertions of inconsistent observations in every domain except science.</div><div><br /></div><div>This covers macroscopic phenomena we can directly sense, but what about actual physics experiments that test physics beyond our sensory capabilities?
Since these experiments are all mediated by our senses as well, and since the simulation has access to the contents of our minds, including what physics properties we are currently testing, the simulation really just needs to present data that conforms to the probability distribution for the phenomena being tested.</div><div><br /></div><div>Of course this may sound increasingly implausible, but it is still hypothetically possible for a post-human civilization, so case #1 can't be asserted without some monstrous loopholes. However, if we take a step back and ask ourselves <i>why</i> we simulate things, it seems plausible that post-humans simply won't have much interest in ancestor simulations.</div><div><br /></div><div>In general, we run simulations when <i>we don't have a sufficiently precise or accurate understanding of certain problems</i>. For instance, a closed-form mathematical description of P negates any reason to simulate P.</div><div><br /></div><div>Asserting that post-humans would run ancestor simulations is to simultaneously assert that <i>post-humans won't have a satisfactory understanding of whatever problems a full ancestor simulation would answer</i>. That by itself already seems less plausible, because a civilization capable of simulating a human mind with perfect fidelity would likely understand human behaviour.</div><div><br /></div><div>However, it's still a possibility, since aggregate behaviour doesn't always tractably follow from microscopic behaviour, e.g. we have a pretty good grasp of how individual molecules behave, but predicting the weather is still intractable because of a) the sheer number of interacting molecules, and b) the equations governing individual particle motion become chaotic when combined to describe systems of multiple particles.
However, chaotic systems still have some regular macroscopic behaviour which we study under "statistical mechanics".</div><div><br /></div><div>Stochastic calculations obviate the need for direct simulation in most cases, so even if human behaviour is similarly chaotic in aggregate, there is likely a statistical mechanics for human behaviour that would similarly reduce or eliminate the need for direct simulation of conscious minds.</div><div><br /></div><div>Thus, it's not obvious that post-humans would need much from ancestor simulations. I therefore find it fairly unlikely that post-humans would be interested in running such simulations considering the costly resources they would require, and thus, we likely are not living in a simulation.</div>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-56512912325253945552020-05-29T11:19:00.013-04:002020-05-31T10:51:49.502-04:00Dual/Codual numbers for Forward/Reverse Automatic DifferentiationIn my <a href="https://higherlogics.blogspot.com/2020/05/easy-automatic-differentiation-in-c.html">last</a> two <a href="https://higherlogics.blogspot.com/2020/05/easy-reverse-mode-automatic.html">posts</a> on automatic differentiation (AD), I described some basic primitives that implement the standard approach to forward mode AD using dual numbers, and then a <a href="https://en.wikipedia.org/wiki/Duality_(mathematics)">dual representation</a> of dual numbers that can compute in reverse mode. I'm calling these "co-dual" numbers, as they are the categorical dual of dual numbers.<div><br /></div><div>It didn't click at the time that this reverse mode representation seems to be novel. If it's not, please let me know!
I haven't seen any equivalent of dual numbers capable of computing in reverse mode.</div><div><br /></div><div>When reverse mode AD is needed, most introductions to AD go straight to building a graph/DAG representation of the computation in order to improve the sharing properties and run the computation backwards, but that isn't strictly necessary. I aim to show that there's a middle ground between dual numbers and the graph approach, even if it's only suitable for pedagogical purposes.</div><h1>Review: Dual Numbers</h1><div>Dual numbers augment standard numbers with an extra term for the derivative of that number:</div><div><br /></div><div><div style="text-align: left;"><pre style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><code style="font-family: monospace, monospace; font-size: 1em;">real number x =(dual number transform)=> x + x'*ϵ</code></pre></div><div><br /></div><div>Then various <a href="https://en.wikipedia.org/wiki/Automatic_differentiation#Automatic_differentiation_using_dual_numbers">identities</a> over addition, multiplication and trig functions propagate the derivative throughout the function to the return value. Here is a simple type for dual numbers to illustrate:</div>
<pre><code class="language-csharp">public readonly struct Dual
{
    public readonly double Magnitude;
    public readonly double Derivative;
    public Dual(double magnitude, double derivative)
    {
        this.Magnitude = magnitude;
        this.Derivative = derivative;
    }
    public static Dual operator +(Dual lhs, Dual rhs) =>
        new Dual(lhs.Magnitude + rhs.Magnitude,
                 lhs.Derivative + rhs.Derivative);
    public static Dual operator *(Dual lhs, Dual rhs) =>
        new Dual(lhs.Magnitude * rhs.Magnitude,
                 lhs.Derivative * rhs.Magnitude + rhs.Derivative * lhs.Magnitude);
    public Dual Pow(int k) =>
        new Dual(Math.Pow(Magnitude, k),
                 k * Math.Pow(Magnitude, k - 1) * Derivative);
}</code></pre></div><div>The standard calculation operates on the <code>Magnitude</code> field, and the <code>Derivative</code> field computes the derivative right alongside via the standard <a href="https://en.wikipedia.org/wiki/Automatic_differentiation#Automatic_differentiation_using_dual_numbers">identities</a>. This permits you to somewhat efficiently differentiate any function that accepts only values of type <code>Dual</code>, which is called forward mode AD:</div><div><pre><code class="language-csharp">public static class Calculus
{
public static Dual DerivativeX0At(
double x0, double x1, Func<Dual, Dual, Dual> f) =>
f(new Dual(x0, 1), new Dual(x1, 0));
public static Dual DerivativeX1At(
double x0, double x1, Func<Dual, Dual, Dual> f) =>
f(new Dual(x0, 0), new Dual(x1, 1));
}</code></pre></div><div>When differentiating with respect to a parameter, we call the differentiable function with a 1 in that parameter's derivative position. This means you can only take the derivative with respect to a single parameter at a time in forward mode. If you need to differentiate with respect to multiple input parameters, you need to run the whole function repeatedly, once for each derivative (or, as I covered in <a href="https://higherlogics.blogspot.com/2020/05/easy-automatic-differentiation-in-c.html">my first post</a>, use a vector of derivatives, which has high space complexity).</div><div><br /></div>
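<div>For example (assuming the two-argument <code>Dual(magnitude, derivative)</code> constructor), differentiating f(x0, x1) = x0<sup>2</sup>*x1 at (3, 2):</div><div><pre><code class="language-csharp">Func<Dual, Dual, Dual> f = (x0, x1) => x0.Pow(2) * x1;
var d0 = Calculus.DerivativeX0At(3, 2, f); // d0.Derivative == 12, i.e. 2*x0*x1
var d1 = Calculus.DerivativeX1At(3, 2, f); // d1.Derivative == 9, i.e. x0^2
</code></pre></div><div>Note that extracting both derivatives took two full evaluations of f, one per parameter.</div><div><br /></div>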
<div>This is obviously inefficient if you often need the derivatives of multiple parameters, which is very common in machine learning. Fortunately, reverse mode AD is a dual of forward mode, which exhibits the dual of this property: it can efficiently compute the derivative of <b>all</b> parameters simultaneously.</div><h1>Codual Numbers</h1><div>With a basic dual number representation, we can take its categorical dual and get a type with the complete opposite properties of the original. However, the categorical dual of the <code>Derivative</code> field is not obvious. So let's modify <code>Dual</code> slightly to make the duality transformation more obvious. First, we replace the <code>Derivative</code> field as a value with its equivalent function:</div><div><pre><code class="language-csharp">public readonly struct Dual
{
    public readonly double Magnitude;
    public readonly Func<double> Derivative;
    public Dual(double magnitude, Func<double> derivative)
    {
        this.Magnitude = magnitude;
        this.Derivative = derivative;
    }
    public static Dual operator +(Dual lhs, Dual rhs) =>
        new Dual(lhs.Magnitude + rhs.Magnitude,
                 () => lhs.Derivative() + rhs.Derivative());
    public static Dual operator *(Dual lhs, Dual rhs) =>
        new Dual(lhs.Magnitude * rhs.Magnitude,
                 () => lhs.Derivative() * rhs.Magnitude + rhs.Derivative() * lhs.Magnitude);
    public Dual Pow(int k)
    {
        // lambdas in a struct can't capture "this", so copy to a local first
        var self = this;
        return new Dual(Math.Pow(Magnitude, k),
            () => k * Math.Pow(self.Magnitude, k - 1) * self.Derivative());
    }
}</code></pre></div><div>This doesn't really change the semantics of dual numbers; we essentially just added one level of indirection. But now that <code>Derivative</code> is a function, the duality is obvious: the dual of a function that accepts nothing and returns a value is a function that accepts a value and returns nothing:</div>
<div><pre><code class="language-csharp">public readonly struct Codual
{
    public readonly double Magnitude;
    internal readonly Action<double> Derivative;
    public Codual(double magnitude, Action<double> derivative)
    {
        this.Magnitude = magnitude;
        this.Derivative = derivative;
    }
    public static Codual operator +(Codual lhs, Codual rhs) =>
        new Codual(lhs.Magnitude + rhs.Magnitude, dx =>
        {
            lhs.Derivative(dx);
            rhs.Derivative(dx);
        });
    public static Codual operator *(Codual lhs, Codual rhs) =>
        new Codual(lhs.Magnitude * rhs.Magnitude, dx =>
        {
            lhs.Derivative(dx * rhs.Magnitude);
            rhs.Derivative(dx * lhs.Magnitude);
        });
    public Codual Pow(int k)
    {
        var lhs = this;
        return new Codual(Math.Pow(Magnitude, k),
            dx => lhs.Derivative(k * Math.Pow(lhs.Magnitude, k - 1) * dx));
    }
}</code></pre></div><div>The operations on the magnitude remain unchanged, but the operations on the derivative have been turned inside out. Let's consider the multiplication identity on dual numbers to see how this works:</div><div><br /></div><div><pre style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><code style="text-align: left;"><x, x'> * <y, y'> = <x*y, y'*x + y*x'></code></pre><div><br /></div><div>If we are differentiating with respect to x, then x' will have some value in forward mode, and y' will equal 0. This simplifies to:</div></div><div><br /></div><pre style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><code style="text-align: left;"><x, x'> * <y, 0> = <x*y, y*x'></code></pre><div><br /></div><div>If we are differentiating with respect to y, then y' will have some value in forward mode, and x' will equal 0. This simplifies to:</div><div><br /></div><pre style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><code style="text-align: left;"><x, 0> * <y, y'> = <x*y, y'*x></code></pre><div><br /></div><div>If we are taking the dual of these identities, then we need only propagate the dependencies: y' depends only on x, and x' depends only on y, and that's exactly what you see in the Codual multiplication operator:</div><div><pre><code class="language-csharp">public static Codual operator *(Codual x, Codual y) =>
new Codual(x.Magnitude * y.Magnitude, dz =>
{
x.Derivative(dz * y.Magnitude);
y.Derivative(dz * x.Magnitude);
});</code></pre><div>You can think of the <code>Derivative</code> field as building an execution trace which enables us to run the same computation that's happening on <code>Magnitude</code>, just backwards from outputs to inputs. Differentiating a function using the <code>Codual</code> representation works like this:</div><pre><code class="language-csharp">public static Result DerivativeAt(
double x0, double x1, Func<Codual, Codual, Codual> f)
{
double dx0 = 0, dx1 = 0;
var y = f(new Codual(x0, dy => dx0 += dy),
new Codual(x1, dy => dx1 += dy));
y.Derivative(1);
return new Result(y.Magnitude, dx0, dx1);
}</code></pre></div>
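<div>For example, differentiating f(x0, x1) = x0<sup>2</sup>*x1 at (3, 2) (assuming <code>Result</code> just bundles the magnitude with both derivatives), a single call returns every derivative at once:</div><div><pre><code class="language-csharp">var r = DerivativeAt(3, 2, (x0, x1) => x0.Pow(2) * x1);
// r.Magnitude == 18; derivatives: dx0 == 12 (2*x0*x1), dx1 == 9 (x0^2)
</code></pre></div><div><br /></div>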
<div>Similar to the dual number representation, we pass in x0 and x1 for the ordinary value calculation. Similar to the operators on <code>Codual</code>, we pass in functions for the derivative, but these are slightly different because they are the "leaves" of the execution trace, which have to set the derivatives of the function's parameters. You can see that they operate as accumulators, summing the contributions to the derivative from every path through the executing function. If you only use a parameter once in your function, this leaf function is only executed once. If you use it N times, it executes N times.</div>
I wouldn't recommend using these abstractions for any sophisticated differentiation purposes, but they'll probably work fine for small applications and learning how AD works.</div><div><br /></div><div>Edit: some <a href="https://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming.pdf">earlier work</a> noted that CPS style can be used for reverse mode AD, but likely for performance reasons, they went straight to using delimited continuations rather than implementing a co-dual number form as described here. So Codual is kinda-sorta novel, but probably not too surprising to a lot of people in this field.</div><div><br /></div><div>Edit 2: Jules Jacobs noted some exponential blow-up with some naive programs using Codual numbers, like loop(N) { x = x + x }. Dual/Codual numbers will probably never be as efficient as more sophisticated approaches to AD, but some <a href="https://github.com/naasking/AutoDiffSharp/commit/a4c0ed00acc26876b7a47c463071fd68f764f91e">simple optimizations</a> can handle the low-hanging fruit to avoid exponential blow-up of some naive expressions.</div>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com6tag:blogger.com,1999:blog-2744072865491516720.post-78379216627783673602020-05-25T13:28:00.002-04:002020-05-26T16:15:20.043-04:00Easy Reverse Mode Automatic Differentiation in C#<p>Continuing from my <a href="https://higherlogics.blogspot.com/2020/05/easy-automatic-differentiation-in-c.html">last post on implementing forward-mode automatic differentiation (AD) using C# operator overloading</a>, this is just a quick follow-up showing how easy reverse mode is to achieve, and why it's important.</p>
<h2>Why Reverse Mode Automatic Differentiation?</h2>
<p>As explained in the last post, the vector representation of forward-mode AD can compute the derivatives of all parameters simultaneously, but it does so at considerable space cost: each operation creates a vector computing the derivative of each parameter. So N parameters with M operations would allocate O(N*M) space. It turns out, this is unnecessary!</p>
<p>Reverse mode AD allocates only O(N+M) space to compute the derivatives of N parameters across M operations. In general, forward mode AD is best suited to differentiating functions of type:</p>
<pre><strong>R</strong> → <strong>R</strong><sup>N</sup></pre>
<p>That is, functions of 1 parameter that compute multiple outputs. Reverse mode AD is suited to the dual scenario:</p>
<pre><strong>R</strong><sup>N</sup> → <strong>R</strong></pre>
<p>That is, functions of many parameters that return a single real number. A lot of problems are better suited to reverse mode AD, and some modern machine learning frameworks now employ reverse mode AD internally (thousands of parameters, single output that's compared to a goal).</p>
<h2>How does Reverse Mode Work?</h2>
<p>The identities I described in the other article still apply since they're simply the <a href="https://en.wikipedia.org/wiki/Chain_rule">chain rule</a>, but reverse mode computes derivatives <em>backwards</em>. Forward-mode AD is easy to implement using dual numbers in which the evaluation order matches C#'s normal evaluation order: just compute a second number corresponding to the derivative along side the normal computation. Since reverse mode runs backwards, we have to do the computational dual: build a (restricted) continuation!</p>
<p>This is a rough sketch showcasing both forward mode and reverse mode and how they're duals. Forward mode AD using dual numbers will look something like this:</p>
<pre class="language-csharp"><code>public readonly struct Fwd
{
public readonly double Magnitude;
public readonly double Derivative;
public Fwd(double mag, double deriv)
{
this.Magnitude = mag;
this.Derivative = deriv;
}
public Fwd Pow(int k) =>
new Fwd(Math.Pow(Magnitude, k), k * Math.Pow(Magnitude, k - 1) * Derivative);
public static Fwd operator +(Fwd lhs, Fwd rhs) =>
new Fwd(lhs.Magnitude + rhs.Magnitude, lhs.Derivative + rhs.Derivative);
public static Fwd operator *(Fwd lhs, Fwd rhs) =>
        new Fwd(lhs.Magnitude * rhs.Magnitude,
lhs.Derivative * rhs.Magnitude + rhs.Derivative * lhs.Magnitude);
public static Func<double, Fwd> Differentiate(Func<Fwd, Fwd> f) =>
x => f(new Fwd(x, 1));
public static Func<double, double, Fwd> DifferentiateX0(Func<Fwd, Fwd, Fwd> f) =>
(x0, x1) => f(new Fwd(x0, 1), new Fwd(x1, 0));
public static Func<double, double, Fwd> DifferentiateX1(Func<Fwd, Fwd, Fwd> f) =>
(x0, x1) => f(new Fwd(x0, 0), new Fwd(x1, 1));
}
</code></pre>
<p>Translating this into reverse mode entails replacing <code>Fwd.Derivative</code> with a continuation like so:</p>
<pre class="language-csharp"><code>
public readonly struct Rev
{
public readonly double Magnitude;
readonly Action<double> Derivative;
public Rev(double y, Action<double> dy)
{
this.Magnitude = y;
this.Derivative = dy;
}
public Rev Pow(int e)
{
var x = Magnitude;
var k = Derivative;
return new Rev(Math.Pow(Magnitude, e),
dx => k(e * Math.Pow(x, e - 1) * dx));
}
public static Rev operator +(Rev lhs, Rev rhs) =>
new Rev(lhs.Magnitude + rhs.Magnitude, dx =>
{
lhs.Derivative(dx);
rhs.Derivative(dx);
});
public static Rev operator *(Rev lhs, Rev rhs) =>
new Rev(lhs.Magnitude * rhs.Magnitude,
dx =>
{
lhs.Derivative(dx * rhs.Magnitude);
rhs.Derivative(dx * lhs.Magnitude);
});
public static Func<double, (double, double)> Differentiate(Func<Rev, Rev> f) =>
x =>
{
double dx = 0;
var y = f(new Rev(x, dy => dx += dy));
y.Derivative(1);
return (y.Magnitude, dx);
};
public static Func<double, double, (double, double, double)> Differentiate(Func<Rev, Rev, Rev> f) =>
(x0, x1) =>
{
double dx0 = 0, dx1 = 0;
var y = f(new Rev(x0, dy => dx0 += dy),
new Rev(x1, dy => dx1 += dy));
y.Derivative(1);
return (y.Magnitude, dx0, dx1);
};
public static Func<double, double, double, (double, double, double, double)> Differentiate(Func<Rev, Rev, Rev, Rev> f) =>
(x0, x1, x2) =>
{
double dx0 = 0, dx1 = 0, dx2 = 0;
var y = f(new Rev(x0, dy => dx0 += dy),
new Rev(x1, dy => dx1 += dy),
new Rev(x2, dy => dx2 += dy));
y.Derivative(1);
return (y.Magnitude, dx0, dx1, dx2);
};
}
</code></pre>
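<p>A quick sanity check of the two-parameter overload, differentiating f(x0, x1) = x0*x1 + x0<sup>3</sup> at (2, 5):</p>
<pre class="language-csharp"><code>var df = Rev.Differentiate((x0, x1) => x0 * x1 + x0.Pow(3));
var (y, dx0, dx1) = df(2, 5);
// y == 18, dx0 == x1 + 3*x0^2 == 17, dx1 == x0 == 2
</code></pre>
<p>Note how x0's accumulator fires once per use of x0, summing the contribution from each path through the function.</p>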
<p>As I mentioned in my last post, my goal here isn't the most efficient implementation for reverse mode AD, but to distill its essence to make it direct and understandable. This representation builds a whole new continuation on every invocation of the function being differentiated. More efficient representations would only compute this continuation once for any number of invocations, and there are plenty of other optimizations that can be applied to both forward and reverse mode representations.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com18tag:blogger.com,1999:blog-2744072865491516720.post-87841854347114573272020-05-19T08:23:00.001-04:002020-05-28T09:24:42.709-04:00Easy Automatic Differentiation in C#<p>I've recently been researching <a href="https://julesjacobs.github.io/2014/10/23/newton-the-ultimate-2.html">optimization and automatic differentiation (AD)</a>, and decided to take a crack at distilling its essence in C#.</p>
<p>Note that <a href="https://en.wikipedia.org/wiki/Automatic_differentiation">automatic differentiation (AD)</a> is different than <a href="https://en.wikipedia.org/wiki/Numerical_differentiation">numerical differentiation</a>. Math.NET already provides excellent support for <a href="https://numerics.mathdotnet.com/api/MathNet.Numerics.Differentiation/NumericalDerivative.htm">numerical differentiation</a>.</p>
<p>C# doesn't seem to have <a href="http://www.autodiff.org/?module=Tools&language=C%23">many options</a> for automatic differentiation, consisting mainly of an F# library with an interop layer, or <a href="https://www.extremeoptimization.com/">paid libraries</a>. Neither of these are suitable for learning how AD works.</p>
<p>So <a href="https://github.com/naasking/AutoDiffSharp/tree/d5fd521cf784feab7e7209dd078abe9a7ff2f4be">here's a simple C# implementation of AD</a> that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique.</p>
<h1>What is Automatic Differentiation?</h1>
<p>Simply put, automatic differentiation is a technique for calculating the derivative of a numerical function with roughly constant time and space overhead relative to computing the function normally.</p>
<p>AD is typically preferred over numerical differentiation as the latter introduces imprecision due to the approximations employed and the limits of floating point precision inherent to our number types. AD is also often preferred to symbolic differentiation, as the latter can sometimes yield large and awkward terms which results in inefficient execution times, without any appreciable increase in precision.</p>
<p>Differentiation is critical to all sorts of optimization problems in many domains, so this is quite a useful technique!</p>
<h1>Automatic Differentiation Basics</h1>
<p>The most straightforward implementation of AD is based on <a href="https://en.wikipedia.org/wiki/Automatic_differentiation#Automatic_differentiation_using_dual_numbers" rel="nofollow">dual numbers</a>, where each regular number is augmented with an extra term corresponding to its derivative:</p>
<pre><code>real number x =(dual number transform)=> x + x'*ϵ
</code></pre>
<p>Arithmetic and other mathematical functions then have translations to operating on these extended number types as follows:</p>
<table border="1">
<thead>
<tr>
<th>Operator</th>
<th>Translated</th>
</tr>
</thead>
<tbody>
<tr>
<td><x, x'> + <y, y'></td>
<td><x + y, x' + y'></td>
</tr>
<tr>
<td><x, x'> - <y, y'></td>
<td><x - y, x' - y'></td>
</tr>
<tr>
<td><x, x'> * <y, y'></td>
<td><x*y, y'*x + y*x'></td>
</tr>
<tr>
<td><x, x'> / <y, y'></td>
<td><x / y, (x'*y - x*y')/y<sup>2</sup>></td>
</tr>
<tr>
<td><x, x'><sup>k</sup></td>
<td><x<sup>k</sup>, k*x<sup>(k-1)</sup>*x'></td>
</tr>
</tbody>
</table>
<p>See the wikipedia page for more transformations, like standard trig functions.</p>
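<p>For example, applying the multiplication rule to differentiate x*y with respect to x at (x, y) = (3, 2), i.e. x' = 1 and y' = 0:</p>
<pre><code><3, 1> * <2, 0> = <3*2, 0*3 + 2*1> = <6, 2>
</code></pre>
<p>which matches d(x*y)/dx = y = 2.</p>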
<h1>Dual Numbers from Operator Overloading</h1>
<p>Invoking a function "f" with dual numbers operates like this, in math notation:</p>
<blockquote>
<p>f(x0 + ϵ<sub>x0</sub>, x1 + ϵ<sub>x1</sub>, x2 + ϵ<sub>x2</sub>)</p>
</blockquote>
<p>Each parameter carries an extra 'ϵ' value corresponding to the derivative, and this extra value is distinct from the 'ϵ' values of all other parameters. However, as you can see in the translation table, these derivatives interact with one another in some operators, so a general number type carries a vector for the coefficients of all other parameters. Here's the basic number type:</p>
<pre class="language-csharp"><code>public readonly struct Number
{
public readonly double Magnitude;
public readonly double[] Derivatives;
internal Number(double m, params double[] d)
{
this.Magnitude = m;
this.Derivatives = d;
}
}
</code></pre>
<p>A differentiable function of 3 parameters would have this signature:</p>
<pre class="language-csharp"><code>Number Function(Number x0, Number x1, Number x2)
</code></pre>
<p>Internally, differentiation invokes the function like this:</p>
<pre class="language-csharp"><code>public static Number DifferentiateAt(
double x0, double x1, double x2, Func<Number, Number, Number, Number> func) =>
func(new Number(x0, 1, 0, 0),
new Number(x1, 0, 1, 0),
new Number(x2, 0, 0, 1));
</code></pre>
<p>Each parameter is initialized with zeroes everywhere except for a 1 in the derivative slot corresponding to its position in the argument list. This is the necessary starting configuration for automatic differentiation in order to compute the derivatives for any of the parameters.</p>
<p>Operators are now pretty straightforward, corresponding to operations on the magnitude and pairwise operations on each index of the array:</p>
<pre class="language-csharp"><code>public static Number operator +(Number lhs, Number rhs) =>
new Number(lhs.Magnitude + rhs.Magnitude,
lhs.Derivatives + rhs.Derivatives);
public static Number operator *(Number lhs, Number rhs) =>
new Number(lhs.Magnitude * rhs.Magnitude,
lhs.Derivatives * rhs.Magnitude + rhs.Derivatives * lhs.Magnitude);
</code></pre>
<p>Obviously you can't add or multiply two <code>double[]</code> like I've shown here,
but the actual implementation hides the array behind a <code>Derivatives</code> type
that exposes the arithmetic operators:</p>
<pre class="language-csharp"><code>public static Derivatives operator +(Derivatives lhs, Derivatives rhs)
{
var v = new double[lhs.Count];
for (int i = 0; i < lhs.vector.Length; ++i)
v[i] = lhs.vector[i] + rhs.vector[i];
return new Derivatives(v);
}
public static Derivatives operator *(Derivatives lhs, double rhs)
{
    var v = new double[lhs.Count];
    for (int i = 0; i < lhs.vector.Length; ++i)
        v[i] = lhs.vector[i] * rhs;
    return new Derivatives(v);
}
</code></pre>
<p>Once you have your return value of type <code>Number</code>, you can access the derivatives
of each parameter by its index:</p>
<pre class="language-csharp"><code>var y = Calculus.DifferentiateAt(x0, x1, function);
Console.WriteLine("x0' = " + y.Derivatives[0]);
Console.WriteLine("x1' = " + y.Derivatives[1]);
</code></pre>
<p>Hopefully, that's clear enough for most people to understand the magic behind automatic differentiation. It's an incredibly useful tool that I'm only now exploring, and it deserves to be more widely known!</p>
<h1>Optimizations</h1>
<p>Most presentations of automatic differentiation show examples where you differentiate a function with respect to only a single parameter, but the technique described above computes <em>every derivative simultaneously</em>. Obviously that's more general, but you typically don't need all of the derivatives which makes this technique a little wasteful.</p>
<p>So as a first optimization, review the starting configuration for automatic differentiation described above and consider what happens when you're interested in the derivative of x0 <em>only</em>:</p>
<pre class="language-csharp"><code>public static Number Differentiate_X0(
double x0, double x1, double x2,
Func<Number, Number, Number, Number> func) =>
func(new Number(x0, 1, 0, 0),
new Number(x1, 0, 0, 0),
new Number(x2, 0, 0, 0));
</code></pre>
<p>When you only want one of the derivatives, the ϵ coefficient of all other parameters would be zero, and all of those array slots filled with zeroes would stay zero throughout the whole computation. So toss them out!</p>
<p>Create a specialized <code>Number</code> type that doesn't incur any array allocations at all by replacing <code>Derivatives</code> with a single <code>System.Double</code> corresponding to the one parameter that's being differentiated. That parameter gets a 1 as the extra term when differentiating, the rest all start with 0.</p>
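<p>To make this concrete, here's a minimal sketch of such a specialized number type. The names here (<code>Dual</code> and so on) are hypothetical, not the library's actual API:</p>
<pre class="language-csharp"><code>// A dual number: one magnitude plus a single derivative coefficient,
// so no array allocations at all
public readonly struct Dual
{
public readonly double Magnitude;
public readonly double Derivative;
public Dual(double magnitude, double derivative = 0)
{
Magnitude = magnitude;
Derivative = derivative;
}
public static Dual operator +(Dual lhs, Dual rhs) =>
new Dual(lhs.Magnitude + rhs.Magnitude, lhs.Derivative + rhs.Derivative);
// product rule: d(uv) = u'v + uv'
public static Dual operator *(Dual lhs, Dual rhs) =>
new Dual(lhs.Magnitude * rhs.Magnitude,
lhs.Derivative * rhs.Magnitude + lhs.Magnitude * rhs.Derivative);
}
</code></pre>
<p>Seeding x0 with derivative 1 and every other parameter with 0 then differentiates with respect to x0: for f(x0, x1) = x0*x1 + x0 evaluated at (3, 2), the result carries magnitude 9 and derivative x1 + 1 = 3.</p>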
<p>So while you can only differentiate with respect to one variable at a time with this more specialized <code>Number</code> type, you only need to carry around an extra double for each step in the calculation. This would be very space and time efficient!</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-88829886650842694262019-09-21T14:25:00.002-04:002019-09-21T14:25:37.755-04:00Simplest Exceptions Handler Macros for C<p>More adventures in C: I revisited my exception handler implementation <a href="https://github.com/naasking/libex">libex</a>. I realized there's an even simpler version that's more efficient, if we give up the requirement that exceptions can be propagated to callers:</p>
<pre><code class="language-cpp">#define TRY
#define THROW(E) goto E
#define CATCH(E) goto FINALLY_H; E:
#define FINALLY FINALLY_H:
</code></pre>
<p>This implementation requires only a single exception handling block per function, which is probably a good idea as a general rule. Usage looks like:</p>
<pre><code class="language-cpp">static void foo(size_t sz) {
char* x = NULL; /* initialize so FINALLY can safely test it */
TRY {
x = (char*)malloc(sz);
if (x == NULL) THROW(ENoMem);
// do something with x
} CATCH(ENoMem) {
// handle out of memory
} FINALLY {
if (x != NULL) free(x);
}
}
</code></pre>
<p>Unlike libex, this version is actually compatible with break and continue, although using them runs the risk of skipping the FINALLY handler. An almost equally simple version that ensures break/continue cannot skip the FINALLY block wraps everything in a loop:</p>
<pre><code class="language-cpp">#define TRY do {
#define THROW(E) goto E
#define CATCH(E) goto FINALLY_H; E:
#define FINALLY } while(0); FINALLY_H:
</code></pre>
<p>This ensures that any break or continue statements executed within the TRY or CATCH blocks immediately exit to the FINALLY handler.</p>
<p>Of course, abort() and exit() invocations will also bypass FINALLY, so if you intend to use these macros, I suggest using them only as a way to organize your code in a more structured fashion.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-48907286892446092032019-09-20T10:47:00.001-04:002019-09-20T10:50:39.275-04:00async.h - asynchronous, stackless subroutines in C<p>The <a href="https://en.wikipedia.org/wiki/Async/await">async/await</a> idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because <a href="https://github.com/naasking/async.h">async.h</a> is a header-only library that brings async/await to C!</p>
<p>Features:</p>
<ol>
<li>It's 100% portable C.</li>
<li>It requires very little state (2 bytes).</li>
<li>It's not dependent on an OS.</li>
<li>It's a bit simpler to understand than protothreads because the async state
is caller-saved rather than callee-saved.</li>
</ol>
<pre><code class="language-cpp">#include "async.h"
struct async pt;
struct timer timer;
async example(struct async *pt) {
async_begin(pt);
while(1) {
if(initiate_io()) {
timer_start(&timer);
await(io_completed() || timer_expired(&timer));
read_data();
}
}
async_end;
}
</code></pre>
<p>This library is basically a modified version of the idioms found in the <a href="http://dunkels.com/adam/pt/index.html">Protothreads</a> library by Adam Dunkels, so it's not truly ground breaking. I've made a few tweaks that make it more understandable and generate more compact code, and I also think it more cleanly maps to the async/await semantics than it does to true threading.</p>
<p>Protothreads and async.h are both based around local continuations, but where protothreads are callee-saved, async.h is caller-saved. This eliminates the need to pass in the local continuation to any async operations except <code>async_begin</code>. This simplifies the macros that implement the async/await idiom, and even simplifies code that uses async.h.</p>
<p>Here's a simple example of fork-join style "parallelism":</p>
<pre><code class="language-cpp">#include "async.h"
typedef struct {
async_state;
struct async nested1;
struct async nested2;
} example_state;
example_state pt;
async nested(struct async *pt){
async_begin(pt);
...
async_end;
}
async example(example_state *pt) {
async_begin(pt);
// fork two nested async subroutines and wait until both complete
async_init(&pt->nested1);
async_init(&pt->nested2);
await(async_call(nested, &pt->nested1) & async_call(nested, &pt->nested2));
// fork two nested async subroutines and wait until at least one completes
async_init(&pt->nested1);
async_init(&pt->nested2);
await(async_call(nested, &pt->nested1) | async_call(nested, &pt->nested2));
async_end;
}</code></pre>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com1tag:blogger.com,1999:blog-2744072865491516720.post-57646027490634818212019-09-03T06:54:00.000-04:002019-09-16T11:36:17.738-04:00Building a Query DSL in C#<p>I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would keep only the results where Foo is "some string" or Bar is greater than 99.</p>
<p>This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a <a href="https://cloud.google.com/datastore/docs/concepts/queries">limited filtering API</a>:</p>
<ol>
<li>It does not support disjunctions, i.e. "OR" clauses.</li>
<li>It does not support filtering using inequalities on more than one property.</li>
<li>It does not support a not-equal operation.</li>
</ol>
<p>So in this post, I will describe a design that achieves the following goals:
<ol>
<li>A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR").</li>
<li>Implementations of this backend that make both the parser and the query logic amenable to unit testing.</li>
<li>A simple implementation of this backend for Google Datastore supporting the full query capabilities.</li>
</ol>
<h2>The Query Language</h2>
<p>The query language is supposed to support the following filtering operations:</p>
<ol>
<li>Clauses on properties should support: =, !=, <, >.</li>
<li>Supported data types are: numbers, dates, strings.</li>
<li>Clauses can be combined by "and" and "or" operators.</li>
<li>Precedence must be specified by parentheses and compound expressions, i.e. "(expression)" and "(expression) and (expression)" are valid, but "expression and expression" is not.</li>
</ol>
<p>Anyone who follows me probably knows I'm a fan of <a href="https://higherlogics.blogspot.com/2008/09/mostly-tagless-interpreters-in-c.html">tagless interpreters</a>, also known as <a href="http://lambda-the-ultimate.org/node/4572">object algebras</a>, so I started by defining the query language in this fashion:</p>
<pre>
<code class="language-csharp">public interface IQueryLanguage<T>
{
T Id();
T Const(string val);
T Const(DateTime date);
T Const(double num);
T Property(string propertyName);
T And(T lhs, T rhs);
T Or(T lhs, T rhs);
T LessThan(T lhs, T rhs);
T GreaterThan(T lhs, T rhs);
T Equal(T lhs, T rhs);
T NotEqual(T lhs, T rhs);
}
</code></pre>
<p>You can clearly see the language structure here and how queries can be composed into larger compound expressions. Another good practice is to factor all backend operations into a repository, so this is roughly how it might look:</p>
<pre>
<code class="language-csharp">public interface IRepository<T>
{
...
IQueryLanguage<T> CreateQuery<TEntity>()
where TEntity : class;
Task<IEnumerable<TEntity>> Execute<TEntity>(T query, int limit, int page)
where TEntity : class;
}
</code></pre>
<p>So our repository instances will expose a method to create a query for an entity type, and a method used to execute that query with some extra parameters for paging.</p>
<p>The grammar for the language looks roughly like this in pseudo-EBNF:</p>
<pre>
number := [0-9]+[.]?[0-9]*
quoted := "'" datetime "'" | "'" char* "'"
ident  := letter+
term   := number | quoted | ident
e      := term | "(" e ")" | e ("=" | "!=" | "<" | ">") term | "(" e ")" ("and" | "or") "(" e ")"
</pre>
<p>Now that we have a general interface and a grammar for the query language, we can build a simple and generic recursive descent parser against that interface to generically construct a query:</p>
<pre>
<code class="language-csharp">public static class QueryParser
{
public static T Parse<T>(this IQueryLanguage<T> provider, string input)
{
if (string.IsNullOrEmpty(input))
return provider.Id();
int pos = 0;
var e = provider.Parse(input, ref pos);
if (pos < input.Length)
throw new ArgumentException("Malformed query");
return e;
}
static T Parse<T>(this IQueryLanguage<T> provider, string input, ref int pos)
{
var last = pos;
if (Match(input, "(", ref pos))
{
var e = provider.Parse(input, ref pos);
return !Match(input, ")", ref pos) ? Fail<T>("expected ')'", pos):
pos < input.Length ? provider.Operation(e, input, ref pos):
e;
}
else
{
return provider.Operation(provider.Terminal(input, ref pos), input, ref pos);
}
}
static T Operation<T>(this IQueryLanguage<T> provider, T lhs, string input, ref int pos)
{
Skip(input, ref pos);
var last = pos;
if (Match(input, "=", ref pos))
{
return provider.Equal(lhs, provider.Terminal(input, ref pos));
}
else if (Match(input, ">", ref pos))
{
return provider.GreaterThan(lhs, provider.Terminal(input, ref pos));
}
else if (Match(input, "!=", ref pos))
{
return provider.NotEqual(lhs, provider.Terminal(input, ref pos));
}
else if (Match(input, "<", ref pos))
{
return provider.LessThan(lhs, provider.Terminal(input, ref pos));
}
else if (Match(input, "and ", ref pos))
{
return provider.And(lhs, provider.Parse(input, ref pos));
}
else if (Match(input, "or ", ref pos))
{
return provider.Or(lhs, provider.Parse(input, ref pos));
}
else if (Match(input, ")", ref pos))
{
--pos;
return lhs;
}
else
{
return Fail<T>("Unrecognized token", pos);
}
}
static T Terminal<T>(this IQueryLanguage<T> provider, string input, ref int pos)
{
Skip(input, ref pos);
var last = pos;
if (Match(input, "'", ref pos))
{
var end = input.IndexOf('\'', pos);
if (end < 0)
return Fail<T>("Unrecognized value", pos);
else if (DateTime.TryParse(input.AsSpan(pos, end - pos), out var date))
{
pos = end + 1;
return provider.Const(DateTime.SpecifyKind(date, DateTimeKind.Utc));
}
else
{
var x = input.Substring(pos, end - pos);
pos = end + 1;
return provider.Const(x);
}
}
else if (Match(input, (x, i) => x[i] == '.' || char.IsDigit(x, i), ref pos))
{
return double.TryParse(input.AsSpan(last, pos - last), out var i)
? provider.Const(i)
: Fail<T>("Expected number", last);
}
else if (Match(input, char.IsLetter, ref pos))
{
return provider.Property(input.Substring(last, pos - last));
}
else
{
return Fail<T>("Unrecognized token", pos);
}
}
static T Fail<T>(string error, int pos) =>
throw new ArgumentException($"Parse error @ {pos}: {error}");
static void Skip(string input, ref int pos)
{
while (pos < input.Length && char.IsWhiteSpace(input[pos]))
++pos;
}
static bool Match(string input, ReadOnlySpan<char> expect, ref int pos)
{
if (pos + expect.Length > input.Length ||
!input.AsSpan(pos, expect.Length).Equals(expect, StringComparison.OrdinalIgnoreCase))
return false;
pos += expect.Length;
return true;
}
static bool Match(string input, Func<string, int, bool> expect, ref int pos)
{
var start = pos;
while (pos < input.Length && expect(input, pos))
++pos;
return start < pos;
}
}
</code></pre>
<p>This isn't the most efficient implementation, but it's hopefully understandable since the parser doesn't implement operator precedence and doesn't really do any backtracking. It recursively descends into a bracketed expression, moving a position pointer along the string as it matches characters, and then invokes any query language methods that matched.</p>
<p>The important part is that this parser is agnostic to the query language implementation, so we can reuse it for any query implementation like so:</p>
<pre>
<code class="language-csharp">IRepository<T> repo = ...;
...
var query = repo.CreateQuery<Entity>().Parse("(Foo = 'some string') or (Bar > 99)");
var results = repo.Execute(query, limit: 10, page: 1);
</code></pre>
<h2>The Datastore Implementation</h2>
<p>Since the Datastore filtering API has the limitations described above, we need to transform the query into a form that we can actually execute. Here is an outline of the choices I made, which I will then explain in more detail:</p>
<ol>
<li><strong>No disjunctions:</strong> translate a compound query into <a href="https://en.wikipedia.org/wiki/Disjunctive_normal_form">disjunctive normal form</a>. This consists of a list of queries with no "OR" operators that are executed individually, and whose results are then merged.</li>
<li><strong>No inequalities on more than one property:</strong> keep track of the property that was referenced in an inequality, and build a parallel filter using delegates. We can then execute one inequality filter server-side and post-process the results with the delegates to ensure all inequalities are satisfied.</li>
<li><strong>No != operation:</strong> translate "x != y" into "(x < y) or (x > y)". All of the supported types have strict orderings, so not-equal is equivalent to a compound inequality defined by greater-than or less-than.</li>
</ol>
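<p>Note that with this representation, point 3 requires no special backend support at all: not-equal can be defined purely in terms of the other combinators. A sketch against the <code>Term</code> type and combinators defined below:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
...
// "x != y" becomes "(x < y) or (x > y)": a two-element disjunction in
// the disjunctive-normal-form representation. This is valid because
// all supported types (numbers, dates, strings) are strictly ordered.
public Term NotEqual(Term lhs, Term rhs) =>
Or(LessThan(lhs, rhs), GreaterThan(lhs, rhs));
}
</code></pre>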
<p>So let's start with a basic filter in our query language:</p>
<pre>
<code class="language-csharp">public struct QueryFilter
{
// the datastore filter query
public Google.Cloud.Datastore.V1.Filter Filter { get; set; }
// the property referenced by any inequality test; null if no inequality test
public string InequalityProp { get; set; }
public bool HasInequality => InequalityProp != null;
// post-processing filter that applies any extra inequalities
public Func<Google.Cloud.Datastore.V1.Entity, bool> Post { get; set; }
public QueryFilter(Filter f, string inequalityProp, Func<Google.Cloud.Datastore.V1.Entity, bool> post)
{
Filter = f;
InequalityProp = inequalityProp;
Post = post;
}
}
</code></pre>
<p>This groups the Datastore filter, the post-processing filter, and the property used for inequality tests (if any). So a simple property test like "Foo = 'hello world!'" would generate:</p>
<pre>
<code class="language-csharp">
var foo = new QueryFilter(Filter.Equal("Foo", "hello world!"), null,
e => CompareTo(e["Foo"], "hello world!") == 0);
</code>
</pre>
<p>The other basic clauses look similar, just expecting different results from <code>CompareTo</code>. The post-processing filter need not be used for basic queries, but compound queries with multiple inequalities need it, since only one inequality filter can be applied server-side.</p>
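<p>For instance, a clause like "Bar > 99" might generate something along these lines (a sketch following the same pattern as the equality example above, not verbatim from the project):</p>
<pre>
<code class="language-csharp">// "Bar > 99": a server-side inequality filter, with "Bar" recorded as
// the property occupying the single inequality slot, plus a delegate
// for when this inequality must instead be checked client-side
var bar = new QueryFilter(Filter.GreaterThan("Bar", 99.0), "Bar",
e => CompareTo(e["Bar"], 99.0) > 0);
</code></pre>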
<p>The <code>CompareTo</code> method implements the idiomatic <code>IComparable</code> pattern for Datastore values:</p>
<pre>
<code class="language-csharp">
static int CompareTo(Value x, Value y)
{
if (x.ValueTypeCase != y.ValueTypeCase)
throw new InvalidCastException(
$"Values are not of the same type: {x.ValueTypeCase} != {y.ValueTypeCase}.");
switch (x.ValueTypeCase)
{
case Value.ValueTypeOneofCase.DoubleValue:
return x.DoubleValue.CompareTo(y.DoubleValue);
case Value.ValueTypeOneofCase.StringValue:
return x.StringValue.CompareTo(y.StringValue);
case Value.ValueTypeOneofCase.TimestampValue:
return x.TimestampValue.ToDateTime().CompareTo(y.TimestampValue.ToDateTime());
default:
throw new InvalidCastException($"Can't compare type: {x.ValueTypeCase}");
}
}
</code>
</pre>
<p><code>QueryFilter</code> suffices for basic queries that Google Datastore already supports, but it's insufficient for "OR" queries. So we need another level of indirection to keep track of lists of <code>QueryFilter</code> that represent the query in disjunctive-normal form:</p>
<pre>
<code class="language-csharp">public struct Term
{
public Term(IEnumerable<QueryFilter> filter) : this()
{
Filters = filter;
}
public Term(params QueryFilter[] filter) : this()
{
Filters = filter;
}
public Term(Value value) : this()
{
Value = value;
}
public Term(string prop) : this()
{
Property = prop;
}
public IEnumerable<QueryFilter> Filters { get; private set; }
public Value Value { get; private set; }
public string Property { get; private set; }
}
</code></pre>
<p>These are the real terms generated by our query language. In reality, this should be a set of classes representing the sum <code>term = property | value | query</code>, but it's simpler to keep everything in a struct to avoid casts. The list of <code>QueryFilter</code> values represents the list of queries that themselves contain no "OR" clauses. Implementations of our query combinators must preserve this property, and we'll see later on that it's pretty straightforward to do this. Let's start with the basic value constructors:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
public Term Id() => new Term();
public Term Const(DateTime date) =>
new Term(date);
public Term Const(double num) =>
new Term(num);
public Term Const(string val) =>
new Term((Google.Cloud.Datastore.V1.Value)val);
public Term Property(string propertyName) =>
new Term(propertyName);
...
}
</code></pre>
<p>Pretty straightforward: values are converted into instances of Google.Cloud.Datastore.V1.Value, and string property names are tracked as properties. Now let's look at the basic filters:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
...
public Term Equal(Term lhs, Term rhs)
{
if (lhs.Filters != null || rhs.Filters != null)
throw new ArgumentException("Operations can only compare properties against values.");
if (rhs.Property != null && lhs.Property == null)
return Equal(rhs, lhs);
if (lhs.Property == null || rhs.Property != null)
throw new ArgumentException("Operations can only compare properties against values, not two properties.");
var filter = Filter.Equal(lhs.Property, rhs.Value);
return new Term(new QueryFilter(filter, null, e => CompareTo(e[lhs.Property], rhs.Value) == 0));
}
}
</code></pre>
<p>Here we ensure that the terms being compared for equality are a property and a value, and we also ensure that the property name is on the left-hand side so it's easier to operate on. Equality is symmetric, so we can just reverse the order of the parameters and the meaning is preserved. This is not quite the case for inequality checks, but the two inequality operators can be defined in terms of each other:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
...
public Term LessThan(Term lhs, Term rhs)
{
// insert validations as above
return
rhs.Property != null ? GreaterThan(rhs, lhs):
lhs.Property != null ? new Term(new QueryFilter(Filter.LessThan(lhs.Property, rhs.Value), lhs.Property,
e => CompareTo(e[lhs.Property], rhs.Value) < 0):
throw new ArgumentException("A property is required for comparisons.");
}
public Term GreaterThan(Term lhs, Term rhs)
{
// insert validations as above
return
rhs.Property != null ? LessThan(rhs, lhs):
lhs.Property != null ? new Term(new QueryFilter(Filter.GreaterThan(lhs.Property, rhs.Value), lhs.Property,
e => CompareTo(e[lhs.Property], rhs.Value) > 0):
throw new ArgumentException("A property is required for comparisons.");
}
}
</code></pre>
<p>I've omitted the validations for brevity, but the basic idea is the same. We ensure the term property is on the left-hand side by transforming greater-than and less-than into each other if they're backwards. As before, we also create a Datastore filter and a post-processing filter for the same operations. And now we get to the composite query combinators. The "OR" combinator is very simple:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
...
public Term Or(Term lhs, Term rhs) =>
new Term(lhs.Filters.Concat(rhs.Filters));
}
</code></pre>
<p>OR-ing two terms is simply merging their filter lists: given two lists of queries, each element of which contains no "OR" clauses, their disjunction is just the concatenation of the two lists. This preserves the invariant that each individual sub-query contains no "OR" clauses. This leaves only the AND combinator:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
...
public Term And(Term lhs, Term rhs)
{
var permutations =
from x in lhs.Filters
from y in rhs.Filters
select !x.HasInequality || !y.HasInequality || x.InequalityProp.Equals(y.InequalityProp)
? new QueryFilter(Filter.And(x.Filter, y.Filter), x.InequalityProp ?? y.InequalityProp, e => x.Post(e) && y.Post(e)):
x.HasInequality ? new QueryFilter(x.Filter, x.InequalityProp, e => x.Post(e) && y.Post(e)):
new QueryFilter(y.Filter, y.InequalityProp, e => x.Post(e) && y.Post(e));
return new Term(permutations);
}
}
</code></pre>
<p>There's a lot going on here, but here's the gist: every filter in the <code>lhs</code> is AND-ed with every filter in the <code>rhs</code>. The various cases you see being checked are to handle inequalities in either the <code>lhs</code> or the <code>rhs</code>. If neither side contains an inequality filter, or if they refer to the same property, then we can combine the queries as a server-side filter. Otherwise, we take one of the filters with an inequality and post-process the other one. Technically one branch is redundant here; I'm just showing it for completeness. I also apply the post-processing filters on every branch as a safety measure, which was useful while prototyping and testing (see below).</p>
<p>The reason we can just permute the individual <code>lhs</code> and <code>rhs</code> expressions this way is that boolean algebra is <a href="https://en.wikipedia.org/wiki/Boolean_algebra_(structure)#Definition">distributive</a>. That means that, given an expression in disjunctive normal form (a series of OR-ed clauses), applying an AND-ed clause to the whole expression is equivalent to applying AND to each individual clause:</p>
<pre>
A and (B or C or D or ...)
= A and B or A and C or A and D or A and ...
</pre>
<p>This is exactly what we're doing above: we're tracking a set of OR-ed clauses and then combining them with another expression via AND, which amounts to distributing the AND-ed expression across the list. But what if <code>A</code> itself is an OR-ed expression? Just do it all over again:</p>
<pre>
A = (X or Y or Z or ...)
A and B or A and C or ...
= (X or Y or Z or ...) and B or (X or Y or Z or ...) and C or ...
= (X and B or Y and B or Z and B or ...) or (X and C or Y and C or Z and C or ...) or ...
</pre>
<p>Every element of A and each element of the original "OR" expression were permuted to produce the final expression, still in disjunctive normal form. This is precisely what the LINQ expression above is doing, with some extra logic to handle inequalities.</p>
<p>We can also see here the necessity of carrying around the post-processing filters through all the terms. When an inequality is applied to two different terms in the same filter, only one of them can be applied server-side, and the remainder has to be applied as a delegate filter when we get the results back.</p>
<p>Let's see how executing a query happens in a repository implementation:</p>
<pre>
<code class="language-csharp">public class DatastoreRepository : IRepository<Term>
{
DatastoreDb db;
...
public async Task<IEnumerable<Entity>> Execute<TEntity>(Term query, int limit, int page)
where TEntity : class
{
if (page <= 0)
throw new ArgumentOutOfRangeException(nameof(page),
"Page number must be greater than or equal to 1.");
if (limit <= 0)
throw new ArgumentOutOfRangeException(nameof(limit),
"Number of results per page must be greater than or equal to 1.");
// execute each OR-free filter server-side, then apply the post-filter to ensure all
// inequalities are satisfied, then merge all results into a sorted set to remove duplicates
// and ensure results are always in the same order for paging purposes
var results = new SortedSet<Entity>(ecompare);
foreach (var f in query.Filters ?? Enumerable.Repeat(new QueryFilter(), 1))
{
// generate the entity kind from type
var q = new Query(Kind<TEntity>())
{
Filter = f.Filter,
};
var r = await db.RunQueryAsync(q);
var filtered = f.Post != null
? r.Entities.Where(f.Post)
: r.Entities;
foreach(var x in filtered)
results.Add(x);
}
// rough paging implementation that's not particularly efficient, but suffices for now
return results.Skip((page - 1) * limit).Take(limit);
}
static readonly EntityComparer ecompare = new EntityComparer();
sealed class EntityComparer: IComparer<Entity>
{
public int Compare(Entity x, Entity y) =>
Id(x.Key).CompareTo(Id(y.Key));
}
}
</code></pre>
<p>The gist is simply that each <code>QueryFilter</code> is executed individually, then the post-processing filter is applied to ensure that all remaining inequalities are satisfied, and the results from all individual OR-ed queries are merged into a single sorted set. This set is then paged through using the given paging parameters and the limited results returned.</p>
<p>This is probably among the simplest possible implementations for the approach I took, and so is not particularly efficient, but it suffices for prototyping purposes or programs whose filters would return moderately sized result sets.</p>
<h2>Possible Extensions</h2>
<p>If you've ever wondered what a query planner in a relational database does, you can start with the above naive implementation and consider:</p>
<ol>
<li>what choices you can make about which inequality to preserve in order to make the server-side query return the smallest data set, and what sort of metadata you might need to track in order to make such choices</li>
<li>stricter clauses can eliminate other clauses, i.e. "(Foo < 'hello') and (Foo = 'hell')" => "Foo = 'hell'"</li>
<li>a more complex equality query might sometimes be preferable to a simpler inequality query, so sometimes the equality query should be executed server-side instead of the inequality</li>
<li>whether you can exploit the paging parameters to limit the results of each OR query and interleave them somehow</li>
<li>...and so on...</li>
</ol>
<p>There's also a way to make the parsing and query construction less error-prone. I currently use dynamic checks where I could make a new type distinction that would eliminate these errors statically. Consider the following interface for the query language and see if you can modify the query parser to accommodate this new distinction and what sorts of errors it statically prevents:</p>
<pre>
<code class="language-csharp">public interface IQueryLanguage<TClause, TQuery>
{
TClause Id();
TClause Const(string val);
TClause Const(DateTime date);
TClause Const(double num);
TClause Property(string propertyName);
TClause LessThan(TClause lhs, TClause rhs);
TClause GreaterThan(TClause lhs, TClause rhs);
TClause Equal(TClause lhs, TClause rhs);
TClause NotEqual(TClause lhs, TClause rhs);
TQuery Lift(TClause clause);
TQuery And(TQuery lhs, TQuery rhs);
TQuery Or(TQuery lhs, TQuery rhs);
}
</code></pre>
<p>Finally, my actual prototype also passes in a delegate that compares the property name passed in against a type-specific whitelist of allowed property names, so this too would be a simple extension to what I've presented.</p>
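<p>That whitelist check is easy to retrofit onto the <code>Property</code> constructor. A hypothetical sketch; the <code>isAllowed</code> delegate and this constructor are assumptions, not the actual prototype's API:</p>
<pre>
<code class="language-csharp">public struct DatastoreQueries : IQueryLanguage<Term>
{
// caller-supplied predicate deciding which property names are legal
readonly Func<string, bool> isAllowed;
public DatastoreQueries(Func<string, bool> isAllowed) =>
this.isAllowed = isAllowed;
public Term Property(string propertyName) =>
isAllowed(propertyName)
? new Term(propertyName)
: throw new ArgumentException($"Unknown property: {propertyName}");
...
}
</code></pre>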
<h2>Testing</h2>
<p>The abstract interfaces used here make testing pretty straightforward. Here's a simple implementation of a pretty-printer for parsed expressions:</p>
<pre>
<code class="language-csharp">public delegate StringBuilder Eval(StringBuilder buf);
public class PrettyPrinter : IQueryLanguage<Eval>
{
public Eval Id() => buf => buf;
public Eval Const(DateTime date) =>
buf => buf.Append('\'').AppendFormat("{0:yyyy-MM-dd}", date).Append('\'');
public Eval Const(double num) =>
buf => buf.Append(num);
public Eval Const(string val) =>
buf => buf.Append('\'').Append(val).Append('\'');
public Eval Property(string propertyName) =>
buf => buf.Append(propertyName);
public Eval And(Eval lhs, Eval rhs) => Op(lhs, "and", rhs);
public Eval Or(Eval lhs, Eval rhs) => Op(lhs, "or", rhs);
public Eval LessThan(Eval lhs, Eval rhs) => Op(lhs, "<", rhs);
public Eval GreaterThan(Eval lhs, Eval rhs) => Op(lhs, ">", rhs);
public Eval Equal(Eval lhs, Eval rhs) => Op(lhs, "=", rhs);
public Eval NotEqual(Eval lhs, Eval rhs) => Op(lhs, "!=", rhs);
static Eval Op(Eval lhs, string op, Eval rhs) =>
buf => rhs(lhs(buf.Append('(')).Append(' ').Append(op).Append(' ')).Append(')');
}
</code></pre>
<p>This generates a nicely formatted output from a parsed expression which should be semantically equal to the input. This enables writing simple tests of the parsing algorithm like this:</p>
<pre>
<code class="language-csharp">using Xunit;
...
public class QueryParseTests
{
static PrettyPrinter provider = new PrettyPrinter();
static string Parse(string input) =>
provider.Parse(input)(new StringBuilder()).ToString();
[Fact]
static void SimpleQueryParse()
{
Assert.Equal("(foo = 3)", Parse("foo = 3"));
Assert.Equal("(foo != 3)", Parse("foo != 3"));
Assert.Equal("(foo < 3)", Parse("foo < 3"));
Assert.Equal("(foo > 3)", Parse("foo > 3"));
Assert.Equal("(foo > '2019-08-08')", Parse("foo > '2019-08-08'"));
Assert.Equal("('2019-08-08' = foo)", Parse("'2019-08-08' = foo"));
Assert.Equal("('2019-08-08' = foo)", Parse("(('2019-08-08' = foo))"));
Assert.Equal("", Parse(""));
}
[Fact]
static void CompoundQueryParse()
{
Assert.Equal("((foo = 3) and (20 != bar))",
Parse("((foo = 3) and (20 != bar))"));
Assert.Equal("((foo != 3) or (20 < bar))",
Parse("((foo != 3) or (20 < bar))"));
Assert.Equal("((foo != '2018-08-08') or ('2018-08-08' < bar))",
Parse("((foo != '2018-08-08') or ('2018-08-08' < bar))"));
}
[Fact]
static void CompoundFailParse()
{
Assert.Throws<ArgumentException>(() => Parse("((foo = 3) and (20 != bar)"));
Assert.Throws<ArgumentException>(() => Parse("((foo = 3) not 20 != bar))"));
}
[Fact]
static void SimpleFailParse()
{
Assert.Throws<ArgumentException>(() => Parse("foo = 3)"));
Assert.Throws<ArgumentException>(() => Parse("(foo = 3"));
Assert.Throws<ArgumentException>(() => Parse("(foo bar 3)"));
}
...
}
</code></pre>
<p>I can even test the Datastore query implementation in a similar way, since the implementation above carries all of the post-processing clauses that must be satisfied. For example:</p>
<pre>
<code class="language-csharp">public class QueryTests
{
    static DatastoreQueries provider = new DatastoreQueries();

    static Term Parse(string input) =>
        provider.Parse(input);

    [Fact]
    static void TestComplex()
    {
        var e = new[]
        {
            new Entity { ["Date"] = new DateTime(2016, 05, 01, 0, 0, 0, DateTimeKind.Utc), ["Distance"] = 20.0 },
            new Entity { ["Date"] = new DateTime(2016, 05, 01, 0, 0, 0, DateTimeKind.Utc), ["Distance"] = 23.0 },
            new Entity { ["Date"] = new DateTime(2016, 04, 01, 0, 0, 0, DateTimeKind.Utc), ["Distance"] = 23.0 },
            new Entity { ["Date"] = new DateTime(2016, 04, 01, 0, 0, 0, DateTimeKind.Utc), ["Distance"] = 20.0 },
            new Entity { ["Date"] = new DateTime(2016, 05, 01, 0, 0, 0, DateTimeKind.Utc), ["Distance"] = 2.0 },
            new Entity { ["Date"] = new DateTime(2016, 04, 30, 0, 0, 0, DateTimeKind.Utc), ["Distance"] = 4.0 },
        };
        var t0 = Parse("(Date = '2016-05-01') AND ((Distance > 20) OR (Distance < 10))");
        Assert.Equal(2, t0.Filters.Count());
        var matches = t0.Filters.SelectMany(x => e.Where(x.Post)).Distinct().ToList();
        Assert.Equal(2, matches.Count);
        Assert.Contains(matches, x => 23 == x["Distance"].DoubleValue);
        Assert.Contains(matches, x => 2 == x["Distance"].DoubleValue);

        var t1 = Parse("((Date != '2016-04-01') or (Date = '2016-05-01')) AND ((Distance > 20) OR (Distance < 10))");
        Assert.Equal(6, t1.Filters.Count());
        var match1 = t1.Filters.SelectMany(x => e.Where(x.Post)).Distinct().ToList();
        Assert.Equal(3, match1.Count);
        Assert.Contains(match1, x => 23 == x["Distance"].DoubleValue);
        Assert.Contains(match1, x => 2 == x["Distance"].DoubleValue);
        Assert.Contains(match1, x => 4 == x["Distance"].DoubleValue);
    }
    ...
}
</code></pre>
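<p>The filter counts asserted above (2 and 6) fall out of rewriting the query into disjunctive normal form: assuming Datastore executes only conjunctions of comparisons natively, each OR splits the query into a union of filters, and != expands into a less-than/greater-than pair. A rough Python sketch of that expansion (my own illustration with made-up node shapes, not the project's code):</p>

```python
from itertools import product

# Query trees as nested tuples: ('and', l, r), ('or', l, r),
# or a leaf comparison like ('Date', '=', "'2016-05-01'").
def to_dnf(q):
    """Rewrite a query into a list of conjunctions (each a list of leaf
    comparisons), expanding != into a < / > pair along the way."""
    op = q[0]
    if op == 'or':
        return to_dnf(q[1]) + to_dnf(q[2])
    if op == 'and':
        # cross product: every conjunct pairing of the two sides
        return [l + r for l, r in product(to_dnf(q[1]), to_dnf(q[2]))]
    field, cmp, val = q
    if cmp == '!=':
        return [[(field, '<', val)], [(field, '>', val)]]
    return [[q]]

# (Date = D1) AND ((Distance > 20) OR (Distance < 10))
t0 = ('and', ('Date', '=', 'D1'),
             ('or', ('Distance', '>', 20), ('Distance', '<', 10)))
print(len(to_dnf(t0)))  # 2

# ((Date != D2) OR (Date = D1)) AND ((Distance > 20) OR (Distance < 10))
t1 = ('and', ('or', ('Date', '!=', 'D2'), ('Date', '=', 'D1')),
             ('or', ('Distance', '>', 20), ('Distance', '<', 10)))
print(len(to_dnf(t1)))  # (2 + 1) * 2 = 6
```

This matches the tests: the != leaf contributes two disjuncts, so the second query yields (2 + 1) × 2 = 6 native filters, whose union (after applying the post-processing predicates) is deduplicated with <code>Distinct</code>.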
<h2>Final Words</h2>
<p>Apologies for any typos or anything that appears to be missing from the above; these are slightly adjusted code fragments from a larger operational project. Let me know if anything needs clarification!</p>
<p>My intent here is to illustrate the value of applying some ideas from academia to real-world programming challenges that crop up in our daily work. Object algebras/tagless final interpreters are a powerful idea for organizing flexible, reusable abstractions. In particular, they let me design, build and test individual parts of the querying API, which ensured that they would all work when combined into a larger system. Hopefully you too can get some use out of this!</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-26906612657481658542019-01-26T16:45:00.000-05:002019-01-26T16:54:59.825-05:00Sasa v1.0.0-RC1<p>I've finally updated Sasa to target .NET Standard 1.3, removed a lot of superfluous abstractions, and rewrote some of the existing API to target third-party packages that are better supported. This may be the last stable release of Sasa, since I'm shifting my focus elsewhere. Fortunately, the remaining Sasa features are very stable, so there's not much need to evolve them further.</p>
<p>You can find the full <a href="https://higherlogics.com/dev/sasa/index.html">API documentation for v1.0.0 here</a>, and obviously you can find Sasa itself <a href="https://www.nuget.org/packages?q=sasa">via nuget</a>. Here's the current breakdown of features:</p>
<h2 id="sasa-assembly">Sasa Assembly</h2>
<p>The core Sasa assembly contains extensions on standard .NET types, and a few new and useful
abstractions that are typical for most programs:</p>
<ul>
<li>Sasa.FilePath: structured file system path handling, including jailing, which ensures
combined paths don't escape a root path.</li>
<li>Sasa.Func: extensions to create delegates on methods and for operators with no restrictions,
unlike the standard .NET delegate APIs. For instance, open instance delegates to virtual
methods are not permitted, but Sasa.Func handles this case transparently.</li>
<li>Sasa.Enums: efficient, typed enum operations, contrasted against the untyped System.Enum API.</li>
<li>Sasa.Operators: exposes any type's operators as typed delegates.</li>
<li>Sasa.Values: numerous useful extension methods on standard types, like formatting streams of values,
line/word wrapping, sequence generators, string splitting, etc.</li>
<li>Sasa.VFunc/VAction: typical open instance delegates take a reference type as a first parameter,
but value types aren't reference types, so they actually accept a pointer/ref to the struct.
Sasa.VFunc/VAction are the struct-counterparts of System.Func/Action.</li>
<li>Sasa.Emit.Operators: extensions to simplify working with operators during code generation.</li>
<li>Sasa.Either: .NET now includes tuples and value tuples, but no sum/variant type.
Sasa.Either fills that gap.</li>
<li>Sasa.Result: encapsulates either a successfully computed value, or an exception.</li>
</ul>
<h2 id="sasabinary-assembly">Sasa.Binary Assembly</h2>
<p>This package provides convenient abstractions for working with binary data.</p>
<ul>
<li>Sasa.Binary.Bits: low-level bit shifting, folding, counting ops.</li>
<li>Sasa.Binary.Endian: convenient endianness conversions.</li>
<li>Sasa.Binary.UnionXX: types that simplify converting between various types of the same size.</li>
<li>Sasa.Binary.IO.BinaryReader/Writer: portable binary reader/writers.</li>
</ul>
<h2 id="sasacollections-assembly">Sasa.Collections Assembly</h2>
<p>This package provides a number of very efficient immutable collections, and some extensions on
standard .NET collection types.</p>
<ul>
<li>Sasa.Collections.Arrays: extension methods on arrays, to immutably insert, duplicate, append, etc.</li>
<li>Sasa.Collections.Fifo: an immutable first-in-first-out queue data structure.</li>
<li>Sasa.Collections.FingerTree: an immutable finger-tree.</li>
<li>Sasa.Collections.Lifo: an immutable last-in-first-out stack data structure.</li>
<li>Sasa.Collections.Vector: the fastest immutable vector for .NET.</li>
<li>Sasa.Collections.Trie: the fastest immutable dictionary for .NET.</li>
</ul>
<h2 id="sasaconcurrency-assembly">Sasa.Concurrency Assembly</h2>
<p>This package provides a few abstractions for writing concurrent programs.</p>
<ul>
<li>Sasa.Atomics: convenient atomic operations, including atomic and lock-free read/write,
load-linked/store-conditional, etc.</li>
<li>Sasa.LLSC: a convenient abstraction encapsulating a load-linked/store-conditional variable.</li>
<li>Sasa.RWLock: the most lightweight read/write lock available for .NET. Only 4 bytes!</li>
</ul>
<h2 id="sasalinq-assembly">Sasa.Linq Assembly</h2>
<p>This package provides sophisticated extensions on IEnumerable.</p>
<ul>
<li>Sasa.Linq.Enumerables: extensions to compute the difference between two IEnumerable streams.</li>
<li>Sasa.Linq.Change: describes a difference between two enumerable streams at a given position.</li>
<li>Sasa.Linq.ChangeType: the type of difference.</li>
</ul>
<h2 id="sasalinqexpressions-assembly">Sasa.Linq.Expressions Assembly</h2>
<p>This package provides some abstractions for dealing with LINQ expressions.</p>
<ul>
<li>Sasa.Linq.Expressions.Expression<T>: a lightweight typed wrapper around
System.Linq.Expressions.Expression.</li>
<li>Sasa.Linq.Expressions.Parameter<T>: a lightweight wrapper around
System.Linq.Expressions.ParameterExpression.</li>
<li>Sasa.Linq.Expressions.ExpressionVisitor: base class for implementing a basic visitor over
LINQ expressions.</li>
<li>Sasa.Linq.Expressions.IdentityVisitor: an implementation of ExpressionVisitor that by
default simply returns the same expression. Useful if you want to transform only a
subset of expressions.</li>
<li>Sasa.Linq.Expressions.ErrorVisitor: an implementation of ExpressionVisitor that by
default simply throws an error. Useful if you want to ensure you handle all expressions.</li>
</ul>
<h2 id="sasamime-assembly">Sasa.Mime Assembly</h2>
<p>This package makes handling media types and file extension associations convenient.</p>
<h2 id="sasanet-assembly">Sasa.Net Assembly</h2>
<p>This package exposes abstractions for network protocols and mail handling.</p>
<ul>
<li>Sasa.Net.Texty: base interface for text-based protocols.</li>
<li>Sasa.Net.Pop3.*: abstractions implementing the basic POP3 protocol client.</li>
<li>Sasa.Net.Mail.*: extensions on MimeKit's mail abstractions.</li>
</ul>
<h2 id="sasanumerics-assembly">Sasa.Numerics Assembly</h2>
<p>This package provides a few convenient numerical and statistical calculations.</p>
<h2 id="sasaparsing-assembly">Sasa.Parsing Assembly</h2>
<p>This package provides a few abstractions for creating lexers and parsers via embedded
DSLs.</p>
<ul>
<li>Sasa.Parsing.Lexing.*: abstractions for lexing.</li>
<li>Sasa.Parsing.Pratt.*: abstractions for easy, extensible and declarative Pratt parsing.</li>
</ul>
<h2 id="sasaweb-assembly">Sasa.Web Assembly</h2>
<p>This package provides some convenient extensions for internet programs.</p>
<ul>
<li>Sasa.Web.Uris: efficient URI encoding, decoding and URI query parameter extraction.</li>
<li>Sasa.Web.Url64: a base64 encoding that's safe to use in URIs. Much more compact than
typical encodings.</li>
</ul>
<h2 id="sasametal-package">SasaMetal Package</h2>
<p>This package provides a more useful equivalent to sqlmetal for LINQ 2 SQL. It enables
you to map columns and associations to typed properties, including using enums.</p>
<h1 id="deprecations">Deprecations</h1>
<p>The final version of Sasa v1.0.0 will probably remove types or methods which are
currently marked obsolete because there are now better alternatives:</p>
<ul>
<li>Sasa.Net: Sasa now uses MailKit for all of its parsing methods, so you're better
off using that package directly. Some of the extension methods may remain.</li>
<li>Sasa.T, Sasa.Pair, Sasa.Triple, Sasa.Quad: the new System.ValueTuple package has better
integration with languages and the runtime.</li>
</ul>
<p>Previous versions of Sasa had a few more experimental and now unnecessary packages:</p>
<ul>
<li>Sasa.Arrow</li>
<li>Sasa.FP</li>
<li>Sasa.Partial</li>
<li>Sasa.IoC</li>
<li>Sasa.Contracts</li>
<li>Sasa.Dynamics: see my package Dynamics.NET for further work along these lines.</li>
<li>Sasa.Data: LINQ to SQL is not supported on .NET standard, so you will have to use the
old package if you still need this.</li>
</ul>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com2tag:blogger.com,1999:blog-2744072865491516720.post-86020823747367540962018-10-06T10:28:00.000-04:002018-10-06T10:31:06.041-04:00RazorInterfaces: interfaces for the standard HTTP methods<p>In playing around with Razor Pages, I was irritated again that Microsoft couldn't just standardize a set of interfaces for their methods so that they wouldn't need so much magic.</p>
<p>So I just quickly hacked up <a href="https://github.com/naasking/RazorInterfaces">RazorInterfaces</a>, available also on <a href="https://www.nuget.org/packages/RazorInterfaces/1.0.0-RC1">nuget.org</a>. It's probably most useful to people just learning Razor Pages, since it requires you to correctly implement the common HTTP methods you'll need to get things running:</p>
<pre>
<code class="language-csharp">public class CustomerPage : PageModel
    , IPageGet
    , IPagePostAsync<Customer>
{
    public IActionResult OnGet()
    {
        return Page();
    }

    public async Task<IActionResult> OnPostAsync(Customer customer)
    {
        await DataLayer.Update(customer);
        return Page();
    }
}
</code>
</pre>
<p>There are interfaces for all of the standard HTTP methods: GET, POST, PUT, DELETE, HEAD, OPTIONS. The interface names conform to the following format: <code>IPage[HTTP method]</code> and <code>IPage[HTTP method]Async</code> for the async variant, and there are also variants accepting a single type parameter, <code>IPage[HTTP method]<T></code> and <code>IPage[HTTP method]Async<T></code>.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com2tag:blogger.com,1999:blog-2744072865491516720.post-80760460834422623682018-08-12T19:04:00.001-04:002018-08-12T19:04:53.489-04:00Minimal ASP.NET Core Dependencies for Google Cloud<p>After a long hiatus, I'm slowly starting back on blogging and some personal projects. To get me back in the swing of things, here's a short post on running ASP.NET Core on Google Cloud, since this seems poorly documented online.</p>
<p>The default project created by the Google Cloud tools includes <code>Microsoft.AspNetCore.All</code> which is a huge dependency. If you want something more minimal:</p>
<ol>
<li><code>uninstall-package Microsoft.AspNetCore.All</code></li>
<li><code>install-package Microsoft.AspNetCore</code></li>
<li><code>install-package Microsoft.AspNetCore.Mvc.Core</code></li>
<li><code>install-package Microsoft.AspNetCore.Mvc</code></li>
</ol>
<p>This creates a runnable project using the standard Google Cloud Web API project template, although it still isn't ideal as it includes the Razor pages dependencies.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com3tag:blogger.com,1999:blog-2744072865491516720.post-71546777960375744212016-12-02T12:32:00.000-05:002016-12-02T12:32:02.793-05:00Investigative Capabilities in a Digital World - My Responses<p>A <a href="https://www.reddit.com/r/canada/comments/5fzyfl/government_of_canada_is_soliciting_feedback_on/">thread on reddit</a> pointed to <a href="https://www.publicsafety.gc.ca/cnt/cnslttns/ntnl-scrt/thm09-en.aspx">Canada's consultation</a> on what sort of legislative framework they should create to facilitate investigations in the digital world.</p>
<p>Many of the questions are leading and borderline deceptive in their phrasing, but if we're being charitable, this is a good opportunity to inform policy makers. To that end, these are my responses:</p>
<blockquote>How can the Government address challenges to law enforcement and national security investigations posed by the evolving technological landscape in a manner that is consistent with Canadian values, including respect for privacy, provision of security and the protection of economic interests?</blockquote>
<p>Evidence-based policy suggests you first demonstrate the actual need for any proposed legislative changes, and further, that the proposed changes actually solve the problems they are intended to solve. Impartial third-party studies should be required for any proposed measures. As a computer scientist with some expertise in computer security, I can say that legislators simply do not have the requisite understanding to judge the need for or the effectiveness of any proposed measures.</p>
<p>The greatest dangers to Canadians are automated software systems operating on behalf of criminals looking to compromise servers and personal devices. The most effective measures to protect Canadians would thus secure their devices against any such intrusions. We enjoy physical security due to law enforcement patrols to prevent crime, the justice system to punish, reform and deter criminals, and a good public education system that empowers people to provide for themselves such that they do not need to resort to crime.</p>
<p>No truly effective law enforcement or justice system is possible for online activities, given criminal activity affecting Canadians can originate from anywhere on Earth. The Internet is truly the wild west, and like the wild west, safety for all Canadians can only be achieved by empowering and educating individuals in how to secure their own devices.</p>
<p>On a legislative level, a modicum of regulation on the software industry to use only secure software systems when dealing with personal information, and to require companies to develop software using safer, more secure techniques, such as using safer programming languages, would significantly reduce all Canadians' vulnerabilities to online criminals.</p>
<blockquote>In the physical world, if the police obtain a search warrant from a judge to enter your home to conduct an investigation, they are authorized to access your home. Should investigative agencies operate any differently in the digital world?</blockquote>
<p>Why should they? In each case, our charter guarantees the security of the person against unreasonable search and seizure. Law enforcement is *supposed* to be inconvenienced if they want to violate someone's rights. That's the whole point of the charter, to safeguard against abuse of power.</p>
<blockquote>Currently, investigative agencies have tools in the digital world similar to those in the physical world. As this document shows, there is concern that these tools may not be as effective in the digital world as in the physical world. Should the Government update these tools to better support digital/online investigations?</blockquote>
<p>Of course, authorized use of such tools ought to be as effective as possible, so long as "these tools" do not necessitate compromising citizens' everyday software in order to be effective. For instance, these tools cannot require backdoors into people's software in order to be effective. The tools used by law enforcement *will* leak into the criminal world, and you would place every citizens' bank accounts and private life in jeopardy if you require backdoors.</p>
<blockquote>Is your expectation of privacy different in the digital world than in the physical world?</blockquote>
<p>Of course. Physical security and digital security are necessarily different because physical objects and digital objects have completely different properties. For example, it's trivial to digitally copy an item, it's not easy to copy a physical item. It's thus impossible to enforce the same sorts of property rights for digital items as we enjoy for physical items. You can securely share a private physical photo without too much concern that it will be disseminated globally, but as many people have learned, digital photos are easily leaked and spread online.</p>
<blockquote>Since the Spencer decision, police and national security agencies have had difficulty obtaining BSI in a timely and efficient manner. This has limited their ability to carry out their mandates, including law enforcement's investigation of crimes. If the Government developed legislation to respond to this problem, under what circumstances should BSI (such as name, address, telephone number and email address) be available to these agencies? For example, some circumstances may include, but are not limited to: emergency circumstances, to help find a missing person, if there is suspicion of a crime, to further an investigative lead, etc…</blockquote>
<p>In which *specific* cases has the absence hampered a time-sensitive investigation? I hear this "fact" repeated constantly, but no specifics are ever provided. If no compelling case exists, or alternate investigative methods can and have yielded the same results, then rapid subscriber access is an unnecessary infringement on our rights and freedoms, and frankly, a waste of legislators' time.</p>
<p>If such specifics are too sensitive to disclose, then they can be analyzed by an impartial, third-party investigation consisting of computer experts to determine their validity. I know of no such cases, nor any such investigation. If such an investigation justifies rapid subscriber access, then rather than infringing our rights and eliminating all oversight, establish a court that's dedicated solely to rapid processing of such applications, and cycle the judges sitting on these courts every year to prevent overly favourable relationships with those requesting warrants. No blank cheques, and any such requests must still go through a court.</p>
<p>Finally, subscriber information does not identify a responsible party, so relying on this will generate many false leads and waste investigative resources in time-sensitive investigations. For instance, today's deplorable software security makes it fairly easy to hijack someone else's wifi signal, so the false belief that an IP address identifies whoever participated in an illegal activity will lead investigators to waste time pursuing the subscriber who was assigned that IP address. Law enforcement needs to be trained to understand these realities.</p>
<blockquote>Do you consider your basic identifying information identified through BSI (such as name, home address, phone number and email address) to be as private as the contents of your emails? your personal diary? your financial records? your medical records? Why or why not?</blockquote>
<p>Of course it's not as private. My kitchen is also less private than my bedroom, but law enforcement still requires a warrant to access it.</p>
<blockquote>Do you see a difference between the police having access to your name, home address and phone number, and the police having access to your Internet address, such as your IP address or email address?</blockquote>
<p>Internet address, home address, phone number and email address are all logical addresses via which I can be personally reached in some way, and behind which I hold significant personal and private information. My name is different from all of the above: it is not unique, and it does not provide a way to reach me personally.</p>
<blockquote>The Government has made previous attempts to enact interception capability legislation. This legislation would have required domestic communications service providers to create and maintain networks that would be technically capable of intercepting communications if a court order authorized the interception. These legislative proposals were controversial with Canadians. Some were concerned about privacy intrusions. As well, the Canadian communications industry was concerned about how such laws might affect it.</blockquote>
<p>As a computer scientist, I can tell you right now that reliable and complete intercept capability is impossible. If you legislate any such requirements, then those wishing to hide their activities have any number of mechanisms available to *undetectably* circumvent it. And not just via encryption, see steganography, for instance.</p>
<p>Given this impossibility, and the fact that IP addresses cannot identify the responsible party, but merely a means by which a criminal activity took place, interception capabilities are likely an unnecessary waste of resources that would yield too many false positives, and resources are probably better spent on more reliable and effective investigative techniques.</p>
<p>Furthermore, the existence of such networks makes them ripe targets for foreign governments and criminals. The potential for abuse, and the vulnerabilities created, are simply too great, even if well-intentioned.</p>
<blockquote>Should Canada's laws help to ensure that consistent interception capabilities are available through domestic communications service provider networks when a court order authorizing interception is granted by the courts?</blockquote>
<p>Yes, but I still question the effectiveness of such measures, and the cost-benefit ratio of creating this infrastructure.</p>
<blockquote>If the Government were to consider options to address the challenges encryption poses in law enforcement and national security investigations, in what circumstances, if any, should investigators have the ability to compel individuals or companies to assist with decryption?</blockquote>
<p>Never. The most significant danger to Canadians is online criminal activity. Software that compromises people's devices to leak personal information, for identity theft among other criminal purposes, is already being traded online. Any attempt to weaken or circumvent encryption puts every Canadian in danger for very little gain, considering messages exchanged between criminals can be hidden via other undetectable means (see discussion of Intercept Capability above).</p>
<p>Canadians as a whole are better served by *requiring* devices to implement proper encryption that cannot be circumvented. Any security researcher will tell you exactly the same thing: securing an encrypted device with any backdoor is impossible. You cannot compromise every Canadian's entire life just to assist a small number of investigations.</p>
<blockquote>How can law enforcement and national security agencies reduce the effectiveness of encryption for individuals and organizations involved in crime or threats to the security of Canada, yet not limit the beneficial uses of encryption by those not involved in illegal activities?</blockquote>
<p>They literally cannot. It's impossible. Any computer scientist who has studied security will tell you the same. Even the best design I have seen to date, which holds decryption keys in hardware not accessible to software running on the device, and which looks pretty safe at first glance, is rife with vulnerabilities when you consider that all of our hardware is built overseas under the purview of foreign governments and in manufacturing plants with completely unknown on-premise security measures. Hardware-level security breaches have already occurred even though there's plenty of other lower-hanging fruit.</p>
<blockquote>Should the law require Canadian service providers to keep telecommunications data for a certain period to ensure that it is available if law enforcement and national security agencies need it for their investigations and a court authorizes access?</blockquote>
<p>Retaining information is fine, so long as that data is required to be appropriately secured from access without a warrant. In particular, telecommunication providers should not be permitted to sell this information, and should be required to conduct third-party audits of the security practices used to archive this data.</p>
<p>Even so, the utility of data retention will probably decrease in the coming years as more services move to the internet, and more information exchanged online becomes encrypted. Already, people can switch most of their telephony services to online voice chat services, thus reducing the effectiveness of telephony data retention.</p>
<blockquote>If the Government of Canada were to enact a general data retention requirement, what type of data should be included or excluded? How long should this information be kept?</blockquote>
<p>I can't imagine a use for data more than 5 years old. Retaining all Internet traffic is impossible, but this is what would be needed for truly effective analysis.</p>
<p>However, full retention is an unacceptable security risk. Even with a public key infrastructure, whereby all such data were stored in encrypted form and the decryption keys were held by the courts to be used only when authorized, securing those keys against any possible disclosure is probably prohibitively difficult.</p>
<p>To balance the security risks against realizable data capture abilities, retaining only metadata for 5 years and encrypting it all using the above public key infrastructure would be acceptable.</p>
<p>Editorial Note: I have little expectation that such public key protections would be employed though, and without them, I'm not in favour of any data retention at all.</p>
Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-4432051896560238332016-11-27T15:55:00.000-05:002016-12-05T12:09:28.455-05:00Semantic Tab Panels via HTML Tables<p>There are plenty of tutorials online for creating tabbed panels in HTML documents, some using JavaScript, some using pure CSS tricks. Most of the approaches seem to assume that a list of some type is the appropriate element to use for a tabbed interface, but I'd argue that lists are not semantically descriptive of tabbed interfaces. They either include the main tab content along with the tab name, like so:</p>
<pre><code class="language-markup"><ul class="tabs">
<li class="tab-active"><a href="">Active</a> Active tab content goes here</li>
<li><a href="">Second</a> Inactive second tab content goes here</li>
<li><a href="">Third</a></li>
</ul>
</code></pre>
<p>Or the tab content has no semantic relationship with the tabs that control their display, like:</p>
<pre><code class="language-markup"><ul class="tabs">
<li class="tab-active"><a href="#first">Active</a></li>
<li><a href="#second">Second</a></li>
<li><a href="#third">Third</a></li>
</ul>
<div class="tab-body">
<div id="first">Active tab content</div>
<div id="second">Second tab content</div>
<div id="third">Third tab content</div>
</div>
</code></pre>
<p>It's simply bizarre to mix the tab name with the main tab content as with the first case. The tab name controls interaction with the tab content, it's not part of the tab's content. They are semantically distinct, and so should be distinct in the markup.</p>
<p>In the second case, name and content are separated, but not in a semantically meaningful way, which makes it invisible to screen readers unless you add extra annotations. The connection between tab name and tab content is completely indirect since it depends upon interpreting the meaning of a link's target as referring to an element in the current document. This is obviously only one approach to separating tab name from content, but every approach I've seen so far makes a similar tradeoff: no technique expresses the relationship between tab name and tab content using markup that properly describes their relationship in a semantically meaningful way.</p>
<p>So what is the semantic relationship between tab name and tab content? Well clearly the tab name is a heading of some kind that summarizes and/or meaningfully describes the content it references.</p>
<p>Secondly, a set of tabs are logically related in some larger context, and so markup that also expresses a grouping is the most semantically meaningful. For instance, expressing tab content as a set of divs doesn't express this association, nor does expressing tab names as standard headings H1..H6.</p>
<p>However, there is an HTML element that provides headings and semantically associates those headings with its contents: tables!</p>
<pre><code class="language-markup"><table class="tabs">
<thead><tr>
<th class="active" id="first">Active</th>
<th id="second">Second</th>
<th id="third">Third</th></tr>
</thead>
<tbody>
<tr>
<td headers="first">Active tab content</td>
<td headers="second">Second tab content</td>
<td headers="third">Third tab content</td>
</tr>
</tbody>
</table>
</code></pre>
<p>The table headings are the tab names, and the table cells contain its contents. Each cell designates the header/tab name to which it belongs, which is meaningful semantic markup for tabbed interfaces. Now all that remains is providing the interactive nature of tabbed interfaces. I think most of the standard techniques will work just as well here, but I used a simple javascript function and some basic CSS you can see in this <a href="https://jsfiddle.net/tf1tjkhm/">jsfiddle</a>. Basically, the CSS sets the backgrounds, borders and mouseover behaviour needed to mimic the familiar tab layout:</p>
<pre><code class="language-css">body {background-color:white;}
table.tabs{border:none; border-collapse:collapse}
table.tabs>thead>tr>th{text-align: left;padding:15px;position:relative;bottom:-1px;background-color:white}
table.tabs>thead>tr>th:hover{background-color:lightgrey; cursor:pointer}
table.tabs>thead>tr>th.active:hover{background-color:white}
table.tabs>thead{border-bottom:1px solid}
table.tabs>thead>tr>th.active{border:1px solid;border-bottom:none}
</code></pre>
<p>Then a simple javascript function run at document load progressively enhances the table to act like a tabbed interface:</p>
<pre><code class="language-javascript"><script type="text/javascript">
function regTabs() {
[].forEach.call(document.querySelectorAll('table.tabs td[headers]'), function (x) {
var t = x.parentElement.parentElement; // table or tbody tracks active tab
var th = document.getElementById(x.attributes.headers.value);
if (th != null) {
x.th = th;
th.style.float = 'left';
th.addEventListener('click', function() {
t.activeTab.style.display = 'none';
t.activeTab.th.classList.remove('active');
x.style.display = '';
th.classList.add('active');
t.activeTab = x;
});
}
if (th == null || !th.classList.contains('active'))
x.style.display = 'none';
else
t.activeTab = x;
});
}
regTabs();
</script>
</code></pre>
<p>Basically, the headers are all floated left and the active tab content is tracked as a property of the table. The headings capture the click event and modify the display styling property of the tab content and the active tab name/heading.</p>
<p>I'm no CSS or javascript wizard, so this can all probably be improved, but it suffices to make the point that mimicking tabbed interfaces using tables is a good semantic fit.</p>
<p>Given the use of DomElement.classList, I believe this requires IE10 or up, but it certainly could be ported to older browsers without too much difficulty.</p>
<p>Edit: a redditor pointed out that description lists, dl/dt/dd, are also appropriate semantic markup for tabbed panels. These would indeed work almost as well as tables, provided you don't try to use multiple dt elements for any given dd. I've created a <a href="https://jsfiddle.net/tf1tjkhm/2/">jsfiddle demonstrating semantic tabs using description lists</a>, but it requires a strange, non-semantic arrangement of the markup to work easily.</p>
<p>Edit: <a href="https://jsfiddle.net/tf1tjkhm/6/">here's a better semantic version</a> using the &lt;nav&gt; tag for the tabs, since tabs are semantically a type of navigation within the current document. Each nav link directly references the id of the corresponding tabbed content. All you need to do is decorate the outer block with "tabbed" to get the dynamic tab interface.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-87239634477408826802015-10-18T14:25:00.002-04:002016-11-27T16:24:13.029-05:00Versioning Domain Entities with Ordinary SQL<p>In any system with mutable entities saved in an object store, it's inevitable that you'll want to audit the changes made to a particular entity. Explicitly versioning your entities solves this problem, and it's often pretty useful to end users of a sophisticated application, say for undo purposes. However, it's not always so clear how to transform a mutable entity into a versioned entity.</p>
<p>There are also a number of complications to versioning mutable objects:</p>
<ol>
<li>Handling parent-child relations such that a child change does not propagate to every parent</li>
<li>Supporting efficient queries on history</li>
<li>Supporting efficient queries for the newest version</li>
<li>Space complexity of data representation</li>
<li>Concurrent changes</li>
</ol>
<p>Some of the above properties are also in natural tension, ie. supporting efficient queries often requires storing more data, which worsens the space complexity. I'm going to provide an overview here of a very simple schema for versioned entities with the following properties:</p>
<ol>
<li>Queries on history are just as efficient as queries on current data</li>
<li>Queries on history have the same form as queries on current data, ie. no separate history table that must be handled specially -- in fact, you can use the exact same queries for both history and current data, thus eliminating all query duplication!</li>
<li>Parents and children can evolve independently, ie. data update is "strictly disjoint access parallel" in transactional concurrency parlance: as long as two updates don't update the same object, they will both commit successfully</li>
</ol>
<h1>Schema</h1>
<p>The schema changes required are quite straightforward for any entity with a primary key (which should be all of them!):</p>
<ol>
<li>Add a version number column of type int or long: this is a monotonically increasing version number, where version=0 is used when the object is first created, version=1 is after the first change is committed, etc.</li>
<li>Ensure the PK index orders the version number in DESCENDING order, because the highest version number is the most current version</li>
<li>Change the primary key to be composite over the existing PK and the new version number -- the original PK is now called the "logical id"</li>
<li>Add a revisionDate column storing the date the version was committed</li>
<li>Add an index over the logical id and the revisionDate, again with revisionDate in DESCENDING order</li>
</ol>
<p>That's it! The original entity's PK is now a logical id rather than an actual instance id, so accessing a particular version of an entity now requires both the logical id and the version number. The latest version is simply the entry with the largest version number.</p>
<h1>Queries</h1>
<p>Supposing we have an entity Foo whose PK is called "id", and <code>:columnName</code> is the syntax for a named query parameter. Ordinary queries on entities now become the following queries on versioned entities:</p>
<ul>
<li>Insert a new entity:<pre><code class="language-sql">INSERT INTO Foo(id, ..., version, revisionDate)
VALUES (:id, ..., 0, GETDATE())</code></pre></li>
<li>Get the most current version:<pre><code class="language-sql">SELECT TOP 1 * FROM Foo
WHERE id=:id
ORDER BY version DESC</code></pre></li>
<li>Get a version before a specific date/time:<pre><code class="language-sql">SELECT TOP 1 * FROM Foo
WHERE id=:id AND revisionDate <= :rev
ORDER BY version DESC</code></pre></li>
<li>Update an entity: <pre><code class="language-sql">INSERT INTO Foo(id, ..., version, revisionDate)
VALUES (:id, ..., :cur_version_plus_1, GETDATE())</code></pre></li>
</ul>
<p>Updates on entities are now inserts, so the entities at the data store level are now effectively immutable, and the entity table is an append-only log of all changes to those entity types. Since the indexes are ordered so that the latest version is always first, the impact on accessing the most current version should be minimal. Accessing older versions is similarly easy, consisting of a query on the revisionDate column.</p>
<p>We use the revisionDate and not the version number when querying over history to easily support queries over more than one entity. Since each entity evolves its version independently of all others, the version numbers of two connected entities, say Foo[id=1] and Foo[id=2], are completely unrelated. The former might have version=999, and the latter version=1. The change that caused Foo[id=1] to increment its version from 998 to 999 could have also been the one to change Foo[id=2] from version 0 to 1. So if you look at Foo[id=1] at version=998, the only way to see the correct version of Foo[id=2] is to return the first version whose revisionDate is <= the Foo[id=1].revisionDate for version=998.</p>
<p>Fortunately, this makes the queries extremely simple, since a complicated query for a whole set of interconnected entities needs only a single date parameter. In other words, you can compose two independent queries using the same revisionDate parameter name via simple concatenation into a multiquery, and it just works.</p>
<p>Furthermore, your UI then only needs to pass around an optional revisionDate parameter to switch between queries on the latest data and queries on history. In fact, switching the query isn't even needed, because you can use the same query for both!</p>
<pre><code class="language-sql">SELECT TOP 1 * FROM Foo
WHERE id=:id AND (:rev IS NULL OR revisionDate <= :rev)
ORDER BY version DESC
</code></pre>
<p>As promised, this query unifies the queries for the most current version, and any historical version of an entity. If the <code>:rev</code> parameter is set to NULL, then the most recent version is returned. The query optimizer can easily eliminate the redundant WHERE clause, so there should be no performance impact. If :rev is present, the redundant NULL check should be eliminated by the query optimizer and the last change before the given date/time would be returned. Alternatively, you can always use the query with the revisionDate and simply pass in DateTime.Now, which will always return the latest result.</p>
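<p>To make this concrete, here's a self-contained sketch of the schema and the unified query using Python's built-in sqlite3 module. The table and column names follow the example above, but the dates are illustrative, SQLite uses LIMIT instead of TOP, and GETDATE() is replaced by literal revision dates:</p>
<pre><code class="language-python">import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Foo (
        id           INTEGER NOT NULL,  -- the logical id
        name         TEXT,
        version      INTEGER NOT NULL,  -- monotonically increasing per id
        revisionDate TEXT NOT NULL,
        PRIMARY KEY (id, version DESC)
    )""")
conn.execute("CREATE INDEX ix_foo_rev ON Foo (id, revisionDate DESC)")

# "Updates" are inserts of a new version.
conn.execute("INSERT INTO Foo VALUES (1, 'first', 0, '2015-01-01')")
conn.execute("INSERT INTO Foo VALUES (1, 'second', 1, '2015-02-01')")

# The unified query: :rev IS NULL returns the latest version, otherwise
# the last version committed on or before :rev.
unified = """
    SELECT name FROM Foo
    WHERE id = :id AND (:rev IS NULL OR revisionDate <= :rev)
    ORDER BY version DESC
    LIMIT 1"""

latest = conn.execute(unified, {"id": 1, "rev": None}).fetchone()
historic = conn.execute(unified, {"id": 1, "rev": "2015-01-15"}).fetchone()
print(latest[0], historic[0])  # second first
</code></pre>
<p>The same query text serves both cases; only the :rev parameter changes.</p>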
<h1>Concurrency</h1>
<p>Being strictly disjoint access parallel, this schema on mutable entities is pretty much as good as it gets concurrency-wise. The entity table is essentially an append-only log of changes, and the new composite primary key ensures that any entity can undergo only one change event at a time. This ensures that any invariants in the application's domain model are preserved once committed to the entity store.</p>
<p>This is also the ideal optimistic concurrency model for any CRUD application.</p>
<h1>Space Complexity</h1>
<p>The space complexity is obviously sub-optimal since every version is stored. However, managing this space is a breeze, since you can easily write a query to delete all entity history before the current version, or keep the N latest versions, or basically any storage policy that makes sense for your business.</p>
<p>One serious concern is "large" columns in an entity, say an unbounded text column, or binary data. In this case, you should perform an additional refactoring of your schema to model git's way of doing things: add a Blob table that holds the data indexed by its hash, then just store the hash in the entity table. This keeps it compact, and ensures you only ever store one version of a binary blob, which you should probably be doing anyway.</p>
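<p>A rough sketch of the content-addressed idea, with an in-memory dictionary standing in for the Blob table (the class and method names here are just illustrative):</p>
<pre><code class="language-python">import hashlib

class BlobStore:
    """Content-addressed storage: blobs keyed by hash, each stored only once."""
    def __init__(self):
        self._blobs = {}   # stands in for the Blob table

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)   # identical content is stored once
        return key   # store this hash in the entity row instead of the blob

    def get(self, key):
        return self._blobs[key]

store = BlobStore()
h1 = store.put(b"some large attachment")
h2 = store.put(b"some large attachment")   # same content yields the same key
assert h1 == h2 and len(store._blobs) == 1
</code></pre>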
<h1>Variations</h1>
<p>There are a number of variations on this schema, like normalizing the revision date into a separate Revisions table to which you can add a userId to blame for the change, etc. But then the Revisions table becomes a central bottleneck to insert performance and updates are no longer strictly disjoint access parallel. Furthermore, all select queries must now join on this Revisions table to find the revision date. I think the schema I've laid out provides the optimal performance tradeoff with little enough space overhead.</p>
<h1>Versioned ORM</h1>
<p>Standardizing this schema as the default operating mode for an ORM would be ideal. This sort of ubiquitous schema management should be automatic, and now that ORMs are moving to code-first designs where they manage the whole schema anyway, this should be a snap, right?</p>
<p>The API interacting with the ORM remains largely unchanged, but all change submissions transparently become inserts, and all select queries automatically have the universal query form described above. A query on history would then simply be a query that's started with a date or version parameter, ie. instead of <code class="language-csharp">context.Foo.Where(...)</code>, it would be something like <code class="language-csharp">context.Foo.Before(DateTime?).Where(...)</code>.</p>
<h1>Summary</h1>
<p>I'm not the first to describe this schema for versioning, but I haven't seen a description quite so comprehensive, detailing the queries and its concurrency properties.</p>
<p>These days most of the buzz is around CQRS and event sourcing. This schema has some connections to event sourcing, since entity tables are now append-only logs with the entire history remaining intact. However, the schema of the entity change events aren't domain events as they are in event sourcing.</p>
<p>CRUD applications don't get much attention anymore, but I agree with Martin Fowler that most simple domains are still better served by a CRUD model than by CQRS, at least given our current tools. Domain entities usually define proper concurrency boundaries (and if they don't, your domain model is probably wrong!), so this schema is a natural fit that still ensures a domain model can naively enforce its invariants in ordinary code, and if those changes succeed then those invariants are preserved in the entity store.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-84589505017134844482015-10-04T12:28:00.003-04:002016-01-29T13:55:17.028-05:00Efficient Curried Application<p>I just wanted to jot down some thoughts on <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.134.9317&rep=rep1&type=pdf">compiling eval/apply curried application</a> using an efficient register-based calling convention. This translation is actually quite simple once you get the trick. Leroy's Zinc abstract machine showed the way: evaluate arguments right-to-left instead of the typical left-to-right. The ZAM wrote these arguments to a stack, but in our case we want them to go to the register file first.</p>
<p>So we evaluate arguments right-to-left and write them into registers left-to-right, ie. argN goes to reg0, argN-1 goes to reg1, ..., arg0 goes to regN. At some threshold, we'll run out of registers. If your program is in direct-style with a stack, this threshold is often defined by the callee-save registers of the native calling convention. If your program is in CPS form, then you can use the entire register file.</p>
<p>Once you run out of registers, you have to spill the remaining arguments to the stack. However, the stack is simply a register pointing to some base address from which all other values are accessed by an offset from that base. But so is the closure record! If you lay out your closure record in memory so it has the same layout as the stack frame would, then you can just treat it like a small stack. This can save many memory access operations moving data between closures and registers.</p>
<p>Suppose our register spill threshold is 2 registers (reg0 and reg1), with a third register left for the remaining arguments (reg2), and consider the following C-style function:</p>
<pre>int foo(int x, void *y, int z);</pre>
<p>Here we have three arguments, one of which will need to be spilled to the stack since we can only allocate two when calling foo. Arguments z and y will go in registers, while argument x will go on the stack. However, the compilation of foo will require a certain layout in memory for optimal efficiency as well (assuming 32-bit pointers):</p>
<pre>
foo_papp2:
loadi reg0, reg2 // reg0 = *reg2
sub1 reg2, 4 // reg2 -= sizeof(int)
foo_papp1:
loadi reg1, reg2 // reg1 = *reg2
sub1 reg2, 4 // reg2 -= sizeof(void*)
foo:
... // foo: reg0=arg2, reg1=arg1, reg2=&struct{arg0}
</pre>
<p>Hopefully the pseudo-assembly is clear. Function foo gets "prologues", one per argument, which are used for partial applications. Each prologue loads an argument from the "stack"/closure record into the registers and decrements the pointer so it points to the next argument. By the time all the registers are used up, you should have exactly the "stack frame" that you need left, and the register-passed arguments are all in their proper place.</p>
<p>For instance, when you partially apply foo to 1 argument, the function pointer value of the closure record actually points to foo_papp1, and when you partially apply two arguments, it points to foo_papp2. This pattern is the same for any number of arguments, and each per-argument prologue has the same code size, which we will need later when building partially applied closures.</p>
<p>Now consider the following curried applications of foo:</p>
<pre class="brush:c">
int(void*,int) foo_1 = foo(0);
int(int) foo_2 = foo(0, somePointer);
</pre>
<p>Hopefully the pseudo C-style closure syntax is clear, ie. int(char) is a closure taking a char and returning an int. The above are two different partial applications, foo_1 which applies only the first argument, and foo_2 which partially applies the first and second arguments. These are what the closures will look like in memory:</p>
<div style="float:left;"><span style="vertical-align:top">foo_1 →</span>
<table style="display:inline-block;" border="1">
<tr><td>function = &foo_papp1</td></tr>
<tr><td>arity = 2</td></tr>
<tr><td>applied = 1</td></tr>
<tr><td>x = 0</td></tr>
</table>
</div>
<div style="float: right;"><span style="vertical-align:top">foo_2 →</span>
<table style="display:inline-block;" border="1">
<tr><td>function = &foo_papp2</td></tr>
<tr><td>arity = 1</td></tr>
<tr><td>applied = 2</td></tr>
<tr><td>y = somePointer</td></tr>
<tr><td>x = 0</td></tr>
</table>
</div>
<p style="clear:both;">Now when you try to apply values foo_1 and foo_2 to the remaining arguments, this actually performs a dynamic check to see how many arguments are required. For instance, fully applying a closure with the signature int(int, int) elaborates to the following pseudo-code:</p>
<pre class="brush:c">
obj_t returnValue;
RETRY:
switch (closure→arity) {
case 0:
closure = closure→function();
goto RETRY;
case 2:
returnValue = closure→function(arg0, arg1);
// reg0=arg1, reg1=arg0, reg2=&closure->arity + 1
break;
case 1:
closure = closure→function(arg0);
returnValue = closure→function(arg1);
break;
}
</pre>
<p>You have to dynamically check if you're applying all remaining arguments. I handle the case of arity=0 to show that intermediate closures do exist, but you don't typically need that specific case.</p>
<p>When you're performing a <em>partial application</em>, you also have to check the arity and build intermediate closures referencing the proper function address, which changes depending on the arity. Given a closure, performing a <em>partial application</em> means adjusting the new closure to point to the proper function entry point. So given a 3-argument closure to which you're applying n < 3 arguments:</p>
<pre class="brush:c">
closure nclo;
obj_t returnValue;
if (closure→arity == closure→applied + n) {
returnValue = closure->function(arg0, ..., argn);
} else {
nclo = malloc(3 * sizeof(void*) + (closure→applied + n) * sizeof(void*)); // header + all applied args
nclo→arity = closure→arity - n;
nclo→applied = closure→applied + n;
nclo→function = closure→function - SIZE_OF_FUNCTION_PROLOGUE; // compute offset of new entry point
nclo→arg0 = arg0;
...
nclo→argn = argn;
memcpy(&nclo->argn + sizeof(void*), &closure→applied + sizeof(void*), closure→applied * sizeof(void*));
}
</pre>
<p>The most general case to handle is when you're in fully polymorphic code for which you cannot know if applying the closure you have is a partial application. This is a combination of the last two elaborations, but still requires only a simple switch statement to handle all cases.</p>
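<p>Stripped of the register-level detail, the dynamic arity dispatch can be modelled in a few lines of Python; this is just an illustrative sketch (the names are mine, not part of any real runtime), covering full application, partial application, and over-application like the switch statements above:</p>
<pre><code class="language-python">from dataclasses import dataclass

@dataclass
class Closure:
    """Models the closure record: code pointer, remaining arity, applied args."""
    function: object   # the underlying code
    arity: int         # arguments still required
    applied: tuple = ()

def apply(clo, *args):
    """A generic call site: dynamically check the arity before dispatching."""
    n = len(args)
    if n == clo.arity:                  # exact: full application
        return clo.function(*clo.applied, *args)
    if n < clo.arity:                   # partial: build a new closure record
        return Closure(clo.function, clo.arity - n, clo.applied + args)
    # over-application: saturate, then apply the remaining args to the result
    result = clo.function(*clo.applied, *args[:clo.arity])
    return apply(result, *args[clo.arity:])

foo = Closure(lambda x, y, z: (x, y, z), 3)
foo_2 = apply(foo, 0, "ptr")        # partial application: arity 1 remains
print(apply(foo_2, 42))             # (0, 'ptr', 42)
</code></pre>
<p>The compiled scheme in this post specializes all of this into prologues and register moves, but the control flow is the same.</p>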
<p>Now the most boring case: when a function is not a closure and it's being fully applied with all arguments at a call site. In this case, you fill the registers as usual, then push the remaining arguments onto the real program stack, and pass a pointer to that activation frame as the "stack" argument. Simple!</p>
<p>This is only a high-level overview for myself, so no doubt I've skipped some important details, or gotten them wrong while jotting these notes down. Please let me know if you spot something!</p>
<h2>Summary</h2>
<p>Hopefully that didn't ramble so much that the core idea was obscured. The core idea is that all closures are functions taking a fixed number of arguments defined by the number of caller-save registers. These arguments are passed in registers, and a last register is reserved for a "stack frame" holding the remaining arguments. The closure record can act as this stack frame, so you don't have to copy values between records and the stack on closure application.</p>
<p>Furthermore, each function has a "prologue" generated for each argument laid out prior to the actual function being executed in memory. These prologues are used when applying partially applied functions, and they load any previously applied arguments from the closure record into registers, assuming all the reserved registers aren't already used.</p>
<p>This is basically an implementation of the eval/apply strategy, with a few tweaks to reduce the amount of redundant code generated and reuse as much memory as possible. I don't think there's anything particularly novel, but I haven't seen all the details laid out like this (although I may of course have omitted some details I consider "obvious", even if they aren't to others). If you have any further suggestions, I'd love to hear them!</p>
<h2>Limitations</h2>
<p>I don't discuss any of the complexities surrounding unboxed types of non-pointer size, which considerably complicates calling conventions. Most functional languages require a universal representation for this reason, usually of pointer size, which is what I assume here. See my series on <a href="/2008/07/garbage-collection-representations.html">Garbage Collection Representations</a> to get a feel for how these values translate to machine code, and how this would impact closure representations.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-12613141508103189102015-08-26T20:11:00.000-04:002016-11-27T16:26:40.399-05:00Algebra.NET: A Simple Algebra eDSL for .NET<p><a href="https://github.com/naasking/Algebra.NET">Algebra.NET</a> is a simple library designed to facilitate easy expression and manipulation of algebraic functions. For instance, here's a simple function:</p>
<pre class="brush:csharp">
Function<Func<double, double>> a = Algebra.Function(x => 2 * x + 1);
</pre>
<p>We can compile such a function to efficient IL:</p>
<pre><code class="language-csharp">Func<double, double> func = a.Compile("times2plus1");
</code></pre>
<p>Or we can apply some algebraic identities to rewrite it:</p>
<pre><code class="language-csharp">Identity associative = Algebra.Identity(x => x + 1 == 1 + x);
Identity mulEqAdd = Algebra.Identity(x => 2 * x == x + x);
Console.WriteLine(a);
Console.WriteLine(a.Rewrite(1, associative, mulEqAdd));
// Prints:
// ((2 * x) + 1)
// (1 + (x + x))
</code></pre>
<p>Rewrites can sometimes loop forever (consider "x + y == y + x"), so the Rewrite method takes a number indicating the maximum number of iterations to perform all the rewrites.</p>
<p>All the usual arithmetic operations are available, including an extension method for exponentiation:</p>
<pre><code class="language-csharp">var f = Algebra.Function(x => x.Pow(3));
Console.WriteLine(f);
// Prints:
// (x ^ (3))
</code></pre>
<h2>Design</h2>
<p>As of this writing, Algebra.NET is a functional example of a simple term rewriting system. Term rewriting is usually pretty awkward to express in an object-oriented language, and I banged my head against the keyboard to figure out a nice way to do it, until I hit on just doing unification (of course!).</p>
<p>So I reused the term language and added an equality operator to generate an identity that conceptually maps one term to another. I then perform unification on the left hand side, and generate a set of substitutions to transform the matching term into the right hand side of the identity.</p>
<p>It was ultimately quite simple, consisting of <a href="https://github.com/naasking/Algebra.NET/blob/master/Algebra/Algebra.cs#L422">3 methods on Term</a>:</p>
<pre><code class="language-csharp">Term Rewrite(Identity e, Term[] bindings)
bool TryUnify(Term e, Term[] bindings)
Term Subsitute(Term[] bindings)
</code></pre>
<p>Rewrite tries to recursively unify the Identity's left hand side with the current term using TryUnify. On success, the 'bindings' array will have been populated by TryUnify with the substitutions to perform, so it substitutes the bindings into the identity's right hand side to generate the new term.</p>
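<p>As a rough illustration of this unify-then-substitute scheme, here is a minimal Python model -- not the actual Algebra.NET API, just a sketch of the three operations described above over a tiny term language:</p>
<pre><code class="language-python">class Var:
    def __init__(self, name): self.name = name

class Const:
    def __init__(self, value): self.value = value

class Op:
    def __init__(self, sym, l, r): self.sym, self.l, self.r = sym, l, r

def try_unify(pat, term, bindings):
    """Match pat against term, recording substitutions for pat's variables."""
    if isinstance(pat, Var):
        return bindings.setdefault(pat.name, term) is term
    if isinstance(pat, Const):
        return isinstance(term, Const) and term.value == pat.value
    return (isinstance(term, Op) and term.sym == pat.sym
            and try_unify(pat.l, term.l, bindings)
            and try_unify(pat.r, term.r, bindings))

def substitute(pat, bindings):
    """Instantiate a pattern's variables from the recorded bindings."""
    if isinstance(pat, Var): return bindings[pat.name]
    if isinstance(pat, Const): return pat
    return Op(pat.sym, substitute(pat.l, bindings), substitute(pat.r, bindings))

def rewrite(term, lhs, rhs):
    """Apply the identity lhs == rhs at the root, if it unifies."""
    b = {}
    return substitute(rhs, b) if try_unify(lhs, term, b) else term

def show(t):
    if isinstance(t, Var): return t.name
    if isinstance(t, Const): return str(t.value)
    return f"({show(t.l)} {t.sym} {show(t.r)})"

# identity 2 * x == x + x, applied to the term 2 * (y + 1)
x, y = Var("x"), Var("y")
out = rewrite(Op("*", Const(2), Op("+", y, Const(1))),
              Op("*", Const(2), x), Op("+", x, x))
print(show(out))  # ((y + 1) + (y + 1))
</code></pre>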
<p>There are only 3 term types: <a href="https://github.com/naasking/Algebra.NET/blob/master/Algebra/Algebra.cs#L397">constants</a>, <a href="https://github.com/naasking/Algebra.NET/blob/master/Algebra/Algebra.cs#L283">variables</a> and <a href="https://github.com/naasking/Algebra.NET/blob/master/Algebra/Algebra.cs#L330">binary operations</a>. Negation is handled as a binary operation "0 - x" for simplicity. The unification methods on each of the term types are only a few lines of code each, but are quite powerful!</p>
<p>So if you want to understand expression compilation to CIL, unification, or term rewriting, this is pretty much as simple as it gets.</p>
<p>Algebra.NET doesn't perform any term simplification at this point, only term rewriting. Some rewrites may of course be simplifications, but a term like "0 - 3" will not be simplified to "-3".</p>
<h2>Future Work</h2>
<p>As mentioned, Algebra.NET doesn't perform simplification, so that's a big one. I started developing this to work on symbolic and automatic differentiation for numerical optimization problems. I'm aware of other .NET libraries for this, but I didn't like how clumsy it was to write algebraic expressions, nor did they have nice and extensible facilities for rewriting expressions. So I created this in my spare time and intended to continue fleshing it out as needed.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com2tag:blogger.com,1999:blog-2744072865491516720.post-6524500575971498422015-08-10T19:57:00.002-04:002015-08-10T19:59:50.970-04:00Sasa v0.17.0 Released<p>A few new features in this release, and a few bugfixes and enhancements in MIME parsing. As before, the <a href="http://higherlogics.com/sasa/docs-v0.17.0/">online docs</a> are available on my site, the full release including experimental assemblies on <a href="https://sourceforge.net/projects/sasa/">Sourceforge</a>, and the <a href="https://www.nuget.org/packages?q=Sasa">production assemblies on Nuget</a>. The changelog:</p>
<pre>
* sasametal can now replace columns and FK associations with enums that are
provided separately
* sasametal can now output an interface file, which generates an interface
describing the database entities
* many MIME fixes thanks to a few bug reports on Sasa discussion list
* added System.Data, which provides useful extensions to ADO.NET and
IQueryable, like a simple interface for batched SQL queries
* MIME parser now accepts an optional parsing parameter that controls various
parsing options, like how strictly e-mail address formatting is enforced,
and whether HTML views should be parsed into MailMessage.Body
* added enum as a primitive to Sasa.Dynamics IReducer and IBuilder
* fixed Sasa.Enums<TEnum>.IsSequential check
</pre>
<p>Sasa.Dynamics probably isn't used much, but in my experiments I found it increasingly important to handle enums as a primitive, so type reduction and construction (IReduce and IBuild respectively) now includes it as such.</p>
<p>I created sasametal to ease the pain of maintaining several legacy systems that depend upon Microsoft's Linq2Sql. It first renames some associations to make them more meaningful, since Linq2Sql does a bad job of this in some cases. This made some integrations with ORMs like EntityFramework and NHibernate a little easier, but there still remained some painful points.</p>
<p>One of these pains was handling enums, ie. an entity set whose members are effectively fixed. Sasametal now takes enums into account via a command-line option, /enum:[file path], that accepts a file listing table-to-enum rewrites, one per line. For instance, if an entity Foo had a StatusId column with a FooStatus enum for the status, we could call sasametal like so:</p>
<kbd>
C:\somepath> sasametal ...[other option]... /code:FooDb.cs /enum:enums.txt
</kbd>
<p>and the enums.txt contains merely:</p>
<pre>
dbo.FooStatuses=Namespace.Statuses.FooStatus
</pre>
<p>You should provide the full table name including schema that sqlmetal would generate, and provide the fully qualified name of the enum you want to substitute. Sasametal will then eliminate the FooStatus entity and any FK associations linked to that entity, then it will update the type of any properties for that foreign key to use the enum type. The entities generated by sasametal are thus somewhat more efficient, and I've found it easier to use Linq2Sql as-is for quick projects against legacy systems.</p>
<p>When combined with sasametal's other new feature, generating interfaces describing the entities, it also eases the transition to more sophisticated ORMs that already handle enums.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-6230393079957802992015-06-15T18:11:00.001-04:002015-06-15T18:11:51.954-04:00Sasa v0.16.1 Released<p>Just a bugfix release, mainly for the MIME parsing. Changelog:</p>
<pre>
* added support for 8bit transfer encoding, even if not supported by .NET <4.5
* support content types whose prologue is empty
* added support for arbitrarily nested multipart messages
* added alternative to Enum.HasFlag that invokes Sasa's faster enum code
</pre>
<p>As usual, <a href="http://higherlogics.com/sasa/docs-v0.16.1/">online docs are available</a>, or a CHM file is available <a href="https://sourceforge.net/p/sasa/">on Sourceforge</a>.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-66350906042190699522015-06-06T08:05:00.000-04:002016-11-27T16:33:02.068-05:00C# Enums with [Flags]<p>I've had numerous posts here describing instances where C# has come so close to getting it right, and yet misses the mark by an inch. The original C# enums have a simple semantics:</p>
<pre><code class="language-csharp">enum SomeEnum
{
First, // compiler implicitly assigns 0
Second, // compiler implicitly assigns 1
Third, // compiler implicitly assigns 2
Fourth, // compiler implicitly assigns 3
}
</code></pre>
<p>This worked nicely as a concise expression of a need for a set of <em>distinct</em> of values, but without caring what they are. C# later introduced the [Flags] attribute, which signals to the compiler that a particular enum isn't actually a set of disjoint values, but a set of bit flags. However, the compiler doesn't actually change its behaviour given this change of semantics. For instance, the following enum is completely unchanged, despite the semantically meaningful change to a set of bitwise flags:</p>
<pre><code class="language-csharp">[Flags]
enum SomeEnum
{
First, // compiler implicitly assigns 0
Second, // compiler implicitly assigns 1
Third, // compiler implicitly assigns 2
Fourth, // compiler implicitly assigns 3
}
</code></pre>
<p><code>SomeEnum.First</code> is now not a valid flag, and <code>SomeEnum.Fourth</code> is now equivalent to <code>(SomeEnum.Second | SomeEnum.Third)</code>. The native enum behaviour is useless as a set of enumerated flags.</p>
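<p>For contrast, Python's enum module behaves the way I'm arguing C# should: declaring an enum as a Flag changes what the implicit numbering means, and auto() assigns distinct powers of two by default:</p>
<pre><code class="language-python">from enum import Flag, auto

class SomeEnum(Flag):
    First = auto()    # 1
    Second = auto()   # 2
    Third = auto()    # 4
    Fourth = auto()   # 8

# Each member gets a distinct bit, so combinations stay unambiguous:
assert SomeEnum.Fourth != SomeEnum.Second | SomeEnum.Third
print([e.value for e in SomeEnum])  # [1, 2, 4, 8]
</code></pre>
<p>The same declaration without Flag would number the members 0, 1, 2, 3, exactly like C#'s default enum semantics.</p>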
<p>I suspect most people would argue that if you're interested in bitwise flags, you generally need to explicitly specify what the flag values ought to be. I don't think this is true. In fact, the <em>only</em> cases where this is true are those where the flags are defined in an <em>external</em> system, for instance read/write flags when opening file streams.</p>
<p>Exactly the same argument could be levelled against the native enum behaviour too, but like native enums provide a default <em>correct</em> semantics for when you don't care, so flags should provide a default <em>correct</em> semantics for when you don't care.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com1tag:blogger.com,1999:blog-2744072865491516720.post-8082268156397223312015-04-13T12:57:00.000-04:002015-04-13T12:58:52.683-04:00Sasa v0.16.0 Released<p>Mainly a bugfix release, with only a few new features. As always, the docs are available <a href="http://higherlogics.com/sasa/docs-v0.16.0/">here online</a>, or as a <a href="https://sourceforge.net/projects/sasa/files/sasa/v0.16.0/">compiled CHM file on sourceforge</a>. Here's the changelog:</p>
<pre>
* a few bugfixes to MIME parsing for header and word decoding (Thanks Evan!)
* added combined SubstringSplit function for more space efficient parsing
* explicitly parse e-mail addresses for more flexible address separators
* NonNull<T> now throws an exception if encapsulated value is null, which
is used in the case when the instance is invalidly constructed (Thanks Mauricio!)
* build now checks for the presence of the signing key and ignores it if
it's not present
* a more efficient Enum.HasFlag using codegen
* added a new ImmutableAttribute which Sasa's runtime analyses respect
* FilePath's encapsulated string now exposed as a property
</pre>
<p>Thanks to Evan/iaiken for a number of bug reports, and to Mauricio for feedback on the NonNull<T> pattern.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-36328492091872828452015-03-02T00:29:00.001-05:002015-04-14T19:04:06.818-04:00Sasa v0.15.0 Released<p>This release features a few new conveniences, a few bugfixes and a vector that's faster than any other I've managed to find for .NET. Here's the changelog since the last v0.14.0 release:</p>
<pre>
* more elaborate benchmarking and testing procedures
* added a minor optimization to purely functional queues that caches
the reversed enqueue list
* FingerTree was switched to use arrays as opposed to a Node type
* optimized FingerTree enumerator
* FingerTree array nodes are now reused whenever possible
* Sasa.Collections no longer depends on Sasa.Binary
* MIME message's body encoding is now defaulted to ASCII if none specified
* added convenient ExceptNull to filter sequences of optional values
* NonNull<T> no longer requires T to be a class type, which inhibited its
use in some scenarios
* fixed null error when Weak<T> improperly constructed
* added variance annotations on some base interfaces
* added a super fast Vector<T> to Sasa.Collections
* sasametal: fixed member name generated for compound keys
* Sasa's Option<T> now integrates more seamlessly in LINQ queries
with IEnumerable<T> thanks to two new SelectMany overloads
* added two new array combinators to extract values without exceptions
* added a few efficient Enumerables.Interleave extensions to interleave
the values of two or more sequences
* decorated many collection methods with purity attributes
* fixed a bug with the Enums<T>.Flags property
</pre>
<p>A forthcoming post on the recent optimizations I've applied to Sasa's immutable collections will show the dramatic performance improvements compared to Microsoft's and F#'s immutable collections. Sasa's vector is an order of magnitude faster than F#'s, and Sasa's trie is actually faster than the standard <em>mutable</em> System.Collections.Generic.Dictionary up to around 100 elements.</p>
<p>As usual, you can find the individual packages on <a href="https://www.nuget.org/packages/Sasa/">nuget</a>, or download the full release via <a href="https://sourceforge.net/projects/sasa/files/sasa/v0.15.0/">Sourceforge</a>. The documentation is available in full as a .CHM file from Sourceforge, or you can <a href="http://higherlogics.com/sasa/docs-v0.15.0/">peruse it online</a>.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-13540995518720677542015-02-24T17:23:00.003-05:002016-11-27T16:34:52.574-05:00µKanren.NET - Featherweight Relational Logic Programming in C#<p><a href="http://webyrd.net/scheme-2013/papers/HemannMuKanren2013.pdf">The µKanren</a> paper is a nice introduction to a lightweight logic programming language which is a simplification of the <a href="http://minikanren.org/">miniKanren</a> family of languages. The existing <a href="https://github.com/wallymathieu/csharp_ukanren">µKanren</a> implementation in C# was a translation from Scheme, and thus is verbose, untyped with lots of casts, and non-idiomatic. I also found most of the other Kanren implementations unnecessarily obscure, heavily relying on native idioms that aren't clear to newcomers.</p>
<p><a href="https://github.com/naasking/uKanren.NET">uKanren.NET</a> provides a clear presentation of the core principles of µKanren using only IEnumerable<T> and lambdas, showing that µKanren's search is fundamentally just a set of combinators for transforming sequences of states. The values of the sequence are sets of bound variables that satisfy a set of equations. For instance, given the following expression:</p>
<pre><code class="language-csharp">Kanren.Exists(x => x == 5 | x == 6)</code></pre>
<p>You can read it off as saying there exists an integer value to which we can bind variable x, such that x equals either 5 or 6 [1]. Solving this equation produces this sequence of values:</p>
<pre><code class="language-csharp">
x[0] = 5,
x[0] = 6,
</code></pre>
<p>where each line represents a different value that satisfies the equation. These equations can be arbitrarily nested or chained using | and &, just like ordinary logical expressions, and µKanren will find a solution for all variables, if a solution exists. A further benefit of this implementation, is that you cannot accidentally mix up boolean equality expressions with relational logic expressions, because relational equality, conjunction and disjunction operators all operate on non-boolean types.</p>
<h1>Logic Variables, Goals & States</h1>
<p>We start with the simple core, which is just a logic variable identified uniquely by an integer, and with a convenient local name:</p>
<pre><code class="language-csharp">public sealed class Kanren
{
internal int id;
// the variable's public name, extracted from the delegate parameter
public string Name { get; internal set; }
public static Goal operator ==(object left, Kanren right)
{
return Equals(left, right);
}
public static Goal operator ==(Kanren left, object right)
{
return Equals(left, right);
}
public static Goal operator ==(Kanren left, Kanren right)
{
return Equals(left, right);
}
...
}
</code></pre>
<p>µKanren.NET variables override the equality operators to generate "Goals". A Goal is represented by a delegate that takes a State and generates a sequence of States:</p>
<pre><code class="language-csharp">public delegate IEnumerable<State> Goal(State state);
</code></pre>
<p>A State is simply a set of variable bindings that satisfy the equations described by the Goal:</p>
<pre><code class="language-csharp">public sealed class State
{
internal Trie<Var, Kanren> subs;
internal int next = 0;
internal Func<Goal> incomplete;
internal Kanren Get(Var x)
{
Kanren v;
return subs.TryGetValue(x, out v) ? v : null;
}
internal State Extend(Var x, Kanren v)
{
return new State
{
subs = subs.Add(x, v),
next = next,
incomplete = incomplete,
};
}
internal State Next()
{
return new State
{
subs = subs,
next = next + 1,
incomplete = incomplete,
};
}
...
}
</code></pre>
<p>Each new variable introduced by Kanren.Exists increments the unique variable index we track in State.next, so each variable gets a unique number. Variable bindings are tracked in an immutable hash map provided by Sasa.Collections.Trie<T> (a fast immutable IDictionary).</p>
<h1>µKanren Operators</h1>
<p>There are three core operations on Goals: constructing a Goal with Kanren.Exists, and combining Goals using either disjunction or conjunction:</p>
<pre><code class="language-csharp">public sealed class Kanren
{
...
public static Goal Exists(Func<Kanren, Goal> body)
{
var arg = body.Method.GetParameters()[0].Name;
return state =>
{
var goal = body(new Kanren { id = state.next, Name = arg });
return goal(state.Next());
};
}
public static Goal Conjunction(Goal left, Goal right)
{
return state => left(state).SelectMany(x => right(x));
}
public static Goal Disjunction(Goal left, Goal right)
{
return state => left(state).Concat(right(state));
}
public static Goal Recurse(Func<Kanren, Goal> body, Kanren x)
{
return state => new[]
{
new State
{
subs = state.subs,
next = state.next,
incomplete = () => body(x)
}
};
}
...
}
</code></pre>
<p>Kanren.Exists just creates a fresh variable using the next number in the sequence, and invokes the function body that produces the Goal. The Goal is then evaluated using the given State, but with an updated variable counter to ensure variables are unique.</p>
<p>Conjunction consists of equations like x == 6 & y == 7. For every State generated by the 'left' Goal, conjunction generates a set of States built from applying the 'right' Goal to that same state. So, x == 6 produces a binding assigning x to 6, which is then passed to the Goal for y == 7, which then adds a binding for y = 7. Thus, "x[0] = 6, y[1] = 7" is produced.</p>
<p>Disjunction consists of equations like x == 6 | x == 7. These Goals are independent and don't interfere with each other, and so any States generated by 'right' are simply appended to the States generated by 'left'. The 'left' Goal produces x = 6, and 'right' produces x = 7, so the result will be:</p>
<pre><code class="language-csharp">
x[0] = 6,
x[0] = 7,
</code></pre>
<p>Because C# features eager evaluation, we have to handle recursion in Kanren expressions manually. We do this via Kanren.Recurse, which takes a Goal constructor and a Kanren variable, and returns an "incomplete" State, which basically means a State that was only partially evaluated and may produce further states if you resume evaluation:</p>
<pre><code class="language-csharp">public sealed class State
{
...
// True if this state is final, such that all bindings have values,
// false if some computation remains to be done.
public bool IsComplete
{
get { return incomplete == null; }
}
// Get the pairs of bound variables and their values.
public IEnumerable<KeyValuePair<Kanren, object>> GetValues()
{
return subs;
}
// Return the remaining stream of states, if any.
public IEnumerable<State> Continue()
{
if (IsComplete) throw new InvalidOperationException("State is complete.");
return incomplete()(this);
}
}
</code></pre>
<p>The Kanren client is currently responsible for handling incomplete States while iterating over the results. It's possible to handle this in a more uniform manner with a little more execution overhead, but uKanren.NET doesn't currently do so.</p>
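<p>A client loop for this might look like the following self-contained sketch, in which Step&lt;T&gt; is a hypothetical stand-in for State (complete with a value, or incomplete with a thunk; the real State exposes the same shape via IsComplete and Continue), and the driver resumes incomplete states breadth-first, bounding the number of results so infinite streams still terminate:</p>

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for State: complete with a value, or incomplete
// with a thunk that yields further states when resumed.
sealed class Step<T>
{
    public T Value;                           // set when complete
    public Func<IEnumerable<Step<T>>> Resume; // set when incomplete
    public bool IsComplete => Resume == null;
}

static class Driver
{
    // Drain states breadth-first: yield complete ones, re-enqueue the
    // continuation of incomplete ones, and stop after 'max' results.
    public static IEnumerable<T> Solve<T>(IEnumerable<Step<T>> states, int max)
    {
        var work = new Queue<IEnumerable<Step<T>>>();
        work.Enqueue(states);
        var count = 0;
        while (work.Count > 0 && count < max)
        {
            foreach (var s in work.Dequeue())
            {
                if (s.IsComplete)
                {
                    yield return s.Value;
                    if (++count == max) yield break;
                }
                else
                {
                    work.Enqueue(s.Resume());
                }
            }
        }
    }
}

static class Demo
{
    // An infinite stream of fives, analogous to the Fives goal later in the post.
    public static IEnumerable<Step<int>> Fives()
    {
        yield return new Step<int> { Value = 5 };
        yield return new Step<int> { Resume = Fives };
    }
}
```

<p>Driver.Solve(Demo.Fives(), 3) yields three fives and then stops, even though the underlying stream is infinite.</p>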
<h1>Unification</h1>
<p>Finally, the real meat of logic programming is resolving equality between logic variables and values. Logic programming solves equations using an algorithm called "unification":</p>
<pre><code class="language-csharp">public sealed class Kanren
{
...
static object Walk(object uvar, State env)
{
// search for final Kanren binding in env
Kanren v;
while (true)
{
v = uvar as Kanren;
if (ReferenceEquals(v, null)) break;
var tmp = env.Get(v);
if (ReferenceEquals(tmp, null)) break;
uvar = tmp;
}
return uvar;
}
static State Unify(object uorig, object vorig, State s)
{
var u = Walk(uorig, s);
var v = Walk(vorig, s);
var uvar = u as Kanren;
var vvar = v as Kanren;
if (!ReferenceEquals(uvar, null) && !ReferenceEquals(vvar, null)
&& uvar.Equals(vvar))
return s;
if (!ReferenceEquals(uvar, null))
return s.Extend(uvar, v);
if (!ReferenceEquals(vvar, null))
return s.Extend(vvar, u);
if (u.Equals(v))
return s;
return null;
}
...
}
</code></pre>
<p>Unification brings together all equality declarations between variables and values and ensures they are all consistent. If two variables are unified, then they are either the same variable, or the variable values must unify. If a variable and value are unified, then that value is bound to that variable in the State by extending it. If two values are unified, then they must be equal to each other. If none of these conditions hold, then a null State is returned, indicating an inconsistent set of equality conditions.</p>
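<p>To make Walk concrete, here is a minimal standalone sketch that uses strings as variables and a plain Dictionary in place of State (both stand-ins; the real implementation tests for the Kanren type instead). With the bindings x → y and y → 5, walking x follows the chain and yields 5, which is how Unify resolves both sides before comparing them:</p>

```csharp
using System.Collections.Generic;

static class WalkDemo
{
    // Follow variable -> value links until reaching either a non-variable
    // value or an unbound variable. In this toy model, every string is a
    // variable.
    public static object Walk(object u, Dictionary<string, object> subs)
    {
        while (u is string v && subs.TryGetValue(v, out var next))
            u = next;
        return u;
    }
}
```

<p>Walking an unbound variable simply returns the variable itself, which is what lets Unify extend the State with a new binding.</p>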
<h1>Wrap Up</h1>
<p>The real uKanren.NET implementation provides a few tweaks to interleave certain streams, and to provide convenient operators on Goals so you can easily combine more sophisticated expressions:</p>
<pre><code class="language-csharp">static Goal Fives(Kanren x)
{
return x == 5 | Kanren.Recurse(Fives, x);
}
static Goal Sixes(Kanren x)
{
return 6 == x | Kanren.Recurse(Sixes, x);
}
static Goal FivesAndSixes()
{
return Kanren.Exists(x => Fives(x) | Sixes(x));
}
// output of FivesAndSixes:
// x[0] = 5,
// x[0] = 6,
// x[0] = 5, x[0] = 5, ... [infinite stream of fives]
// x[0] = 6, x[0] = 6, ... [infinite stream of sixes]
</code></pre>
<p>However, the implementation described above covers the core operation of µKanren, and the rest is just syntactic sugar to make it nice in C#.</p>
<p>[1] The "Exists" operator was called "call/fresh" in the original paper and in most other implementations. I found it easier to read off uKanren expressions as English with the form I settled on.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com4tag:blogger.com,1999:blog-2744072865491516720.post-41388291990607885652015-01-21T15:12:00.000-05:002016-11-27T16:36:02.456-05:00NHibernate: Associations with Composite Primary Keys as Part of a Composite Primary Key<p>NHibernate is a pretty useful tool, but occasionally it's not entirely documented in a way that makes it's flexibility evident. Composite keys are a particularly difficult area in this regard, as evidenced by the numerous articles on the topic.</p>
<p>Most of the existing articles cover this simply enough, but there is one corner case I have yet to see explained anywhere: a composite primary key, one of whose key properties is an association that itself has a composite key. This case is rare and there are ways around it, hence the lack of examples, but as a testament to NHibernate's flexibility, it's possible! Here's the example in code, listing only the primary keys:</p>
<pre><code class="language-csharp">public class MotorType
{
public Horsepower Horsepower { get; protected set; }
public VoltageType VoltageType { get; protected set; }
}
public class Motor
{
public MotorType MotorType { get; protected set; }
public Efficiency Efficiency { get; protected set; }
}
</code></pre>
<p>The tables look like this:</p>
<div style="margin-bottom: 1em">
<table style="float: left" border="1">
<tr><td colspan="2">MotorType</td></tr>
<tr><td>HorsepowerId</td><td>VoltageTypeId</td></tr>
</table>
<table style="float: right" border="1">
<tr><td colspan="3">Motor</td></tr>
<tr><td>HorsepowerId</td><td>VoltageTypeId</td><td>EfficiencyId</td></tr>
</table>
</div>
<p style="clear:both; margin-top: 80px;">Most people would probably map this with the Motor class having the three primary key properties, one for each column, and an additional many-to-one association referencing MotorType, but that shouldn't actually be necessary. Composite primary keys are possible, and the primary key can contain an association, so it should be possible, in principle, for that association to itself have a composite key.</p>
<p>And here's how it's done for the Motors table:</p>
<pre><code class="language-markup"><?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
<class name="Motor, ExampleProject" table="Motors">
<composite-id>
<key-many-to-one name="MotorType">
<column name="HorsepowerId" />
<column name="VoltageTypeId" />
</key-many-to-one>
<key-property name="Efficiency" column="EfficiencyId" />
</composite-id>
</class>
</hibernate-mapping>
</code></pre>
<p>The MotorType class is a simple composite primary key:</p>
<pre><code class="language-markup"><?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
<class name="MotorType, ExampleProject" table="MotorTypes">
<composite-id>
<key-property name="Horsepower" column="HorsepowerId" />
<key-property name="VoltageType" column="VoltageTypeId" />
</composite-id>
</class>
</hibernate-mapping>
</code></pre>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-2090669372310863442014-10-11T11:22:00.001-04:002016-11-27T16:37:52.226-05:00Generalized Multicast Delegates in Pure C#<p><a href="http://msdn.microsoft.com/en-us/library/system.delegate(v=vs.110).aspx">.NET's delegates</a> are a powerful and convenient abstraction available since .NET 1.1. They encapsulate a method pointer and the object the method belongs to in a single callable "function object". <a href="http://msdn.microsoft.com/en-us/library/system.multicastdelegate(v=vs.110).aspx">Multicast delegates</a> are an extension of plain delegates, in that they encompass multiple single delegates. Invoking a multicast delegate invokes every encapsulated delegate, in the order they were added. Multicast delegates are key to event handling patterns that were a core part of the CLR nearly since its inception.</p>
<p>If you're curious about virtual machines or CLR internals, you've perhaps wondered how multicast delegates actually work. These are the key properties of multicast delegates:</p>
<ol><li>They encapsulate a list of delegates of the same delegate type.</li>
<li>Adding a delegate returns a new multicast delegate containing the new addition at the end of the list (the original is unchanged).</li>
<li>Removing a delegate returns a new multicast delegate without the specified delegate (the original is unchanged).</li>
<li>Invoking a multicast delegate invokes each encapsulated delegate, in order, with the given parameters.</li>
</ol>
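<p>These four properties are easy to observe directly in C# (a small self-contained sketch; the names are hypothetical):</p>

```csharp
using System;

static class MulticastDemo
{
    // '+' builds a multicast delegate that invokes each target in order,
    // and both '+' and '-' return NEW delegates, leaving the originals
    // unchanged.
    public static string Run()
    {
        var log = "";
        Action a = () => log += "a";
        Action b = () => log += "b";
        var both = a + b;     // contains [a, b]
        both();               // invokes a then b: log is now "ab"
        var onlyA = both - b; // removal returns a new delegate; 'both' is untouched
        onlyA();              // invokes just a: log is now "aba"
        return log;
    }
}
```

<p>Run() returns "aba": the combined delegate invoked both targets in order, and removing b produced a new delegate without disturbing the original.</p>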
<p>These properties are a straightforward immutable queue of delegates, and if we just wanted something that works, we could just use an immutable queue like Sasa.Fifo<T>. However, the CLR's multicast delegates are insanely fast to invoke by comparison. Here's a simple timing result:</p>
<pre>
= System.MulticastDelegate =
Build = 1,390,819
Run = 53,762
= Delegate Immutable Queue =
Build = 507,871
Run = 527,292
</pre>
<p>This builds up a multicast delegate of 20 items about 200,000 times, then executes that delegate 200,000 times. Building multicast delegates is quite slow on the CLR, but invoking them is an order of magnitude faster than naively traversing a queue and invoking each delegate. Presumably, invoking delegates happens more often than adding and removing delegates, so better invocation speed is a good tradeoff. So how do we achieve it?</p>
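<p>For reference, the naive baseline being measured looks roughly like this (a sketch only: Queue&lt;Action&lt;int&gt;&gt; stands in for an immutable queue like Sasa.Fifo&lt;T&gt;, whose linked structure is what makes traversal slower than a flat array):</p>

```csharp
using System;
using System.Collections.Generic;

static class NaiveMulticast
{
    // Naive baseline sketch: delegates kept in a FIFO and invoked by
    // enumeration. Queue<T> stands in for an immutable queue purely for
    // shape; a persistent queue's enumerator chases linked cells rather
    // than scanning contiguous memory.
    public static int Run()
    {
        var handlers = new Queue<Action<int>>();
        var sum = 0;
        handlers.Enqueue(x => sum += x);
        handlers.Enqueue(x => sum += x * 10);
        foreach (var h in handlers) // invoke each delegate in insertion order
            h(2);
        return sum; // 2 + 2*10 = 22
    }
}
```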
<p>First, let's consider the fastest possible implementation of invocation: arrays. If we preallocate an appropriately sized array of delegate slots, and invoke each entry in a for loop, we get these timings:</p>
<pre>
= Delegate Array =
Build = 71,554
Run = 54,022
</pre>
<p>Exactly equal to multicast delegates, to within reasonable experimental error. So now we know we need some data structure that approximates a queue, but uses contiguous memory for fast traversals during delegate invocation. We can simulate an immutable array queue using <a href="http://higherlogics.com/sasa/docs-v0.14.0/?topic=html/79755932-a6d6-eaa9-9583-449dff5b69f6.htm">Sasa's array combinators</a> (see Append and Remove):</p>
<pre><code class="language-csharp">public struct MulticastSimple<T>
where T : class
{
internal T[] items;
public MulticastSimple<T> Add(T item)
{
return new MulticastSimple<T>
{
items = items == null ? new[] { item } : items.Append(item),
};
}
public MulticastSimple<T> Remove(T item)
{
if (items != null)
{
var i = Array.IndexOf(items, item);
if (i >= 0) return new MulticastSimple<T>
{
items = items.Length == 1 ? null : items.Remove(i)
};
}
return this;
}
}
</code></pre>
<p>It's pretty simple, but let's see how it compares with the built-in multicast:</p>
<pre>
= Multicast Delegate =
Build = 1,334,914
Run = 57,646
= MulticastSimple =
Build = 860,529
Run = 58,011
</pre>
<p>Pretty good! Building MulticastSimple is much faster, and invocation is just as fast as the built-in multicast delegates. Remember that these numbers are for 20 delegates though, and given the cost of adding delegates is O(n), this won't scale very well to large numbers of delegates. Let's see what the numbers look like for 200 delegates iterated 20,000 times:</p>
<pre>
= System.MulticastDelegate =
Build = 1,324,381
Run = 45,683
= MulticastSimple =
Build = 2,545,824
Run = 45,497
</pre>
<p>MulticastSimple is already more expensive than multicast delegates, so now let's see 2,000 delegates iterated 2,000 times:</p>
<pre>
= System.MulticastDelegate =
Build = 1,324,422
Run = 45,293
= MulticastSimple =
Build = 21,115,099
Run = 51,515
</pre>
<p>And here we see the limitations of the naive approach. The cost of generating multicast delegates appears constant regardless of the number of delegates, but the cost for MulticastSimple scales linearly with the number of delegates. However, even 20 delegates in a multicast delegate is uncommon, so optimizing for the rare cases probably isn't worth the effort.</p>
<h2>A Scalable Multicast up to M</h2>
<p>Fortunately, there is a simple extension of the above design which will efficiently support up to M delegates, where M is some configurable number. Basically, efficient delegate addition needs a single small array, and efficient traversal needs an array as well, but there's no need for these two arrays to be the same:</p>
<pre><code class="language-csharp">public struct Multicast<T>
where T : class
{
internal T[][] items;
internal T[] end;
public Multicast<T> Add(T item)
{
if (end == null) return new Multicast<T> { items = items, end = new[] { item } };
if (end.Length < 32) return new Multicast<T>
{
items = items, end = end.Append(item)
};
// 'end' is full: push it into the nested array and restart 'end'
return new Multicast<T>
{
items = items == null ? new T[][] { end } : items.Append(end),
end = new[] { item },
};
}
public Multicast<T> Remove(T item)
{
// check the small 'end' array first
if (end != null)
{
var i = Array.IndexOf(end, item);
if (i >= 0) return new Multicast<T>
{
items = items, end = end.Length == 1 ? null : end.Remove(i)
};
}
// then search the nested arrays
if (items != null)
{
for (int i = 0; i < items.Length; ++i)
{
var j = Array.IndexOf(items[i], item);
if (j >= 0) return new Multicast<T>
{
end = end,
items = items[i].Length == 1
? items.Remove(i)
: items.Set(i, items[i].Remove(j))
};
}
}
return this;
}
public T[][] Items { get { return items; } }
public T[] End { get { return end; } }
public static Multicast<T> operator +(Multicast<T> lhs, T rhs)
{
return lhs.Add(rhs);
}
public static Multicast<T> operator -(Multicast<T> lhs, T rhs)
{
return lhs.Remove(rhs);
}
}
</code></pre>
<p>So we keep a small array of up to 32 items for efficient appends (field 'end'), and once more than 32 delegates accumulate, this overflows into a nested array (field 'items') and the 'end' array is restarted. So appends have the same low constant factor overheads as MulticastSimple, but scale linearly at a factor of n/32 instead of just n. This is still fundamentally O(n), but the constant factor overheads of built-in multicast delegate appends are so high that they only beat out this implementation around 10,000 delegates:</p>
<pre>
= System.MulticastDelegate =
Build = 1,479,262
Run = 44,318
= Multicast<T> =
Build = 1,778,939
Run = 58,599
</pre>
<p>A multicast delegate with 10,000 entries is more than any reasonable program will ever need. At 200 delegates:</p>
<pre>
= System.MulticastDelegate =
Build = 1,445,167
Run = 44,934
= Multicast<T> =
Build = 949,040
Run = 57,367
</pre>
<p>At 20 delegates:</p>
<pre>
= System.MulticastDelegate =
Build = 1,411,269
Run = 55,544
= Multicast<T> =
Build = 853,145
Run = 57,036
</pre>
<p>And probably the most likely case, 4 delegates:</p>
<pre>
= System.MulticastDelegate =
Build = 979,093
Run = 80,781
= Multicast<T> =
Build = 632,594
Run = 77,277
</pre>
<h2>Multicast Invocation</h2>
<p>Normally the C# compiler produces the code needed to invoke any delegate type, and we can achieve something similar with T4 templates and reusing the built-in System.Action overloads. First, let's look at the basic invocation for System.Action:</p>
<pre><code class="language-csharp">public static void Invoke(this Multicast<Action> m)
{
for (var i = 0; m.items != null && i < m.items.Length; ++i)
{
var x = m.items[i];
for (var j = 0; j < x.Length; ++j)
{
x[j]();
}
}
for (var i = 0; m.end != null && i < m.end.Length; ++i)
m.end[i]();
}
</code></pre>
<p>Pretty straightforward: just iterate over each nested array and invoke the delegates. The invocation for an action with a parameter consists of just adding the parameter to invoke:</p>
<pre><code class="language-csharp">public static void Invoke<T0>(this Multicast<Action<T>> m, T0 arg0)
{
for (var i = 0; m.items != null && i < m.items.Length; ++i)
{
var x = m.items[i];
for (var j = 0; j < x.Length; ++j)
{
x[j](arg0);
}
}
for (var i = 0; m.end != null && i < m.end.Length; ++i)
m.end[i](arg0);
}
</code></pre>
<p>And so on for all System.Action overloads up to 16 type arguments. We can automate generating these overloads using T4 templates.</p>
<h2>The Semantics of Multicast System.Func</h2>
<p>Multicast delegates on the CLR only have proper semantics for delegates that return no value. If you combine multiple delegates that return a value, then only the value from the last delegate is ever returned. But in principle, executing a list of delegates should generate a list of results. And the advantage of pure C# multicast delegates is that implementing this semantics is trivial!</p>
<pre><code class="language-csharp">public static IEnumerable<T> Invoke<T>(this Multicast<Func<T>> m)
{
for (var i = 0; m.items != null && i < m.items.Length; ++i)
{
var x = m.items[i];
for (var j = 0; j < x.Length; ++j)
{
yield return x[j]();
}
}
for (var i = 0; m.end != null && i < m.end.Length; ++i)
yield return m.end[i]();
}
</code></pre>
<p>This is basically the same C# code as Action invocations, but we now exploit C#'s generator syntax to easily return lists of values while executing all the encapsulated delegates. Like System.Action, we can generate all the necessary overloads for System.Func using T4 templates.</p>
<h2>Summary</h2>
<p>We haven't reproduced precisely the internal multicast implementation, but we have something incredibly simple that's just as fast for all foreseeable programs, and it's in safe C# to boot. Furthermore, we've extended multicast delegates to include a more coherent and complete semantics for delegates that return values, all in about 150 lines of code.</p>
<p>You can download the full source code for this post <a href="http://higherlogics.com/software/Multicast.zip">here</a>, including MulticastSimple and Multicast with T4 templates for the latter.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0tag:blogger.com,1999:blog-2744072865491516720.post-55992196059130674932014-09-25T16:03:00.000-04:002014-09-26T11:17:35.751-04:00Sasa v0.14.0 Released<p>A quick release, fixing one bug and adding some performance improvements and some new collection types. Here's the changelog:</p>
<pre>
* added a faster and more compact hash array mapped trie under Sasa.Collections.Trie to replace Sasa.Collections.Tree
* added Sasa.Collections.FingerTree
* added purity attributes throughout Sasa (for .NET 4)
* expanded various test suites and documentation
* added a persistent vector, Sasa.Collections.Vector<T>
* factored out Atomics.BeginRead/BeginWrite read/write protocol so it can be used in more flexible scenarios
* Sasa.Dynamics.Type<T>.MaybeMutable is now computed eagerly
* added an efficient, mutable union-find data structure
* fixed: Uris.UriDecode now handles the incorrect but common UTF16 encoding
* moved Sasa.LockSet under Sasa.Concurrency.LockSet
</pre>
<p>The Sasa.Collections.Tree type is no longer available, and is replaced by the faster and more appropriately named Sasa.Collections.Trie. The original tree was actually a trie, and I didn't want it confused with a future search tree type which provides different operations. The FingerTree is an experimental addition, as is Sasa.Collections.Vector which provides a fast immutable array.</p>
<p>Edit: the online docs for the new release are also available <a href="http://higherlogics.com/sasa/docs-v0.14.0/">here</a>.</p>Sandro Magihttp://www.blogger.com/profile/05446177882449578817noreply@blogger.com0