Skip to main content

Visitor Pattern Deprecated: First-Class Messages Are All You Need!

In previous posts [1,2], I argued that the visitor pattern was a verbose, object-oriented equivalent of a pattern matching function. The verbosity stems from the need to add the dispatching code to each data class in the hierarchy, and because the encapsulation inherent to objects is awkward when dealing with pure data.

A single addition to an OOP language could completely do away with the need for the dispatching code and make OO pattern matching simple and concise: first-class messages (FCM). By this I mean, messages sent to an object are themselves 'objects' that can be passed around as parameters.

To recap, functional languages reify the structure of data (algebraic data types), and they abstract operations (functions). OOP languages reify operations (interfaces), but they abstract the structure of data (encapsulation). They are duals of one another.

All the Exp data classes created in [1,2] shouldn't be classes at all, they should be messages that are sent to the IVisitor object implementing the pattern matching:

data Exp = App(e,e) | Int(i) | ...

IVisitor iv = ...

//the expression: 1 + 2
Exp e = App(Plus(Int(1), Int(2)));

//'e' is now the "App" operation placed in a variable
iv.e //send "App" to visitor

FCM removes the inefficient double-dispatch inherent to the visitor pattern, while retaining object encapsulation for when you really need it; the best of both worlds.

Note also how the above message declaration looks a great deal like a variant in OCaml? This is because first-class messages are variants, and pattern-matching functions are objects. I'm actually implementing the DV language in that paper, first as an interpreter, then hopefully as a compiler for .NET. To be actually useful as a real language, DV will require some extensions though, so stay tuned. :-)

[Edit: made some clarifications to avoid confusion]

Comments

Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C!Features:It's 100% portable C.It requires very little state (2 bytes).It's not dependent on an OS.It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved.#include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground breaking. I've mad…

Building a Query DSL in C#

I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would filter out the results where either Bar is less than or equal to 99, or Foo is not "some string".This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a limited filtering API:It does not support disjunctions, ie. "OR" clauses.It does not support filtering using inequalities on more than one property.It does not support a not-equal operation.So in this post, I will describe the design which achieves the following goals: A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR").Implementations of this…

Easy Reverse Mode Automatic Differentiation in C#

Continuing from my last post on implementing forward-mode automatic differentiation (AD) using C# operator overloading, this is just a quick follow-up showing how easy reverse mode is to achieve, and why it's important.Why Reverse Mode Automatic Differentiation?As explained in the last post, the vector representation of forward-mode AD can compute the derivatives of all parameter simultaneously, but it does so with considerable space cost: each operation creates a vector computing the derivative of each parameter. So N parameters with M operations would allocation O(N*M) space. It turns out, this is unnecessary!Reverse mode AD allocates only O(N+M) space to compute the derivatives of N parameters across M operations. In general, forward mode AD is best suited to differentiating functions of type:RRNThat is, functions of 1 parameter that compute multiple outputs. Reverse mode AD is suited to the dual scenario:RN → RThat is, functions of many parameters that return a single real …