Tuesday, April 3, 2007

Visitor pattern considered harmful (unless functionalized)

To kick off my first post, I thought I'd cover a post on another blog that I found interesting.

Basically, the gist of the post is that object oriented visitor pattern does not lend itself to the problem of walking trees, and general symbolic manipulation, despite that being its claim to fame. In general, algebraic data types and pattern matching are much simpler, and more concise.

I agree, however I think that post goes a bit too far in saying that visitors simply can't do this elegantly, because in fact there's a straightforward translation of a pattern matching function into a visitor, as long as you keep your objects "functional", ie. treat them as variants.

For instance, let's say we are constructing an evaluator:

//the Expression type
public abstract class Exp
public abstract Val Visit(IVisitor v);

//the type of Values
public abstract class Val : Exp

//the type representing an application
public sealed class App : Exp
Exp e0;
Exp e1;

public App(Exp e0, Exp e1)
this.e0 = e0;
this.e1 = e1;

public override Val Visit(IVisitor v)
return v.App(e0, e1);

//the pattern matching visitor
public interface IVisitor
Val App(Exp e0, Exp e1);

This is slightly different from the typical visitor pattern you see in books, where the object itself would be passed in to IVisitor.App(). In fact, this looks almost exactly like a pattern matching function in a language like OCaml:

type Exp = App of Exp * Exp | Val   (* | ... *)

let rec eval exp =
match exp with
App e0 e1 -> (* ... evaluate the application *);;

Thus, properly designed, a visitor is simply a pattern matching function. By convention, we use the class name as the visitor's method, so it matches with the convention in functional language's (ie. the constructor name). The visitor method's parameters are the object's private fields, just like a variant. I'm using this approach for an interpreter I'm implementing and it works quite well (unfortunately no source is publicly available at the moment).

In terms of efficiency, the pattern matching function is probably faster. After all, it's simply a tag check plus a branch, whereas the visitor must perform a double dispatch (two method lookups incurring two indirections and two indirect branches).

Clearly, the visitor is also far more verbose, as you must create the "variant classes", manually add a "Visit" method to each, and since there is no "default" case as in pattern matching, like the "default" case in a switch statement, you have to exhaustively handle each variant explicitly in each visitor, even if the case isn't relevant. You can mitigate the latter problem by implementing a "Default" abstract base class with dummy implementations on all cases, and then override only the cases you are interested in:

public abstract class Default : IVisitor
//overridden in my Interpreter visitor
public virtual Val App(Exp e0, Exp e1)
throw new NotSupportedException();

As a potentially interesting extension, the variant classes may not even have to be in the same inheritance hierarchy, they only have to implement a marker interface with the appropriate Visit() method. This may increase extensibility along similar lines to polymorphic variants; only the visitor has to be kept up to date with all the relevant cases, and it can evolve separately from the core.


So if you're doing symbolic manipulation, and are forced to operate in a OO language like C# above, you can still approximate the elegance of algebraic data types and pattern matching, it just takes a little more work on your part. Pretty much par for the course in the OO world. :-)

No comments: