Friday, December 28, 2007

ML Modules in C#

I've written about the limitations of C#'s equational constraints before. Truth is, I now believe that any such limits can be circumvented by a relatively simple translation. The result is less "object-oriented", as it requires a set of cooperating objects instead of being encapsulated in a single object.

Let's take a simple, unsafe list flattening operation described in Generalized Algebraic Data Types and Object-Oriented Programming (GADTOOP). This can be expressed in OCaml as:
module List = struct
  type 'a t = Nil | Cons of 'a * 'a t
  let rec append l r = match l with
    | Nil -> r
    | Cons (x, xs) -> Cons (x, append xs r)
  let rec flatten la = match la with
    | Cons (a, l) -> append a (flatten l)
    | Nil -> Nil
end

The argument to flatten, la, is a list of lists of type 'a. However, there is no way to express this in C# without unrestricted equational constraints as I described earlier. Here is the translation to C# from GADTOOP:
public abstract class List<T> {
  ...
  public abstract List<T> Append(List<T> that);
  public abstract List<U> Flatten<U>();
}
public class Nil<A> : List<A> {
  ...
  public override List<U> Flatten<U>() { return new Nil<U>(); }
}
public class Cons<A> : List<A> {
  A head; List<A> tail;
  ...
  public override List<U> Flatten<U>() {
    Cons<List<U>> This = (Cons<List<U>>) (object) this;
    return This.head.Append(This.tail.Flatten<U>());
  }
}

Note how Flatten requires an unsafe, ugly cast to and from object to get around C#'s type limitations. However, there is a safe way. Here is the translation:
static class List {
  public abstract class t<A> {}
  public sealed class Nil<A> : t<A> {}
  public sealed class Cons<A> : t<A> {
    internal A head; internal t<A> tail;
    public Cons(A head, t<A> tail) { this.head = head; this.tail = tail; }
  }
  public static t<A> Append<A>(t<A> l, t<A> r) {
    return l is Nil<A> ? r
         : new Cons<A>(((Cons<A>)l).head, Append(((Cons<A>)l).tail, r));
  }
  public static t<A> Flatten<A>(t<t<A>> l) {
    return l is Nil<t<A>> ? new Nil<A>()
         : Append(((Cons<t<A>>)l).head, Flatten(((Cons<t<A>>)l).tail));
  }
}

Basically, the step-wise translation I propose:
  1. move the generic class I into a nested generic class O.I
    class I<T> {}
    // becomes
    class O {
        class I<T> {}
    }

  2. move the instance methods of that generic class into generic methods of the enclosing class, I.m -> O.m
    class I<T> {
        I<T> m<U>(U u) { ... }
    }
    // becomes
    class O {
        class I<T> {}
        static I<T> m<T, U>(I<T> i, U u) { ... }
    }

  3. mark all relevant state of I "internal", so the outer class O can access it.
    class I<T> {
        private T value;
        public T get() { return value; }
    }
    // becomes
    class O {
        class I<T> { internal T value; }
        public static T get<T>(I<T> i) { return i.value; }
    }

  4. now you can express arbitrary constraints on the type structure in the methods of O.
    class O {
        class I<T> {}
        static I<T> m2<T, U>(I<I<T>> i, I<I<U>> u) { ... }
    }

Because the external methods do not depend on an implicit 'this' argument, we no longer need equational constraint refinements to express the required type structure. Ordinary C# types suffice!

If you look a little more closely, you'll also note a much closer symmetry between this version of the code and the OCaml code. In fact, this translation essentially builds ML modules in C#!
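The same module shape can be sketched in runnable Java, which has comparable nested classes and generic static methods. This is a minimal sketch, not the GADTOOP code: the names ListModule, L, append, and flatten are mine.

```java
// Java sketch of the translated "List module" (illustrative names).
// The point: flatten's parameter type L<L<A>> states the "list of lists"
// constraint directly in the signature, with no cast through Object.
final class ListModule {
    static abstract class L<A> {}                        // OCaml's 'a t
    static final class Nil<A> extends L<A> {}
    static final class Cons<A> extends L<A> {
        final A head; final L<A> tail;
        Cons(A head, L<A> tail) { this.head = head; this.tail = tail; }
    }
    // append : 'a t -> 'a t -> 'a t
    static <A> L<A> append(L<A> l, L<A> r) {
        if (l instanceof Nil) return r;
        Cons<A> c = (Cons<A>) l;
        return new Cons<>(c.head, append(c.tail, r));
    }
    // flatten : 'a t t -> 'a t -- the type an instance method on 'this'
    // cannot express without the unsafe cast shown earlier
    static <A> L<A> flatten(L<L<A>> l) {
        if (l instanceof Nil) return new Nil<>();
        Cons<L<A>> c = (Cons<L<A>>) l;
        return append(c.head, flatten(c.tail));
    }
}
```

Because flatten is a static function of the enclosing "module" rather than a method of the list, its argument type can be constrained freely.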

Like the modules of most widely-used module systems, these modules are clearly "second-class". Only rebinding the name is possible, via C#'s "using" directive, and everything is resolved and fixed at compile-time. By this, I mean that you can import different list implementations and the code will compile cleanly as long as the other list implementation defines the same operations with the same type signatures:
using L = List;
// or, swapping in a different implementation:
using L = AnotherList;

The compile-time restriction is primarily due to the declaration as a static class. Making the class non-static permits runtime binding of concrete implementations to signatures, so it's a little more flexible, and just as safe. Loosening this restriction may also make runtime "functors" possible.
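What dropping "static" buys can be sketched in Java terms: the signature becomes an interface, a structure becomes a class implementing it, and client code written against the signature works with any structure bound at runtime. All names below (ListSig, ArrayListModule, ListClient) are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Signature: what any list "module" must provide (illustrative).
interface ListSig<A, R> {
    R nil();
    R cons(A head, R tail);
    int length(R l);
}

// One structure implementing the signature; the binding to the
// signature is now an ordinary runtime value.
final class ArrayListModule<A> implements ListSig<A, List<A>> {
    public List<A> nil() { return new ArrayList<>(); }
    public List<A> cons(A head, List<A> tail) {
        List<A> l = new ArrayList<>();
        l.add(head);
        l.addAll(tail);
        return l;
    }
    public int length(List<A> l) { return l.size(); }
}

// A "functor"-ish client: parameterized by the module it uses.
final class ListClient {
    static <A, R> R fromTwo(ListSig<A, R> m, A x, A y) {
        return m.cons(x, m.cons(y, m.nil()));
    }
}
```

Swapping in a different ListSig implementation requires no changes to ListClient, which is the runtime flexibility the static-class version lacks.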

I've used this pattern to complete Orc.NET, because the 'where' combinator required an inexpressible dependency. You can see my use in the Orc interpreter. The "Orc" object in Interp.cs is essentially an "expression builder", and I suspect that all such "builder" implementations are really ML modules at their core.

An open question is the interaction of inheritance with such modules. Seems like inheritance is a particular type of functor from structure S to S.

In any case, if you need type constraints which are inexpressible in C#, then make them ML modules using the above translation, and add object-orientedness back in incrementally. On a final note, I find it amusing that OO languages must resort to functional techniques to resolve fundamental OO limitations. I'd much prefer if we could just use functional languages instead and forgo all the hassle. ;-)

Monday, November 12, 2007

Generalizing C# Generics

In previous posts, I had commented on certain non-sensical limitations in the C# type system, particularly with regard to equational constraints on generic type parameters; these unfortunate limitations significantly reduce the expressiveness of well-typed solutions.

Microsoft Research had actually already tackled the problem in their 2006 paper Variance and Generalized Constraints for C# Generics. Taking inspiration from Scala, they generalize class and method parameter constraints with arbitrary subtyping relations, and they further add use-constraints on generic methods. This increased expressiveness should address the problems I alluded to in my previous posts; if only the changes were integrated into the .NET VM and C#... :-)

[Edit: figures that LTU already covered this paper]

Wednesday, October 24, 2007

On the Importance of Purity

The benefits of advanced programming languages are sometimes difficult to grasp for everyday programmers. The features of such languages, and how they relate to industrial software development, can be hard to understand, especially since the arguments are couched in terms such as "referential transparency", "totality", "side-effect-free", "monads", "non-determinism", "strong static typing", "algebraic data types", "higher-order functions", "laziness/call-by-need", and so on.

Many of these features are attributed to "pure" languages, but purity is also a nebulous concept. I will explain the importance of a number of these features and how they impact the everyday programmer's life.

The Benefits of Referential Transparency

Referential transparency (RT) is a simple principle with profound consequences. Essentially, RT dictates that functions may access only the parameters they're given, and the only effect functions may have, is to return a value.

This may not sound very significant, but permitting functions that are not RT is a nightmare for software developers. This fact is simply not obvious because most mainstream languages are not referentially transparent.

Consider a fairly benign scenario: you're in a loop and you call a function, and the function suddenly changes your loop index variable even though you didn't pass it to the function! This generally doesn't happen, but it could in languages like C/C++. The only reason it's not prevalent is that functions are RT with respect to local variables allocated on the stack, and most languages other than C/C++ enforce this property.

I'm sure it's not difficult to imagine the horrors if any function you call could modify any of your local variables: you could no longer rely on the variables holding correct values at any time. How could you rely on anything? Mutable global variables are a good example of this mess. It's common wisdom that one should avoid globals. Why? Because they're not RT, and since their values could change at any time, they're unreliable.

If a client calls you with a problem in your program written in an RT language, you immediately know that the problem lies in the module providing that feature, say B, and perhaps even in the particular function B.F performing the given operation. In a non-RT language, a completely different module C could be interfering with module B by changing its state behind the scenes. Debugging is thus much easier with RT.

Consider a more malicious scenario, where a function does some useful computation, but also deletes all of your files behind your back. You download this library because it seems useful, only to lose your entire machine. Even worse, it may only delete files when deployed to a server. Or perhaps it installs some adware. This situation is only possible because the function is not referentially transparent. If it were, the only way it could have deleted your files is if you had given it a handle to a directory.

Those of you well-versed in the security field might recognize this constraint: it is the same authority propagation constraint underlying capability security.

However, the properties of full referential transparency are stronger than those of capability security: where the latter permits non-determinism as long as the facility providing it is accessed via a capability, full referential transparency requires "pure functions", which are fully deterministic.

The Benefits of Determinism

So why is determinism important? It's important that everything in your program be reproducible, and thus testable. No arguments with that statement, I'm sure.

Consider a function that, whenever called with the same parameters, returns a completely different value. How could you ever produce a useful program if every function were like that? How could you even test it?

Most developers have experienced the problems of non-determinism when a client calls in with a problem that is not reproducible even by retracing the exact same steps. If the program were deterministic, then retracing those exact steps would always reproduce the error, no exception. I can't overstate how essential reproducibility is for testing and quality assurance purposes.

However, non-deterministic functions do exist. We've all used such functions at one time or another. Consider a function that returns the value of the clock, or consider the random() function. If we want to use a RT, deterministic language, we need some way to use non-deterministic functions. How do we reconcile these two conflicting ends? There are a number of different ways, but all of them involve controlling and isolating non-determinism in some way.

For instance, most capability languages permit non-determinism, but the source of non-determinism can only be accessed via a capability which must be explicitly granted. Thus, you know that only the modules that were granted access to a source of non-determinism can behave non-deterministically and every other module in the program is completely deterministic. What a relief when debugging!

Essentially, this means that any source of non-determinism cannot be globally accessible, or ambient. So for you C/C++ programmers, rand() and getTime() are not global functions, they are function pointers that must be passed explicitly to the modules that need them. Only main() has access to all sources of non-determinism, and main() will pass on the appropriate pointers to the authorized modules.
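A minimal Java sketch of this discipline, using a made-up Logger module whose only source of non-determinism is a clock handed to it explicitly:

```java
import java.util.function.LongSupplier;

// Sketch: sources of non-determinism are passed in explicitly rather
// than reached ambiently (Logger and its API are illustrative).
final class Logger {
    private final LongSupplier clock;   // the sole source of non-determinism
    Logger(LongSupplier clock) { this.clock = clock; }
    String stamp(String msg) { return clock.getAsLong() + ": " + msg; }
}
```

main() would construct `new Logger(System::currentTimeMillis)`; a test can pass a fixed clock such as `() -> 42L`, at which point the module is fully deterministic and its output reproducible.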

Purely functional languages like Haskell take a similar approach to capability languages. RT is absolutely essential to purely functional languages, and any source of non-determinism violates RT. Such languages have a construct which was mathematically proven to isolate and control non-determinism: monads. It's not important what a monad is, just consider it as a design pattern for purely functional languages that preserves RT in the presence of non-determinism.

I would say that determinism is strictly more important than RT, but that RT is currently the best known method of achieving determinism. Now if your client calls you with an irreproducible problem, at least you've narrowed the field to only the modules that use sources of non-determinism.

The Benefits of Totality

What is a total function? A function is total if it is defined for all possible values of its inputs, no exception. Integer addition is a total function: regardless of the values of the two input integers, adding them produces a valid result.

By contrast, many functions are partial functions. Here, "partial" means that not all values passed to the function are valid. For instance, integer division is defined for all integers except zero. Dividing by zero is considered an error and its behaviour is undefined. Division is thus a partial function.

Using total functions always results in defined behaviour. Using partial functions sometimes results in undefined behaviour, if they are applied to invalid values. Thus, the more total functions you use, the more likely your program will run without generating errors.

If a language forces all of your functions to be total, then your program will have no undefined behaviour at all. You are forced to consider and handle all possible error conditions and edge cases, in addition to the program's expected normal operating mode. No undefined or unknown behaviour is possible.

As you can imagine, totality inevitably produces more reliable software. In C you are free to ignore errors and segfault your program, but with total functions you can't ignore those errors.

Unfortunately, totality can be a serious burden on the developer, which is why partial programming is more prevalent. Exceptions were invented to help deal with invalid inputs to partial functions: don't segfault, throw an exception! They do permit more modular code to be written, since errors can propagate and be handled at higher levels. Unfortunately, unchecked exceptions, where the exceptions a function can throw are not defined in the function's signature, just bring us back to square one: uncaught exceptions result in undefined behaviour, but the language doesn't help us by telling us what exceptions we need to catch.

Correct software inevitably requires totality of some sort. How do we transform a partial function into a total function? Here's how we can make division total:
fun divide(n,d) : int * int -> int

This snippet is the signature for the division function. It takes two integers, n and d, and returns an integer. No mention is made of the error condition in the signature when d=0. To make divide total, we transform it to the following:
data Result = Defined(x) | Undefined
fun divide(n,d) : int * int -> Result =
  if d == 0 then
    return Undefined
  else
    return Defined( n/d )

So now, any code that calls divide must deconstruct the return value of divide into either a Defined result x, or an Undefined result indicating an invalid input:
fun half(x) =
  match divide(x,2) with
    Undefined -> print "divide by zero!"
  | Defined(y) -> print "x/2=" + y

As you can see, a total divide function forces you to handle all possible cases, and so any program using it will never have undefined behaviour when dividing by zero. It will have whatever behaviour you specify on the Undefined branch. No efficiency is lost as any decent compiler will inline the above code, so the cost is just an integer compare against 0.
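The same transformation can be sketched in runnable Java, with Optional standing in for the Result type above (TotalMath is an illustrative name, not a real library):

```java
import java.util.Optional;

// A total divide: every input, including d = 0, yields a defined value.
final class TotalMath {
    static Optional<Integer> divide(int n, int d) {
        if (d == 0) return Optional.empty();   // the Undefined case
        return Optional.of(n / d);             // the Defined case
    }
}
```

Callers must deconstruct the Optional (isPresent/get, or map/orElse), so the divide-by-zero case cannot be silently ignored.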

A similar strategy can be used for errors that come from outside the program as well. Consider the file open() function. It's a partial function:
fun open(name) : string -> FILE

It can be transformed into a total function as follows:
data FileOpenResult = File f | PermissionDenied | FileNotFound | ...
fun open(name) : string -> FileOpenResult
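A Java sketch of this open() transformation, collapsing the "..." cases for brevity (the case names follow the text, but the hierarchy itself is hypothetical):

```java
import java.io.FileNotFoundException;
import java.io.FileReader;

// FileOpenResult as a closed class hierarchy: every outcome is a value.
// PermissionDenied and the other "..." cases are omitted for brevity.
abstract class FileOpenResult {}
final class Opened extends FileOpenResult {
    final FileReader file;
    Opened(FileReader file) { this.file = file; }
}
final class FileNotFound extends FileOpenResult {}

final class TotalFiles {
    // Wraps the partial open() into a total function by converting the
    // exception into an ordinary result value.
    static FileOpenResult open(String name) {
        try {
            return new Opened(new FileReader(name));
        } catch (FileNotFoundException e) {
            return new FileNotFound();
        }
    }
}
```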

If all partial functions are made total using the above technique, then all of your software will have fully defined behaviour. No more surprises! To a certain extent, totality even mitigates some problems with non-determinism: after all, who cares if the output is not reproducible since you are forced to handle every possible case anyway.

At first glance, totality is not appealing to the lazy programmer. But if you've ever had to develop and maintain complex software, you'll soon appreciate your language forcing you to deal with all possible cases at development time, instead of dealing with irate customers after deployment.

For all the lazy programmers out there, consider this: totality almost eliminates the need for testing. As a lazy programmer myself, that's a selling point I can support. ;-)

The Benefits of Strong Static Typing

What is strong static typing? Static typing is an analysis a compiler performs to ensure that all the uses of data and functions in a program are consistent. In other words, you're not saying X in one place, and then saying Y which contradicts X in another place.

I take strong static typing to mean a static type analysis that the developer cannot circumvent. C/C++ are statically typed, but their static type systems are not strong because you can freely cast between types, even when doing so is undefined. C#, Java, and similar languages are statically typed, but they are on the borderline of strong typing: there is no way to defeat the type system, but the static analysis is weak since you can still cast. Casting just introduces dynamic runtime checks and runtime errors.

There are many arguments for and against strong static typing, but in combination with the previously discussed features, static typing enables developers to write software that is "almost correct by construction". By this I mean that if your program compiles, it will run without errors, and it has a high probability of actually being correct.

The reason for this is some computer science black magic called the Curry-Howard isomorphism. What that is exactly isn't important. Just consider a static type system to be a built-in logic machine: when you declare and use types this logic machine is making sure that everything you're telling it is logically consistent. If you give it a contradiction, it will produce a type error. Thus, if your whole program type checks, it is logically consistent: it contains no contradictions or logical errors in the statements the logic machine understands, and the logic machine constructed a proof of this fact.

The power of the static type system dictates how intelligent the logic machine is: the more powerful the type system, the more of your program the machine can test for consistency, and the closer your program is to being correct. Your program essentially becomes a logical specification of the problem you are solving. You can still make errors in the specification, but those errors must not contradict anything else in the specification for it to type check.

The downside of strong static typing is that the analysis is necessarily conservative, meaning some legitimate programs will produce a type error even though they would not generate an error at runtime.

Such is the price to pay for the additional safety gained from static typing. As a lazy programmer I'm willing to pay that price.


These are some of the less common idioms found in advanced programming languages. Other idioms are either more well known and have already been adopted into other languages (algebraic datatypes, higher-order functions), or are simply less relevant for constructing reliable software (laziness).

Absolute purity may ultimately prove to be unachievable, but its pursuit has given us a number of powerful tools which significantly aid in the development and maintenance of reliable, secure software.

[Edit: clarified some statements based on some feedback from LTU]

Monday, October 8, 2007

Reading PDFs on the iPhone the moderately difficult way

Taking a break from my programming language blogging, I thought I'd describe a recent adventure with my new iPhone: reading PDFs. The iPhone can read PDFs, but it insists on doing so only when reading the PDF from the network in some way, i.e. via the mail client or Safari.

Being the programming language enthusiast I am, I have plenty of papers stored locally on my machine which I'd like to read on the go, and network access is less than desirable for obvious reasons. Unfortunately, Apple disabled the obvious answer to local browsing: using the "file://" URI scheme. Very stupid of them IMO.

Fortunately, I'm an "unscrupulous" person, because I jailbroke my iPhone so I could install third-party apps. So if network access is required to view PDFs in Safari, then I'll just have to access local files over the network! The way that's been done for over 20 years is available on the iPhone: a web server.

Both Apache and lighttpd are available on the iPhone, and I chose the latter; I just find its configuration less obtuse than Apache's. You will need OpenSSH installed as well. I also recommend UICtl so you can load/unload the web server only when you need it.

Once everything is installed, I ssh'd to the iPhone, opened the lighttpd config file at /usr/local/etc/lighttpd.conf, changed the root directory to point to /var/root/PDFs (or wherever you like), and added the following line to enable directory browsing:
dir-listing.activate = "enable"
Then, using UICtl, I unloaded and reloaded lighttpd.
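For reference, the relevant part of lighttpd.conf ends up looking something like this (the port number is an assumption; only the dir-listing line is quoted from above):

```
# illustrative lighttpd.conf fragment -- adjust paths/port to taste
server.document-root = "/var/root/PDFs"
server.port          = 3000
dir-listing.activate = "enable"
```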

Finally, I opened up Safari on the iPhone and bookmarked

Voila! I have access to all my local PDFs via Safari. :-)

Browsing and viewing PDFs is very easy, unlike other schemes using the "data:" URI scheme. Of course, the setup is moderately difficult for anybody who isn't versed in the basics of unix and the command shell.

Naturally, I'd much prefer a native app, or at the very least, local browsing via the file:// URI scheme in Safari. I'm keeping my fingers crossed. :-)

Friday, September 28, 2007

Visitor Pattern Deprecated: First-Class Messages Are All You Need!

In previous posts [1,2], I argued that the visitor pattern was a verbose, object-oriented equivalent of a pattern matching function. The verbosity stems from the need to add the dispatching code to each data class in the hierarchy, and because the encapsulation inherent to objects is awkward when dealing with pure data.

A single addition to an OOP language could completely do away with the need for the dispatching code and make OO pattern matching simple and concise: first-class messages (FCM). By this I mean that messages sent to an object are themselves 'objects' that can be passed around as parameters.

To recap, functional languages reify the structure of data (algebraic data types), and they abstract operations (functions). OOP languages reify operations (interfaces), but they abstract the structure of data (encapsulation). They are duals of one another.

All the Exp data classes created in [1,2] shouldn't be classes at all, they should be messages that are sent to the IVisitor object implementing the pattern matching:

data Exp = App(e,e) | Int(i) | Plus(e,e) | ...

IVisitor iv = ...

//the expression: 1 + 2
Exp e = Plus(Int(1), Int(2));

//'e' is now the "Plus" message held in a variable
iv.e //send the message to the visitor

FCM removes the inefficient double-dispatch inherent to the visitor pattern, while retaining object encapsulation for when you really need it; the best of both worlds.

Note also how the above message declaration looks a great deal like a variant declaration in OCaml. This is because first-class messages are variants, and pattern-matching functions are objects. I'm actually implementing the DV language in that paper, first as an interpreter, then hopefully as a compiler for .NET. To be actually useful as a real language, DV will require some extensions though, so stay tuned. :-)
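That correspondence can be sketched in Java: each variant is a "message" dispatched to a pattern-matching object. This is a hand-rolled illustration (the names ExpCases, IntLit, and Plus are mine, not from the DV paper):

```java
// The pattern-matching function, reified as an object: one case per variant.
interface ExpCases<R> {
    R intLit(int i);
    R plus(Exp l, Exp r);
}

// An expression "sends itself" as a message to the matcher.
interface Exp {
    <R> R match(ExpCases<R> cases);
}
final class IntLit implements Exp {
    final int i;
    IntLit(int i) { this.i = i; }
    public <R> R match(ExpCases<R> c) { return c.intLit(i); }
}
final class Plus implements Exp {
    final Exp l, r;
    Plus(Exp l, Exp r) { this.l = l; this.r = r; }
    public <R> R match(ExpCases<R> c) { return c.plus(l, r); }
}
```

An evaluator is then just one ExpCases<Integer> object; no per-class dispatch code beyond the single match method is needed.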

[Edit: made some clarifications to avoid confusion]

Friday, June 15, 2007

Objects are Existential Packages

There's a long-running debate on the power of functional languages over object-oriented languages. In truth, now that C# has full generics, aka parametric polymorphism, it's almost equivalent in typing power to most typical statically typed functional languages. In fact, in terms of typing power, generic objects are universal and existential types, and can be used for all the fancy static typing trickery that those entail (see my previous reference to the GADTs in C# to see what I mean).

As a prior post explained, C#'s type system still lacks some of the flexible constraint refinement available in more powerful functional type systems, but in general C# is powerful enough to encode most interesting functional abstractions.

And I started a new project to demonstrate this: FP#. It provides a number of widely used functional abstractions, like the option type, a lazy type, lists, lazy lists, etc. and map, filter, and fold over all the collection types, including the standard .NET collections API. Each of these abstractions will be available as a separate DLL, so instead of linking to a large library, you can just pick those abstractions you're interested in using.

Besides more flexible typing as in GADTs, expressiveness is the only advantage functional languages still have over C#. Compare the verbosity of the C# option type:
//T is the type variable
public abstract class Option<T> { }
public sealed class None<T> : Option<T> { }
public sealed class Some<T> : Option<T>
  T value;
  public Some(T v) { value = v; }

  public T Value
    get { return value; }
as compared to an O'Caml definition:
type 'a option = None | Some of 'a    (* 'a is the type variable *)
This contrast highlights my previous argument in favour of expressiveness; just think of it as 1 line of O'Caml generating 12 lines of C#, since the efficiency of both definitions is equivalent.

To demonstrate the power of existential packages and universal types in C#, I'll be including a number of statically typed abstractions that have only been found in O'Caml and Haskell to date; types like statically sized lists and arrays, number-parameterized types, and other type wizardry resulting in strong partial correctness properties (see: Lightweight Static Capabilities).

It will be particularly interesting to compare the efficiency of a sized type, like a list, to its unsized counterpart, because .NET does not erase types like Java and O'Caml do.

Monday, May 28, 2007

Expressiveness: what's all the fuss?

I just read a blog post recommending that developers broaden their language horizons, with a particular emphasis on Ruby. The author attempts to explain why expressiveness is an important metric for a language:
What is this obssesion[sic] with "expressiveness"? Go write poertry [sic] if you want to be expressvive.[sic]
Remember that ultimately our jobs are (usually) to solve some kind of business problem. We're aiming for a finish line, a goal. The programmer's job is translate the language of the business person to the language of the computer.

The whole point of compilers, interpreters, layers of abstraction and what-not are to shorten the semantic distance between our intent and the way the computer thinks of things.

To be honest, this is not very convincing; the moment you mention "semantics", is the moment many developers will close your blog and go do something "productive".

The argument for expressiveness is ultimately quite simple: the more expressive the language, the more you can do in fewer lines of code. This means that 3 lines of Ruby code might take 12 lines in C#, and 15 lines of C# could be compressed to 2 lines of Ruby while retaining readability and maintainability [1].

Consider viewing Ruby as a C# code generator, where 2 lines of Ruby code can generate 12 lines of C#. That actually sounds like a pretty good idea doesn't it? You would never write a generator to go the other way around though would you?

More expressive languages also tend to be simpler and more coherent. There are all sorts of little ad-hoc rules to Java and C# that you would not find in most functional languages.

You can readily see the differences in expressiveness at the Alioth language shootout. Set the metrics to CPU:0, memory:0, GZIP:1, meaning we only care about the GZIP'd size of the source code. You'll see that functional languages tend to come out on top. Ironically, Ruby places first, lending weight to the above blog post on Ruby's expressiveness.

Expressiveness is the whole driving force behind DSLs: a DSL is more expressive in solving the problems it was designed for. For instance, a relational database DSL specific to managing blogs could generate perhaps 100 lines of SQL per single line of DSL code. It would take you far more code to emulate that 1 DSL line in C#.

So the expressiveness argument is quite compelling if you simply view it as a code generation metric: the more expressive the language, the more code it can generate for you for the same number of characters typed. That means you do your job faster, and more importantly, with fewer errors.

Why fewer errors? Studies have shown that developers generally write about 20-30 correct lines of code per day. That means 20 lines of correct C# code or 20 lines of correct Ruby code. It just so happens that those 20 lines of correct Ruby can generate 100 lines of correct C#, which means you're now 5x more productive than your C#-only developer next door. Do you see the advantages now?

[1] These numbers are completely bogus and are used purely for illustrative purposes.

Wednesday, May 9, 2007

GEXL Lives!! Solving the Expression Problem in C#

GEXL is a general expression library providing core primitives for building and processing expression trees. I had abandoned it a while ago because I quickly ran into a serious, well-known issue: the Expression Problem.

Basically, the expression problem boils down to the fact that typical programming solutions are extensible in only a single direction: either the data types or the operations defined on them, but not both, can be extended without modifying existing code. It takes some sophisticated type machinery to safely solve the expression problem, and various languages, including OCaml, Haskell, Beta/gBeta, and Scala, have acceptable solutions. Languages supporting multimethods support less type safe extension, so I exclude them here (although Nice might do it properly).

So GEXL, which aims to provide both data and their operations as data and visitors respectively, runs smack into the expression problem. Fortunately, Mads Torgersen's paper, The Expression Problem Revisited, provides 4 new solutions by exploiting the generics of C# and Java. The first 3 solutions are still biased towards data-centric or operation-centric extensions, but the last solution provides full extensibility in both directions in ordinary C# with generics. GEXL is saved! There's just one catch: it's not type safe since it requires 2 casts. Fortunately, these casts are encapsulated in the core data and visitor classes, which are parameterized by the types they must check and cast to. The developer extending the core will also need to be a bit careful to make sure he's parameterizing the visitor and expression types at the same level as the extension he's writing.

These core classes are now implemented in GEXL, together with two interpreters and pretty printers written for the untyped lambda calculus, one implemented as data-centric extension of the core library, and the other as an operation-centric extension; the two total about 300 lines of excessively verbose C#, including comments. The paper provides a mixed example of data and operation extensions if you're interested in extending in both directions.

There is a disadvantage of this solution however: visitors can no longer return the results of their computation, due to a type dependency between the Data and Op; if we were to parameterize IVisitor with a return type, IVisitor<R>, as I described earlier, Op must also be parameterized by R, which forces Data to also be parameterized by R since Data's type is constrained by IVisitor.

This means that the entire expression tree will be parameterized by the return type of the operation that we wish to perform, R, which means it can only be traversed by operations that return R! No more typechecking returning bools and tree transforms returning new trees on the same expression tree. Type safety has backed us into a corner; this is in fact a GADT problem, and as covered previously, C# generics fall just short of GADTs. If C# were to be augmented by full equational constraints, instead of the limited form they have now, this problem would be solvable, since we could simply parameterize the methods themselves with the appropriate constraints.

In fact, full equational constraints aren't needed; there are only two limitations of C#'s type system impeding this solution:

  1. Limitations of type constraints to derivations: a constraint placed on a generic parameter must be a non-sealed class or an interface, so a constraint limiting a type parameter to a string is not possible. This should be relaxed so that the type constraint is a full subtype relation. The current design is simply a consequence of #2.
  2. Lack of type constraint refinements: a constraint cannot be refined to a more specific type when subtyping. For instance, the declaration "class Foo where T : S", cannot be refined to "class Bar : Foo where T : U" where U is a subtype of S.

As it stands, the least offensive solution is for each visitor to retain its computed value as a private field.
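As a concrete illustration of that workaround, here is a minimal self-contained sketch (the names Evaluator, Lit, Add and so on are illustrative, not taken from GEXL): the visitor returns void and accumulates its result in a private field, which the caller reads off after the traversal, so the tree needs no return-type parameter at all.

```csharp
public abstract class Exp
{
  public abstract void Visit(IVisitor v);
}

public sealed class Lit : Exp
{
  internal readonly int value;
  public Lit(int value) { this.value = value; }
  public override void Visit(IVisitor v) { v.Lit(value); }
}

public sealed class Add : Exp
{
  internal readonly Exp left, right;
  public Add(Exp left, Exp right) { this.left = left; this.right = right; }
  public override void Visit(IVisitor v) { v.Add(left, right); }
}

public interface IVisitor
{
  void Lit(int value);
  void Add(Exp left, Exp right);
}

//the evaluator keeps its running result in a private field rather than
//returning it, so the same tree can be traversed by visitors that
//"compute" values of any type
public sealed class Evaluator : IVisitor
{
  int result;
  public int Result { get { return result; } }

  public void Lit(int value) { result = value; }

  public void Add(Exp left, Exp right)
  {
    left.Visit(this); int l = result;
    right.Visit(this); int r = result;
    result = l + r;
  }
}
```

The obvious cost is that the visitor is no longer a pure function; the field is mutable state that must be managed carefully during recursive traversals, as the Add case above shows.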

So now that GEXL lives, perhaps we'll also see the eventual revival of Orc.NET. In the meantime, I'll try porting the interpreter for my language to GEXL and see how painful it turns out to be. :-)

P.S. There's a further interesting post in the above LTU thread: it provides a solution to the expression problem using just sum and recursive types; it requires significant boilerplate, but the extensions are still possible, which is impressive.

Thursday, May 3, 2007

What's wrong with .NET

I work with .NET every day, and it's a decent virtual machine from the developer's perspective, and a slight overall improvement over Java in a number of respects.

But for a language implementor, .NET continues a tradition of mistakes that began with Java, which inhibit it from achieving its much touted goal as the Common Language Runtime (CLR). These issues are divided into two categories: blatant mistakes, and nice to haves. The former requires no cutting edge features or research to implement, and the CLR should have had them from the get-go. The latter might have required some research to get right (like they did with generics), but which are ultimately required for a truly secure and flexible virtual machine.

Blatant Mistakes

  1. Insufficiently powerful primitives for execution and control flow. The most glaring of these omissions are first-class function pointers and verifiable indirect calls. On .NET a function pointer is reified as an IntPtr, an opaque struct representing a native pointer as a native integer. calli is the VM instruction that indirectly invokes a function via such an IntPtr. Because an IntPtr is opaque, the VM can't know whether it points to a function or to a data blob held outside the VM heap, so calli is an unverifiable instruction. calli is thus available only under full trust, since the VM will crash if it calli's into a data blob. Many truly interesting languages require calli for efficient execution, and the full trust requirement means that partial trust environments, such as cheap web hosts, won't be able to run assemblies generated from these languages. Microsoft consulted a number of language researchers in the design of .NET, and they responded enthusiastically with a "verifiable calli" proposal for just these reasons. Unfortunately, MS ignored them.
  2. Stack walking was a huge mistake inherited from Java. Stack walking prevents a number of useful optimizations, such as efficient tail calls, and alternative security architectures and execution models from being implemented. As it is, we're stuck with the horrible Code Access Security (CAS), when a far simpler model was already built into the VM. MS's new Silverlight supposedly introduces a new security framework based solely on named "trusted" assemblies, so perhaps we can expect this to be partly fixed. Efficient tail calls in particular are essential for functional languages.
  3. Continuations. I was never sold on continuations until I started considering scheduling and scalability issues, and subsequently read a number of papers on concurrency; in summary, continuations are essential for scalable, schedulable execution subject to configurable user-level policies. Actually, continuations are not strictly necessary, as an expensive global CPS-transform of a program coupled with efficient tail calls can achieve the same effect; however, CPS has a number of performance side-effects that may not be acceptable for certain languages. Thus, continuations or something equally expressive, like delimited continuations, are essential for a true CLR. Coupled with efficient tail calls, we can easily achieve highly scalable architectures with low memory use, as SEDA and Erlang have shown. Many people believe that threads subsume continuations, and some believe the contrary, but in fact both abstractions are necessary; a thread is a "virtual processor", and a continuation is the processor context. If .NET had continuations, we could build operating system-like execution models that, in theory, can achieve optimal CPU utilization and parallelism with minimal resource use via non-blocking I/O, dataflow variables, and a number of threads equal to the number of real CPUs on the machine. This is achievable only with continuations or an equivalent global CPS transform.
  4. Poor security architecture. CAS is a hack bolted onto a VM that, had it omitted a single feature and tweaked another, wouldn't have needed any security architecture at all. If static class fields were not available, then the native execution model of the .NET VM reduces to capability-based security. It's possible to include static fields if the fields have certain properties (transitive immutability, for instance), but conceptually, just imagine a VM whereby the only way to obtain access to an object is to be given access to it, such as in a method call. There is a further condition to satisfy capability security that has consequences for P/Invoke: all functionality accessible via P/Invoke must be reified as a reference (ie. an object). This means that static, globally accessible operations such as "FileInfo File.Open(string name)" would not be permitted, since they subvert the security model by calling out to the OS to convert a string into authority to a File; instead, at the very least, File.Open would be reified as a FileOpener object, and this object would be "closely held" by trusted code. In effect, this forces all P/Invoke functions to be minimally capability secure. There is also an alternative approach to security with which we can have our fully mutable static fields, and have our isolation and security too; in fact, it's a widely used abstraction invented decades ago which provides full fault isolation and a number of other useful properties: lightweight language processes. Like OS processes they provide full isolation, but they exist only inside the VM; in effect, they are similar to processes as found in the pi-calculus. Static fields are thus scoped to the lightweight process. Processes and their properties are explained further under "nice to haves".
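To make the File.Open point concrete, here is a hedged sketch (FileOpener, IFile and the in-memory store are all illustrative names, not real .NET APIs): the authority to open files is reified as an object that trusted code holds closely, and untrusted code only ever receives the specific capabilities it needs.

```csharp
using System.Collections.Generic;

public interface IFile
{
  string ReadAllText();
}

//Trusted code constructs the FileOpener and decides who may hold it;
//there is no globally reachable static File.Open to subvert.
public sealed class FileOpener
{
  readonly Dictionary<string, string> store;  //in-memory stand-in for the OS

  public FileOpener(Dictionary<string, string> store) { this.store = store; }

  public IFile Open(string name) { return new MemFile(store[name]); }

  sealed class MemFile : IFile
  {
    readonly string contents;
    internal MemFile(string contents) { this.contents = contents; }
    public string ReadAllText() { return contents; }
  }
}

//Untrusted code receives only an already-opened IFile, never the opener,
//so it cannot convert arbitrary strings into file authority.
public static class Untrusted
{
  public static int CountChars(IFile f) { return f.ReadAllText().Length; }
}
```

The security reasoning then reduces to ordinary reference reachability: whoever was never handed the FileOpener simply has no way to name it.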

Nice to Haves

  1. Lightweight process model. Yes, I'm aware that .NET has the AppDomain, which is sort of like a lightweight process, but its isolation properties are inadequate. Microsoft's Singularity operating system takes the base AppDomain abstraction and extends it into a full "software isolated process" (SIP). Processes should be single-threaded, and should additionally permit fault handling and resumption via Keepers. Like in OSs, portions of a process can potentially be paged in and out, serialized, persisted, or otherwise manipulated, all transparently. Call them Vats as in E, processes as in Erlang, or SIPs as in Singularity, but the process abstraction is critical for full VM isolation. AppDomains are a fairly good start, however, and I'm researching ways to exploit AppDomains to enhance .NET's security.
  2. Resource accounting. Resource accounting includes managing memory, scheduling concurrent execution, etc. Without properly reifying these abstractions as explicit objects subject to configurable policies, the VM is vulnerable to denial of service (DoS) attacks from malicious code. As it stands, .NET cannot load and run potentially malicious code in the same VM as benign code. For instance, it should be possible to set a quota for memory use, and to schedule and interleave execution so that code cannot exhaust VM memory and "steal the CPU". However, you can imagine that passing around quotas in code would be quite unwieldy; fortunately, this requirement interacts nicely with another abstraction we already need: processes. Thus, the process is also the unit of resource accounting, and spawning a new process starts a new heap, potentially with a quota. Processes can then be scheduled via a VM-level scheduler, and each process can also schedule its own execution as it sees fit (threaded, event-loops, etc.). Interprocess communication (IPC) can be managed with an exchange heap as in Singularity, or via copying as is done in most microkernels.
  3. Concurrency model. This has more of an effect on the class library than the base VM design, but plan interference is a huge source of bugs in concurrent code, and it should be tackled by every language. VM processes provide a minimal, tried and tested concurrency and isolation abstraction, but within a process there are a wide range of options available, including event-loop concurrency, or language enforced concurrency models. Controlled sharing of state between processes needs to be tackled, such as the exchange heap in Singularity, though my current inclination is to prefer copying for safety reasons.

As you can see, .NET doesn't need much to turn it into a flexible, industrial strength, high security VM. I'm going to follow this post with a detailed description of what the ideal minimal VM should have, and I will even describe how I believe it can be implemented safely and efficiently.

Saturday, April 7, 2007

General Pattern Matching with Visitors

My previous definition of IVisitor hard-coded the return value as a type of Val, but one can generalize a visitor to any kind of pattern-matching function by adding a generic type parameter to the interface:
//the pattern matching visitor
public interface IVisitor<T>
{
  //the return type is now parameterized
  T App(Exp e0, Exp e1);
}

This of course requires a modification to the class variants as well:
//the type representing an expression application
public sealed class App : Exp
{
  Exp e0;
  Exp e1;

  public App(Exp e0, Exp e1)
  {
    this.e0 = e0;
    this.e1 = e1;
  }

  //make the Visit method generic, so the client can
  //specify the return type
  public override T Visit<T>(IVisitor<T> v)
  {
    return v.App(e0, e1);
  }
}

So instead of IVisitor being hard-coded as an evaluator, I can now write an IVisitor that performs transformations on the expression tree (such as rewriting optimizations), or a visitor that implements some arbitrary predicate over the expression tree (such as type-checking, type inference, etc.).

Like pattern matching functions over ordinary parameterized algebraic data types, the return value of the visitor must be T or a subtype of T.
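To illustrate, here is a self-contained sketch of two visitors with different return types over the same tree (a Lit case is added for illustration, since the post only shows App): one counts nodes as an int, the other pretty-prints to a string.

```csharp
public abstract class Exp
{
  public abstract T Visit<T>(IVisitor<T> v);
}

public sealed class Lit : Exp
{
  int value;
  public Lit(int value) { this.value = value; }
  public override T Visit<T>(IVisitor<T> v) { return v.Lit(value); }
}

public sealed class App : Exp
{
  Exp e0; Exp e1;
  public App(Exp e0, Exp e1) { this.e0 = e0; this.e1 = e1; }
  public override T Visit<T>(IVisitor<T> v) { return v.App(e0, e1); }
}

public interface IVisitor<T>
{
  T Lit(int value);
  T App(Exp e0, Exp e1);
}

//counts the nodes in the tree
public sealed class Size : IVisitor<int>
{
  public int Lit(int value) { return 1; }
  public int App(Exp e0, Exp e1) { return 1 + e0.Visit(this) + e1.Visit(this); }
}

//pretty-prints the tree
public sealed class Print : IVisitor<string>
{
  public string Lit(int value) { return value.ToString(); }
  public string App(Exp e0, Exp e1)
  { return "(" + e0.Visit(this) + " " + e1.Visit(this) + ")"; }
}
```

Because Visit itself is the generic method, a single expression tree can be handed to both visitors; nothing in the tree's type commits it to one return type.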

In fact, this visitor pattern, minus the decomposition as in pattern matching, is featured in Generalized Algebraic Data Types and Object-Oriented Programming; see also the LTU discussion.

According to the paper, generics as in C# are almost capable of encoding GADTs. Unfortunately, they lack some flexibility in declaring type dependencies between class type parameters, and method type parameters. The paper provides some very interesting C# examples of type-safe type checking, evaluation, phantom types, and representation types. This generics black magic warrants further study for the advanced C# student.

Tuesday, April 3, 2007

Visitor pattern considered harmful (unless functionalized)

To kick off my first post, I thought I'd cover a post on another blog that I found interesting.

Basically, the gist of the post is that the object-oriented visitor pattern does not lend itself to the problem of walking trees and general symbolic manipulation, despite that being its claim to fame. In general, algebraic data types and pattern matching are much simpler and more concise.

I agree; however, I think the post goes a bit too far in saying that visitors simply can't do this elegantly, because in fact there's a straightforward translation of a pattern matching function into a visitor, as long as you keep your objects "functional", ie. treat them as variants.

For instance, let's say we are constructing an evaluator:

//the Expression type
public abstract class Exp
{
  public abstract Val Visit(IVisitor v);
}

//the type of Values
public abstract class Val : Exp
{
}

//the type representing an application
public sealed class App : Exp
{
  Exp e0;
  Exp e1;

  public App(Exp e0, Exp e1)
  {
    this.e0 = e0;
    this.e1 = e1;
  }

  public override Val Visit(IVisitor v)
  {
    return v.App(e0, e1);
  }
}

//the pattern matching visitor
public interface IVisitor
{
  Val App(Exp e0, Exp e1);
}

This is slightly different from the typical visitor pattern you see in books, where the object itself would be passed in to IVisitor.App(). In fact, this looks almost exactly like a pattern matching function in a language like OCaml:

type exp = App of exp * exp | Val   (* | ... *)

let rec eval exp =
  match exp with
  | App (e0, e1) -> (* ... evaluate the application *)
  | Val -> (* ... *)

Thus, properly designed, a visitor is simply a pattern matching function. By convention, we use the class name as the visitor's method, so it matches the convention in functional languages (ie. the constructor name). The visitor method's parameters are the object's private fields, just like a variant's. I'm using this approach for an interpreter I'm implementing and it works quite well (unfortunately no source is publicly available at the moment).

In terms of efficiency, the pattern matching function is probably faster. After all, it's simply a tag check plus a branch, whereas the visitor must perform a double dispatch (two method lookups incurring two indirections and two indirect branches).

Clearly, the visitor is also far more verbose: you must create the "variant classes", manually add a "Visit" method to each, and since there is no default case, as in a switch statement or a pattern match's wildcard, you have to exhaustively handle each variant explicitly in each visitor, even if the case isn't relevant. You can mitigate the latter problem by implementing a "Default" abstract base class with dummy implementations of all cases, and then overriding only the cases you are interested in:

public abstract class Default : IVisitor
{
  //overridden in my Interpreter visitor
  public virtual Val App(Exp e0, Exp e1)
  {
    throw new NotSupportedException();
  }
}
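Here is a minimal self-contained sketch of that mitigation (using int as the result type and a hypothetical Lit variant, for brevity): a visitor extending Default overrides only the App case, and any unhandled case falls through to the base and throws.

```csharp
public abstract class Exp
{
  public abstract int Visit(IVisitor v);
}

public sealed class App : Exp
{
  Exp e0; Exp e1;
  public App(Exp e0, Exp e1) { this.e0 = e0; this.e1 = e1; }
  public override int Visit(IVisitor v) { return v.App(e0, e1); }
}

public sealed class Lit : Exp
{
  int value;
  public Lit(int value) { this.value = value; }
  public override int Visit(IVisitor v) { return v.Lit(value); }
}

public interface IVisitor
{
  int App(Exp e0, Exp e1);
  int Lit(int value);
}

//dummy implementations for every case
public abstract class Default : IVisitor
{
  public virtual int App(Exp e0, Exp e1) { throw new System.NotSupportedException(); }
  public virtual int Lit(int value) { throw new System.NotSupportedException(); }
}

//only the case we care about is overridden
public sealed class AppOnly : Default
{
  public override int App(Exp e0, Exp e1) { return 42; }
}
```

The trade-off, of course, is that an unhandled case is now a runtime exception rather than the compile-time exhaustiveness warning a pattern match would give you.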

As a potentially interesting extension, the variant classes may not even have to be in the same inheritance hierarchy, they only have to implement a marker interface with the appropriate Visit() method. This may increase extensibility along similar lines to polymorphic variants; only the visitor has to be kept up to date with all the relevant cases, and it can evolve separately from the core.


So if you're doing symbolic manipulation and are forced to operate in an OO language like C# above, you can still approximate the elegance of algebraic data types and pattern matching; it just takes a little more work on your part. Pretty much par for the course in the OO world. :-)