Monday, May 17, 2010

Asymmetries in CIL

Most abstractions of interest have a natural dual, for instance, IEnumerable and IObservable, induction and co-induction, algebra and co-algebra, objects and algebraic data types, message-passing and pattern matching, etc.

Programs are more concise and simpler when using the proper abstraction, be that some abstraction X or its dual. For instance, reactive programs written using the pull-based processing semantics of IEnumerable are far more unwieldy than those using the natural push-based semantics of IObsevable. As a further example, symbolic problems are more naturally expressed via pattern matching than via message-passing.

This implies that any system is most useful when we ensure that every abstraction provided is accompanied by its dual. This also applies to virtual machine instruction sets like CIL, as I recently discovered while refining my safe reflection abstraction for the CLR.

A CIL instruction stream embeds some necessary metadata required for the CLR's correct operation. Usually this metadata is type information of some sort, such as the type of a local or the name and/or signature of a target method. For instance, here's a Hello World! example:
ldstr "Hello World!"
call void [mscorlib]System.Console::WriteLine(string)

Unfortunately, the CIL instruction set suffers from some asymmetries which make some types of programming difficult.

For example, the ldtoken instruction takes an embedded metadata token and pushes the corresponding runtime type handle onto the evaluation stack; this is the instruction executed when using the typeof() operator in C#.

While this operation is useful, we sometimes want its dual, which is to say, we want the metadata used in a subsequent instruction to depend on the object or type handle at the top of the evaluation stack. A related operation is supported on the CLR, namely virtual dispatch, which depends on the concrete type, but dispatch is not general enough to support all of these scenarios because the vtable is immutable.

Consider a scenario where you have an untyped object, like a reference to System.Object "obj", and you want to call into some generic code, like a method Foo<T>(T value), but pass the concrete type of obj for T, instead of System.Object. Currently, you must go through a laborious process where you call GetType() on the object to obtain it's type handle, then obtain the method handle via reflection or some clever CIL hackery, then call MethodInfo.MakeGenericMethod in order to instantiate the type argument on Foo<T>, and finally, you must perform a dynamic invocation via reflection or allocate a custom delegate of type Action<T> and perform a virtual call, even though the call is statically known.

Each step of this process is expensive, and it makes typeful programming on the CLR painful when working on the boundary between typed and untyped code. Many reflection problems, like serialization, become simpler once we're dealing with fully typeful code.

Consider a dual instruction to ldtoken called "bind" which takes a type handle obtained via GetType() and then binds the resulting type handle into the executing instruction stream for the subsequent generic call to Foo<T>. This instruction could be easily and efficiently supported by any VM. Some restrictions are clearly needed for this instruction to remain verifiable, namely that the type constraints required by the target method are a subset of the type constraints of the statically known type, but the verifier already performs this analysis, and all of the information needed is available at the call site.

Fully polymorphic functions like Foo<T> trivially satisfy this condition since it has no constraints whatsoever. Serialize<T>/Deserialize<T> operations are duals, and in fact exhibit exactly the same sort of fully polymorphic type signature as Foo<T>.

There are many more programs that exhibit this structure, but they are unnecessarily difficult to write due to these asymmetries in CIL. This unfortunately requires developers to write a lot of ad-hoc code to get the results they want, and more code results in more bugs, more time, and more headaches.


Rodrigo Kumpera said...

This work nice in theory, but the devil are in the details.

To make it happen one has to take care of a lot of non trivial things like:

Valuetype semantics, the object must be unboxed before passed to the called method.

Constraints, those require a non trivial amount of checks at invocation site.

Interface calls, those are really tricky if you have variance and/or multiple implementations.

What you're asking is basically something like the invokedynamic instruction of java7. Which is one order of magnitude easier to implement on top of Java's trivial type system.

There's no reason you can't just use something like DLR's inline caching. Which is high performing and could be even more with some small help from the runtime.

Sandro Magi said...

The absolute minimum required is a polymorphic inline cache dispatched on the runtime Type handle. I don't think that's too difficult at all.

I don't see how constraints are a problem. The method call is already verified to succeed with whatever supertype is statically known at the call site. I'm just talking about binding the type variable T to the concrete type for the subsequent call.

invokedynamic would do it, but is probably overkill. There are many possible solutions though, which ultimately all boil down to a "typecase", aka type-indexed functions. There are many compilation techniques for type-indexed functions, and in fact the CLR already has one form with its type-passing. The problem is that type-passing is not pervasive, it's selective, so the loss of type information means this instruction is required.

Allowing vtable extensions at runtime is another way, though sounds the most difficult to me. Then again, if you have a PIC, this approach becomes feasible too.

Sandro Magi said...

Actually, there's an even simpler solution to implement this instruction. Consider a priviledged method inserted by the runtime at index 0, which essentially performs a double dispatch on a method pointer. Code after the special instruction is compiled as a separate method which is dispatched to via this special method.

This can be implemented entirely with the existing CLR machnery. There are various optimizations to make this even more efficient.

Qwertie said...

I like the mathematical way you think, Sandro. I've been reading your blog for a couple of hours now. I like the .NET framework a lot, but it does have a lot of annoyances too. I've have come to the conclusion that the industry needs something that resembles the CLR but operates at a lower level, so that people who are not Microsoft* can implement new features or change the way existing features work... and so that the framework is not limited to "high-level" programs but could be used in very resource-limited environments and even OS kernels. I would gladly work on such a project with reduced pay, if somebody would be willing to pay me :)

A key feature of my idea would be a preference for compile-time metaprogramming over run-time metaprogramming. Because it seems like 75% of the things we use reflection and dynamic methods for could be done at compile-time instead, lowering run-time cost to zero. The remaining 25% could be accelerated too, by offering better-designed reflection APIs.

* Sorry, I don't mean to imply corporations are people.

Qwertie said...

P.S. What do you mean by "index 0"? Anyway, I have heard that the CLR generally uses the same x86 code for a generic method when the type parameter is any reference type; therefore if the generic method happens to be constrained with "where T:class", the CLR could actually call that one method statically... zero overhead! Am I wrong? It sounds too easy.

Sandro Magi said...

Hey Qwertie, some of what you're after is a goal of more modern systems programming languages. See for instance Clay, Deca, Habit and Rust. They're still works in progress, but promising.

For a long established safe systems language, we have Ada. It even supports some limited compile-time metaprogramming via Ada's ASIS.

Regarding "index 0", all methods are dispatched via a unique index in a table of function pointers (vtable). Rodrigo was implying that the pattern I was after was difficult to implement on the CLR. I suggested that it could be easily implemented if the runtime reserved index 0 in every vtable to implement a double dispatch.

Qwertie said...

What I'm really after is a system that makes "everything" possible with high performance: every programming environment from OS kernels, to microcontrollers, to sandboxed code in a web browser (like Silverlight), to AAA games, to scientific computing in a machine with 64 cores and gigabytes of RAM.

Imagine a system that could theoretically support nearly all programming scenarios, and still allow garbage collection, reflection, easy remoting, and run-time code generation; and which also supports multiple programming languages interoperably.

I think what's needed is a set of standards for language interoperability, reflection, JIT compilation, GC and standard libraries, that allows the FOSS community to develop the pieces independently, and does not mandate every feature for every situation (e.g. one should be able to substitute GC for simple reference counting in environments where GC is infeasible or unwanted, and reflection support could vary by language and platform.)

I wish, basically, that .NET could be changed from a cathedral to a bazaar [Eric Raymond's concept] in order to stimulate development in all directions, without losing high-level interoperability between programming languages. So the key features would be cross-language interfaces, cross-language reflection, a mechanism for languages to opt-in to a compacting GC, and a basic standard library that provides standard ways to share data efficiently across languages. An intermediate language and JIT would also be needed for running code in a sandboxed environment, or simply to allow multi-platform executables.

The cross-language part is key, since there are so many new languages being created that can't get traction because they can't interoperate with other languages beyond the most basic level (namely, by exporting and importing C-style functions). I mean, that's why I don't even consider using Haskell. I've written all this great C# code, and I just can't mix it with any of the non-.NET languages.

I'm convinced my idea is possible in principle, but maybe it'll never actually happen.

Sandro Magi said...

To support everything, you need some means to check safety. Proof carrying code can accomplish what you want, but it still needs some research.

Qwertie said...

Correction: I assumed the combination "low resource environment" + "sandboxed code" + "high speed" was not possible. Without the high-speed requirement you'd simply sandbox code using a small interpreter.

Qwertie said...

Weird, my previous comment did not appear. I was just basically saying that this PCC idea is awesome... although it's not strictly necessary for my idea to work. The well-known JIT approach allows sandboxing and fairly high performance (although not optimal), and plain-old machine code can be used when sandboxing (proven safety) is not required. Without knowing much about it, it sounds like PCC lets you "have your cake and eat it too"--although personally, I am still inclined to prefer the JIT approach most of the time, since it allows cross-platform executables.

Sandro Magi said...

Yes, a sandboxing approach would work as well. See Google's Native Client/NaCl work. It's sandboxing at the level of machine code, though you still sacrifice portability. So you really want high performance, low resource use, portability.

That would be difficult to achieve all in one package, since JITs are typically rather memory hungry. At best, I think something like the CLR's ahead-of-time compilation used at application install time is the best bet; you can do this with LLVM as well.