

Showing posts from November, 2008

Compact, Declarative Serialization

A few posts back, I hinted at using parameterization as an alternative to metadata. I've since written a serialization interface using this technique, along with a binary serializer/deserializer pair that demonstrates the inherent tradeoffs. ICompactSerializable is implemented for each serializable object; it's essentially a declarative method describing the sequential, internal structure of the object. It's simple and fast, since it provides native-speed access to an object's fields without reflection and without metadata. Of course, the obvious downside is that clients must describe the internal structure themselves via ICompactSerializer, and refactoring must be careful about reordering the sequence of calls. The upshot is that serialization and deserialization are insanely fast compared to ordinary reflection-driven serialization, the binary is f
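The excerpt names the two interfaces but not their shapes. Here is a minimal sketch of what such a pair might look like; the method names, the Person type, and the exact overload set are my assumptions, not necessarily the post's actual API:

```csharp
using System;
using System.IO;

// Hypothetical sketch -- the post's actual interface shapes may differ.
interface ICompactSerializer
{
    void Serialize(ref int field);
    void Serialize(ref string field);
}

interface ICompactSerializable
{
    // Declares the object's sequential internal structure by passing
    // each field, in order, to the serializer.
    void Serialize(ICompactSerializer s);
}

// Writing and reading are two implementations of the same visitor,
// so one declaration drives both directions.
sealed class BinaryCompactWriter : ICompactSerializer
{
    readonly BinaryWriter w;
    public BinaryCompactWriter(BinaryWriter w) { this.w = w; }
    public void Serialize(ref int field) => w.Write(field);
    public void Serialize(ref string field) => w.Write(field);
}

sealed class BinaryCompactReader : ICompactSerializer
{
    readonly BinaryReader r;
    public BinaryCompactReader(BinaryReader r) { this.r = r; }
    public void Serialize(ref int field) => field = r.ReadInt32();
    public void Serialize(ref string field) => field = r.ReadString();
}

class Person : ICompactSerializable
{
    public int Age;
    public string Name = "";
    public void Serialize(ICompactSerializer s)
    {
        // Order matters: reordering these calls silently breaks
        // compatibility with previously written payloads.
        s.Serialize(ref Age);
        s.Serialize(ref Name);
    }
}
```

Because the fields are passed by `ref`, the same `Serialize` method both reads and writes them, with no reflection and no per-field metadata lookup.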

The Chicken and the Egg - Redux

There is still considerable skepticism regarding my conclusion that the chicken comes first. Many of the objections are simply due to different interpretations of the question, interpretations which I consider unfaithful to the original purpose of the chicken-egg dilemma. Fundamentally, this question is supposed to represent a causality dilemma. When there is a causal dependency between any two objects, A and B, we can ask ourselves, "which came first, A or B?" An object C could have caused B, and thus triggered the recursive relation C→B→A→B→A... Of course, it could just as easily have been A that was caused first. To properly answer this question, we must reduce the abstract objects A and B to concrete objects and apply our scientific knowledge to ground the recursion. Using chickens and eggs as our objects, we have to precisely define which chickens and which eggs to consider. The question "which came first, chickens or any type of egg?" is not a causal dilemma a

Embedded Stack Language for .NET

Parametric polymorphism is a powerful tool for constructing and composing safe programs. To demonstrate this, I've constructed a tiny embedded stack language in C#, exploiting C# generics, aka parametric polymorphism, to ensure that any embedded programs are type-safe, and thus "can't go wrong". Moreover, this language is jitted, since I use the ILGenerator of System.Reflection.Emit, and doing so was no more complex than writing an interpreter. This is mainly because the CLR is itself a stack-based VM, so jitting a stack-based program to stack-based IL is fairly natural. Here is the type-safe representation of the stack: public struct Stack<R, T> : IDisposable { internal ILGenerator gen; public Stack(ILGenerator gen) { this.gen = gen; } public void Dispose() { this.Return(); } } This type encapsulates the IL output stream and uses two phantom type variables to encode the values at the top of the
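To see how the two type parameters enforce stack discipline, here is an interpreted analogue of the idea (an assumption on my part; the post itself emits IL via ILGenerator rather than carrying values). R is the rest of the stack and T its top, so an ill-shaped program simply fails to compile:

```csharp
// Simplified, value-carrying sketch of the phantom-typed stack.
// The Empty and StackOps names are illustrative, not from the post.
struct Empty { }

struct Stack<R, T>
{
    public readonly R Rest;
    public readonly T Top;
    public Stack(R rest, T top) { Rest = rest; Top = top; }
}

static class StackOps
{
    // Push accepts any value on any stack shape.
    public static Stack<Stack<R, T>, U> Push<R, T, U>(this Stack<R, T> s, U x) =>
        new Stack<Stack<R, T>, U>(s, x);

    // Add requires the top two slots to be ints -- enforced by the
    // parameter's type, not by a runtime check.
    public static Stack<R, int> Add<R>(this Stack<Stack<R, int>, int> s) =>
        new Stack<R, int>(s.Rest.Rest, s.Rest.Top + s.Top);
}
```

With this encoding, `new Stack<Empty, int>(default, 1).Push(2).Add()` typechecks and yields a stack whose top is 3, while calling `Add` on a stack topped by a string is a compile-time error. The jitted version in the post keeps the same type parameters but no fields except the ILGenerator, which is what makes them phantom.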

Reflection, Attributes and Parameterization

I used to be a big fan of reflection, and C#'s attributes also looked like a significant enhancement. Attributes provide a declarative way to attach metadata to fields, methods, and classes, and this metadata is often consumed during reflection. The more I learned about functional programming, type systems, and so on, the more I came to realize that reflection isn't all it's cracked up to be. Consider .NET serialization: you can annotate fields you don't want serialized with the attribute [field:NonSerialized]. However, metadata is just data, and every usage of attributes can be replaced with a pair of interfaces. Using [field:NonSerialized] as an example, we can translate this class: class Foo { [field:NonSerialized] object bar; } into one like this: // these two interfaces take the place of a NonSerializableAttribute declaration interface INonSerialized { void Field<T>(ref T field); } interface IUnserializableMembers { void Unserializable(INonSerialized s);
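The excerpt cuts off before showing the translated class. A plausible completion of the pattern, under my assumptions about how the post finishes it: Foo implements IUnserializableMembers and hands each excluded field to the visitor, so a serializer learns which fields to skip without any reflection.

```csharp
// Hypothetical completion of the excerpt's pattern; names beyond those
// shown in the post (CountingSkipper) are illustrative.
interface INonSerialized
{
    void Field<T>(ref T field);
}

interface IUnserializableMembers
{
    void Unserializable(INonSerialized s);
}

class Foo : IUnserializableMembers
{
    object bar;
    public void Unserializable(INonSerialized s)
    {
        // bar plays the role of the [field:NonSerialized] annotation.
        s.Field(ref bar);
    }
}

// One possible consumer: count (or record) the excluded fields.
class CountingSkipper : INonSerialized
{
    public int Skipped;
    public void Field<T>(ref T field) => Skipped++;
}
```

The attribute's declarative role is now played by an ordinary virtual call, so the "metadata" is checked by the type system and accessed at native speed.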

Garbage Collection Representations Continued

In a previous post, I discussed the representations used for polymorphic parameters. These special data representations are a contract between the language and the runtime which enables the runtime to traverse data structures to collect garbage. 64-Bit Polymorphic Representation I had neglected to mention an interesting variant of tagged representations that I came across at one point, but which I believe is not in widespread use. This representation uses full 64-bit floating point values for all polymorphic data, and encodes non-floating point data in the unused bits of NaN values. IEEE 754 floating point numbers reserve 12 of the 64 bits for flagging NaN values, so the remaining bits are free to encode integers, pointers, and so on. I haven't found much empirical data on this approach, but it looks promising. The polymorphic representation is doubled in size, but the structure is unboxed. Thus, the polymorphic function call overhead may be slightly h
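The NaN-encoding trick can be sketched concretely. The layout below is one of several possible choices (runtimes differ on which payload bits they claim), and the NanBox name is mine: any 64-bit pattern with the exponent bits all set and a nonzero mantissa is a NaN, so the payload bits can carry a small integer while ordinary doubles pass through untouched.

```csharp
using System;

// Illustrative NaN-tagging layout, not the scheme of any particular runtime.
static class NanBox
{
    // Sign 0, exponent all ones, quiet bit set: a canonical quiet NaN.
    const ulong QuietNan    = 0x7FF8000000000000UL;
    const ulong PayloadMask = 0x0000FFFFFFFFFFFFUL;

    // Store the int in the low 32 payload bits of a quiet NaN.
    public static double BoxInt(int v) =>
        BitConverter.Int64BitsToDouble((long)(QuietNan | (ulong)(uint)v));

    // A boxed int has exactly the QuietNan bits above the payload;
    // ordinary doubles (and "real" NaNs from arithmetic) differ there.
    public static bool IsBoxedInt(double d) =>
        ((ulong)BitConverter.DoubleToInt64Bits(d) & ~PayloadMask) == QuietNan;

    public static int UnboxInt(double d) =>
        (int)(uint)((ulong)BitConverter.DoubleToInt64Bits(d) & 0xFFFFFFFFUL);
}
```

Every polymorphic slot is then a plain double: floating point values are themselves, and everything else hides in NaN space, which is what lets the representation stay unboxed at the cost of doubling its size on 32-bit data.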