Tuesday, November 4, 2008

Reflection, Attributes and Parameterization

I used to be a big fan of reflection, and C#'s attributes also looked like a significant enhancement. Attributes provide a declarative way to attach metadata to fields, methods, and classes, and this metadata is often used during reflection.

The more I learned about functional programming, type systems, and so on, the more I came to realize that reflection isn't all it's cracked up to be. Consider .NET serialization. You can annotate fields you don't want serialized with the attribute [field:NonSerialized].

However, metadata is just data, and every usage of attributes can be replaced with a pair of interfaces. Using [field:NonSerialized] as an example, we can translate this class:
class Foo {
    [field:NonSerialized]
    object bar;
}

Into one like this:
// these two interfaces take the place of a NonSerializedAttribute declaration
interface INonSerialized {
    void Field<T>(ref T field);
}
interface IUnserializableMembers {
    void Unserializable(INonSerialized s);
}
class Foo : IUnserializableMembers {
    object bar;
    // must be public, and spelled exactly as in the interface, to implement it
    public void Unserializable(INonSerialized s) {
        s.Field(ref bar);
    }
}

Essentially, we are replacing reflection and metadata with parameterization, which is safer because the compiler checks every field reference. This structure also avoids the runtime cost of reflection and attribute lookups.

Consider another example, the pinnacle of reflection, O/R mapping. Mapping declarations are often specified in separate files and even in a whole other language (often XML or attributes), which means they don't benefit from static typing. However, using the translation from above, we can obtain the following strongly type-checked mapping specification:
// specifies the relationships of object fields to table fields
interface IRelation
{
    // 'name' specifies the field name in the table
    void Key<T>(ref T id, string name);
    void Field<T>(ref T value, string name);
    void Foreign<T>(ref T fk, string name);
    void ForeignInverse<T>(ref T fk, string foreignName);
    void List<T>(ref IList<T> list);
    void List<T>(ref IList<T> list, string orderBy);
    void Map<K, T>(ref IDictionary<K, T> dict);
}
// declares an object as having a mapping to an underlying table
interface IPersistent
{
    void Map(IRelation f);
}
// how to use the above two interfaces
class Bar: IPersistent
{
    int id;
    string foo;
    public void Map(IRelation f)
    {
        f.Key(ref id, "Id");
        f.Field(ref foo, "Foo");
    }
    public string Foo
    {
        get { return foo; }
    }
}

There are in general two implementors of IRelation: hydration, when the object is loaded from the database and the object's fields are populated, and write-back, when the object is being written back to the database. The IRelation interface is general enough to support both use-cases because IRelation accepts references to the object's fields.
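To make the two implementors concrete, here is a minimal sketch, not a real O/R mapper. It assumes a trimmed-down IRelation with only Key and Field, and it stands in a dictionary for a table row; the Hydrator and Writer class names and the dictionary-backed row are hypothetical choices for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down IRelation (assumption: only Key and Field, to keep the sketch short).
interface IRelation
{
    void Key<T>(ref T id, string name);
    void Field<T>(ref T value, string name);
}

interface IPersistent
{
    void Map(IRelation f);
}

// Hydration: copies column values from a row into the object's fields.
// The "row" is a plain dictionary standing in for a database record.
class Hydrator : IRelation
{
    readonly IDictionary<string, object> row;
    public Hydrator(IDictionary<string, object> row) { this.row = row; }
    public void Key<T>(ref T id, string name) { id = (T)row[name]; }
    public void Field<T>(ref T value, string name) { value = (T)row[name]; }
}

// Write-back: copies the object's fields into a fresh row.
class Writer : IRelation
{
    public readonly Dictionary<string, object> Row = new Dictionary<string, object>();
    public void Key<T>(ref T id, string name) { Row[name] = id; }
    public void Field<T>(ref T value, string name) { Row[name] = value; }
}

class Bar : IPersistent
{
    int id;
    string foo;
    public void Map(IRelation f)
    {
        f.Key(ref id, "Id");
        f.Field(ref foo, "Foo");
    }
    public string Foo { get { return foo; } }
}

static class Demo
{
    public static void Main()
    {
        var row = new Dictionary<string, object> { { "Id", 7 }, { "Foo", "hello" } };
        var bar = new Bar();
        bar.Map(new Hydrator(row));   // load: fields are assigned through the refs
        Console.WriteLine(bar.Foo);

        var writer = new Writer();
        bar.Map(writer);              // write back: the same Map method, other direction
        Console.WriteLine(writer.Row["Id"]);
    }
}
```

The same Map method drives both directions because each call hands over a reference: the hydrator assigns through it, while the writer only reads from it.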

This specification of the mappings is more concise than XML mappings, and is strongly type-checked at compile-time. The disadvantage is obviously that the domain objects are exposed to mapping, but this would be the case with attributes anyway.

Using XML allows one to cleanly separate mappings from domain objects, but I'm coming to believe that this isn't necessarily a good thing. I think it's more important to ensure that the mapping specification is concise and declarative.

Ultimately, any use of attributes in C# for reflection purposes can be replaced by the use of a pair of interfaces without losing the declarative benefits.

3 comments:

Emir Uner said...

Hi, can you give a hypothetical code example for
the first use case -- a serializer using your proposal instead of
reflection+attributes?

A version using reflection+attributes might be:

void serialize(Object o) {
    foreach(var field in o.GetType().GetFields()) {
        // FieldInfo.IsNotSerialized is true when the field carries [NonSerialized]
        if(field.IsNotSerialized) {
            // Do not serialize
        } else {
            // Serialize
        }
    }
}

Sandro Magi said...

There are two ways to provide serialization via interfaces: via positive or negative information.

The NonSerialized attribute is negative information, and I was confusing it with the implementation for positive information, so here's the actual interface:

interface INonSerialized {
    void Field(string fieldName);
}

A serializer would still use reflection, or some other means to access the object internals and would maintain the set of fields it should skip:

class Serializer : INonSerialized {
    HashSet<string> skip = new HashSet<string>(); // any set semantics will do

    void Serialize(object obj, Stream s) {
        if (obj is IUnserializableMembers) {
            (obj as IUnserializableMembers).Unserializable(this);
        }
        foreach(var field in obj.GetType().GetFields()) {
            if (skip.Contains(field.Name)) {
                // do not serialize
            } else { ... }
        }
    }

    // public so that it implements INonSerialized.Field
    public void Field(string fieldName) {
        skip.Add(fieldName);
    }
}

I don't like this implementation much, and I prefer the converse:

interface ISerializer {
    void Version(string version);
    void Field<T>(ref T value);
    void Deleted<T>(string version);
}
interface ISerializable {
    void Serialize(ISerializer s);
}

This is a minimal serialization interface that I think can replace 90% of the standard serializer, including versioning, and yet produce binaries that are more compact. Example usage:

class Test : ISerializable {
    int i;
    string name;
    ...
    public void Serialize(ISerializer s) {
        s.Version("1.0.0");
        s.Field(ref i);
        s.Field(ref name);
    }
}

ISerializable implementors must ensure that they do not reorder Field calls. New fields must go at the end. Any Field calls to be deleted should be replaced with a Deleted call, along with the version number at which the field was deleted.
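As a rough sketch of how these rules could play out, the following pairs a writer and a reader built on BinaryWriter and BinaryReader. Everything beyond the two interfaces is an assumption made for illustration: the Codec, Writer, and Reader names, the restriction to int and string fields, and in particular the reading of Deleted as "skip the stored value when the data predates the deletion version".

```csharp
using System;
using System.IO;

interface ISerializer
{
    void Version(string version);
    void Field<T>(ref T value);
    void Deleted<T>(string version);
}
interface ISerializable
{
    void Serialize(ISerializer s);
}

// Hypothetical helper: this sketch only supports int and string fields.
static class Codec
{
    public static void Write<T>(BinaryWriter w, T value)
    {
        if (typeof(T) == typeof(int)) w.Write((int)(object)value);
        else if (typeof(T) == typeof(string)) w.Write((string)(object)value ?? "");
        else throw new NotSupportedException(typeof(T).Name);
    }
    public static T Read<T>(BinaryReader r)
    {
        if (typeof(T) == typeof(int)) return (T)(object)r.ReadInt32();
        if (typeof(T) == typeof(string)) return (T)(object)r.ReadString();
        throw new NotSupportedException(typeof(T).Name);
    }
}

class Writer : ISerializer
{
    readonly BinaryWriter w;
    public Writer(Stream s) { w = new BinaryWriter(s); }
    public void Version(string version) { w.Write(version); }
    public void Field<T>(ref T value) { Codec.Write(w, value); }
    public void Deleted<T>(string version) { } // a deleted field is never written
}

class Reader : ISerializer
{
    readonly BinaryReader r;
    System.Version stored; // version found in the stream, set by Version()
    public Reader(Stream s) { r = new BinaryReader(s); }
    public void Version(string version) { stored = new System.Version(r.ReadString()); }
    public void Field<T>(ref T value) { value = Codec.Read<T>(r); }
    public void Deleted<T>(string version)
    {
        // Data written before the deletion still holds the field: read and discard it.
        if (stored < new System.Version(version)) Codec.Read<T>(r);
    }
}

// Version 1.0.0 of a type, and a later revision that deleted field 'i' in 2.0.0.
class TestV1 : ISerializable
{
    public int i;
    public string name;
    public void Serialize(ISerializer s)
    {
        s.Version("1.0.0");
        s.Field(ref i);
        s.Field(ref name);
    }
}
class TestV2 : ISerializable
{
    public string name;
    public void Serialize(ISerializer s)
    {
        s.Version("2.0.0");
        s.Deleted<int>("2.0.0"); // takes the place of the old s.Field(ref i)
        s.Field(ref name);
    }
}

static class Demo
{
    public static void Main()
    {
        var ms = new MemoryStream();
        var v1 = new TestV1 { i = 42, name = "bob" };
        v1.Serialize(new Writer(ms));   // write in the old layout

        ms.Position = 0;
        var v2 = new TestV2();
        v2.Serialize(new Reader(ms));   // read old data with the new layout
        Console.WriteLine(v2.name);
    }
}
```

Because the Deleted call sits exactly where the Field call used to be, the reader stays aligned with the old layout, which is why the ordering rule matters.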

Pros:
1. Runs at full speed, i.e. no reflection penalty.
2. Statically typed.

Cons:
1. Client must implement the interface (though compilers can easily derive the necessary function).
2. Slightly higher implementation burden.

It's an interesting alternative design at the very least.

Sandro Magi said...

Another Pro of the alternate serialization design is the clearer versioning semantics; .NET's versioning is rather complicated by comparison.