Object/Relational Mapping and Factories

My day job has stuck me with C#, but I try to make the best of it, as evidenced by my FP# and Sasa C# libraries.

One thing that still gets in my way more than it should is O/R mapping. No other mapper I've come across encourages a true object-oriented application structure. Granted, I've only really used NHibernate, and I had built my own mapper before NHibernate was even available, but I've read up quite a bit on the other mappers.

By true OO structure, I mean that all application objects are constructed only from other application objects, with no dependencies on environment-specific code (i.e. whether you're running under ASP.NET, Windows Forms, Swing, etc.). A pure structure encourages a proper separation between core application code and display and controller code, which allows more flexible application evolution.

In practice, however, controller logic often constructs application objects manually, passing in default arguments to properly initialize the required fields. This means constructor and initialization code must be duplicated when running in another environment, and tedious refactoring is needed whenever the constructor interface changes. Further, the defaults are hardcoded, which means changing a default requires an application upgrade.

Instead, O/R mappers should promote a factory pattern for constructing application objects. Factories themselves are constructed when the application is initialized, and are thereafter singletons within a given application instance. However, O/R mappers don't support or encourage factories or singletons in this manner, as they always map a key or identifier to an object instance. Factories are slightly different: since they are generally singletons, there is no identifier to look them up by.

For example, let's assume we have a simple Product class:
public abstract class Product
{
    int productId;
    decimal price;

    protected Product()
    {
    }
}
Here we have a protected, parameterless constructor that initializes the base Product object. You can't sell 'abstract' products, so we need a concrete product, like a Table:
public class Table : Product
{
    int length;
    int width;

    public Table(int length, int width) : base()
    {
        this.length = length;
        this.width = width;
    }
}
Of course, a Table with dimensions of 0'x0' is invalid, so we need to ensure that a Table is initialized with a proper length and width. We can pass in a pair of default dimensions when constructing a Table instance in a controller, but chances are the default values will be the same every time you construct an instance of Table. So why duplicate all that code?

For instance, suppose we have another class "DiningSet" which consists of a Table and a set of Chairs. Do we call the Table constructor with the same default values within the DiningSet constructor?

Of course, many of you might now be thinking, "just create an empty constructor which invokes the parameterized constructor with the default values; done". All well and good because your language likely supports the int type very well. Now suppose that constructor needs an object that cannot be just constructed at will from within application code, such as an existing object in the database.
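To make that trade-off concrete, here is a minimal sketch of the constructor-chaining approach. The default dimensions (60 and 36), the DiningSet body, and the public properties are assumptions added for illustration; they are not part of the original classes.

```csharp
public abstract class Product
{
    protected Product()
    {
    }
}

public class Table : Product
{
    // Hypothetical defaults, compiled into the application.
    const int DefaultLength = 60;
    const int DefaultWidth = 36;

    // Properties added only so the result can be inspected.
    public int Length { get; private set; }
    public int Width { get; private set; }

    // The parameterless constructor chains to the parameterized one.
    public Table() : this(DefaultLength, DefaultWidth)
    {
    }

    public Table(int length, int width) : base()
    {
        Length = length;
        Width = width;
    }
}

public class DiningSet : Product
{
    public Table DiningTable { get; private set; }

    // The defaults are no longer duplicated here, but changing them
    // still means recompiling and redeploying the application.
    public DiningSet() : base()
    {
        DiningTable = new Table();
    }
}
```

This removes the duplication for simple values, but it only works because an int can be written as a literal anywhere in the code.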

Enter factories:
public interface IProductFactory
{
    Product Make();
}

public sealed class TableFactory : IProductFactory
{
    // Populated by the O/R mapper from the factory's single record.
    int defaultLength;
    int defaultWidth;

    public Product Make()
    {
        return new Table(defaultLength, defaultWidth);
    }
}
The IProductFactory interface abstracts over all factories which construct products. Any parameters that the base Product class accepts in its constructor are passed in to the Make() method, as these are shared across all product factories. TableFactory is mapped to a table with a single record containing the default length and width values. If the constructor requires an existing database object, the factory's record can reference it via a foreign key constraint, and the O/R mapper will load the object reference and its dependencies for you.
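As a rough sketch of the foreign-key case, consider a product whose constructor needs an object that already exists in the database. The Supplier, Chair, and ChairFactory types below are hypothetical names introduced for illustration, and the factory's public constructor merely stands in for whatever the O/R mapper would do when it loads the factory's record and follows the foreign key.

```csharp
public abstract class Product
{
    protected Product()
    {
    }
}

public interface IProductFactory
{
    Product Make();
}

// Hypothetical entity that already exists in the database.
public class Supplier
{
    public string Name { get; private set; }

    public Supplier(string name)
    {
        Name = name;
    }
}

// Hypothetical product that cannot be built without a Supplier.
public class Chair : Product
{
    public Supplier SuppliedBy { get; private set; }

    public Chair(Supplier supplier) : base()
    {
        SuppliedBy = supplier;
    }
}

public sealed class ChairFactory : IProductFactory
{
    // In the design described above, the mapper would populate this field
    // by following a foreign key; here it is passed in by hand.
    readonly Supplier defaultSupplier;

    public ChairFactory(Supplier defaultSupplier)
    {
        this.defaultSupplier = defaultSupplier;
    }

    public Product Make()
    {
        return new Chair(defaultSupplier);
    }
}
```

The controller never touches the Supplier: it asks the factory for a product and receives a Chair that is already wired to the right database object.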

Since factories are generally singletons, it would be nice if O/R mappers provided special loading functions:
public interface ISession
{
    T Load<T>(object id);
    T Singleton<T>();
}
This models an O/R mapper session interface after the one in NHibernate. Note the special Singleton() method, which simply loads the singleton of the given type without needing an object identifier.
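To illustrate the intended semantics, here is a minimal in-memory sketch of such a session. It is not NHibernate's API: InMemorySession, its Register method, and the dictionary of column values standing in for a single database record are all assumptions made for this example, as are the public properties on Table. The sketch fills the factory's private fields through reflection, roughly as a mapper might, and caches the instance so repeated calls return the same object.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface ISession
{
    T Load<T>(object id);
    T Singleton<T>();
}

// Hypothetical stand-in for a real mapper session.
public sealed class InMemorySession : ISession
{
    // One "record" of column values per singleton type.
    readonly Dictionary<Type, Dictionary<string, object>> records =
        new Dictionary<Type, Dictionary<string, object>>();
    // Singletons already loaded in this session.
    readonly Dictionary<Type, object> loaded = new Dictionary<Type, object>();

    public void Register<T>(Dictionary<string, object> record)
    {
        records[typeof(T)] = record;
    }

    public T Load<T>(object id)
    {
        // Keyed loading is out of scope for this sketch.
        throw new NotSupportedException();
    }

    public T Singleton<T>()
    {
        object instance;
        if (!loaded.TryGetValue(typeof(T), out instance))
        {
            // Create the object without arguments, then copy each column
            // into the private instance field of the same name.
            instance = Activator.CreateInstance(typeof(T), true);
            foreach (KeyValuePair<string, object> column in records[typeof(T)])
            {
                FieldInfo field = typeof(T).GetField(column.Key,
                    BindingFlags.Instance | BindingFlags.NonPublic);
                field.SetValue(instance, column.Value);
            }
            loaded[typeof(T)] = instance;
        }
        return (T)instance;
    }
}

public abstract class Product
{
    protected Product() { }
}

public class Table : Product
{
    // Properties added only so the result can be inspected.
    public int Length { get; private set; }
    public int Width { get; private set; }

    public Table(int length, int width) : base()
    {
        Length = length;
        Width = width;
    }
}

public interface IProductFactory
{
    Product Make();
}

public sealed class TableFactory : IProductFactory
{
    // Filled in by the session from the registered record.
    int defaultLength;
    int defaultWidth;

    public Product Make()
    {
        return new Table(defaultLength, defaultWidth);
    }
}
```

With a record registered for TableFactory (say, defaultLength and defaultWidth columns), session.Singleton<TableFactory>() returns the same hydrated factory on every call, and each Make() call builds a fresh Table from the stored defaults.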

Our controller code is thus reduced to:
...
Product table = session.Singleton<TableFactory>().Make();
...
This encapsulates all the constructor details in application objects, does not hardcode any default values, since they live in the database and can be changed on the fly, isolates refactorings which alter the Table constructor interface to the TableFactory alone, and simplifies controller code, as the controller doesn't need to load any of the objects the constructor requires. This is a "pure" object-oriented design, in that the application can almost bootstrap itself, instead of relying on its environment to properly endow it with "god-given" defaults.

This approach also enables another useful application pattern which I may describe in a future post.

[Edit: I've just realized that the above is misleading in some parts, so I'll amend soon. Singletons aren't needed as much as I suggest above.]

Comments

John Zabroski said…
Saw your edit just now... not sure when you added it, but:

If the Factory doesn't retain any state, then every method it exposes should allow for combinator composition so that the object it materializes can be composed in terms of its facets.

In this way speaking about Singletons doesn't necessarily make sense; TableFactory will only need to have one instance when it doesn't retain state, because it is sealed. The combinators don't need to retain state, either.

If state is retained, it is Monostate, due to uniquing object references based on object identities. It is not a good use case for Singleton, either.

If anything the runtime should manage this by realizing all the methods are static and only exist to provide combinators for materializing a concrete object. If it can see that the class is sealed and the methods are static, then it can automatically apply a Singleton, possibly through a Flyweight runtime VM service.
Sandro Magi said…
I believe I added that edit over a year ago. Can't even remember specifically what misleading parts I was referring to.

Regarding your suggestions, I agree it would be ideal if the language or its runtime could optimize these cases automatically. No sense making the user specify information that is already semantically available.

I suppose factories in the kind of data-driven programs I was thinking of could have internal mutable state modified during the construction of an object, though I haven't come across such a situation myself.
