Thursday, September 23, 2010

High-level constructs for low-level C: exception handling, RAII, sum types and pattern matching

Many other languages offer high-level abstractions that make programming easier and less error-prone: automatic memory management, pattern matching, exceptions, higher-order functions, and so on. Each of these features enables the developer to reason about program behaviour at a higher level, and to factor out common behaviour into separate but composable units.

For fun, I've created a few small macro headers that enable some of these patterns in pure C. If anyone sees any portability issues, please let me know!

libex: Exception Handling and RAII


RAII in C is definitely possible via a well-known pattern used everywhere in the Linux kernel. It's a great way to organize code, but the program logic, the finalization logic and the error logic are not syntactically distinct. You have to interpret the surrounding context to identify the error conditions, and when and how finalization is triggered.
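For readers who haven't seen it, here is a minimal sketch of that pattern, assuming it refers to the usual goto-based cleanup idiom. This is not code from the kernel or from libex; the function process, its labels and its "reject embedded NUL bytes" step are made up purely for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copy `len` bytes of `src` through two scratch
 * buffers. Returns 0 on success or an errno-style code on failure.
 * Every exit path releases exactly what was acquired before it. */
int process(const char *src, size_t len) {
    int rc = 0;
    char *a = NULL;
    char *b = NULL;

    a = malloc(len + 1);
    if (a == NULL) { rc = ENOMEM; goto out; }

    b = malloc(len + 1);
    if (b == NULL) { rc = ENOMEM; goto free_a; }

    /* A later step that can fail after both allocations succeeded:
     * here, input containing a NUL byte is rejected. */
    if (memchr(src, '\0', len) != NULL) { rc = EINVAL; goto free_b; }

    memcpy(a, src, len);
    memcpy(b, a, len);

    /* Success falls through the same labels as the error paths. */
free_b:
    free(b);
free_a:
    free(a);
out:
    return rc;
}
```

Notice that nothing in the syntax separates the "real" work from the error checks and the cleanup labels, which is exactly the readability problem described above.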

To address this, I encapsulated this RAII pattern in a macro library called libex, with extensions to support arbitrary local exceptions, and a small set of pre-defined exception types. Currently, the pre-defined types are just more readable versions of the error codes in errno.h.

No setjmp/longjmp is used, and libex provides little beyond case checking and finalization, because I wanted to provide zero-overhead exception handling and RAII that can supplant all uses of the undecorated pattern. Replacing all instances of the RAII pattern in Linux with these macro calls would incur little to no additional overhead, as they compile down to a small number of direct branches.

There are also some convenience macros for performing common checks, such as MAYBE, which checks for NULL, and ERROR, which checks for a non-zero value.
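To give a feel for how checks like these can reduce to plain branches, here is a rough sketch. It is not libex's implementation: CHECK_NONNULL, CHECK_ZERO, validate and dup_checked are hypothetical names invented for this example, and the real macros may well be structured differently.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers, NOT libex's actual macros. Each one records an
 * error code in a local `err` and jumps to a local `handle` label, so
 * the expansion is a single compare and a direct branch. */
#define CHECK_NONNULL(expr, code) \
    do { if ((expr) == NULL) { err = (code); goto handle; } } while (0)
#define CHECK_ZERO(expr) \
    do { int rc_ = (expr); if (rc_ != 0) { err = rc_; goto handle; } } while (0)

/* Hypothetical callee that reports failure with a non-zero errno code. */
static int validate(const char *s) {
    return (s[0] == '\0') ? EINVAL : 0;
}

/* Duplicates `s` into `*out`. Returns 0 or an errno-style code. */
int dup_checked(const char *s, char **out) {
    int err = 0;
    char *copy = NULL;

    CHECK_ZERO(validate(s));
    CHECK_NONNULL(copy = malloc(strlen(s) + 1), ENOMEM);
    strcpy(copy, s);
    *out = copy;
    return 0;

handle:
    free(copy);   /* free(NULL) is a no-op */
    *out = NULL;
    return err;
}
```

The happy path reads top to bottom, and all failure handling lives under a single label. No setjmp/longjmp is involved in this sketch either.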

Example:
exc_type test(int i) {
  THROWS(EUnrecoverable)
  TRY(char *foo) {
    MAYBE(foo = (char*)malloc(123), errno);
  } IN {
    // ... no exception was raised, so compute something with foo
    // if EUnrecoverable thrown, it will propagate to caller
    if (some_condition()) THROW(EUnrecoverable)
  } HANDLE CATCH (EOutOfMemory) {
    // ... handle error for foo
  } CATCHANY {
    // ... other errors?
  } FINALLY {
    // ... finalize any state that has already been allocated
  }
  DONE
}

There are a few small restrictions required for the full exception semantics to work, so please see the main page of libex for further details.

libsum: Pattern matching and sum types, aka disjoint/tagged/discriminated unions, aka variants


Functional languages have long enjoyed the succinct and natural construction and deconstruction of data structures via sum types and pattern matching. Now you can have some of that power via a few simple macros:
/* declare a sum type and its constructor tags */
SUM(foo) {
  foo_one,
  foo_two,
};
/* declare each sum case */
CASE(foo, foo_one) { int i; char c; };
CASE(foo, foo_two) { double d; };

void do_bar(foo f) {
  MATCH(f) {
    AS(foo_one, y) printf("foo_one: %d, %c\n", y->i, y->c);
    /* d is a double, so it needs %f rather than %d */
    AS(foo_two, y) printf("foo_two: %f\n", y->d);
    MATCHANY
      fprintf(stderr, "No such case!");
      exit(1);
  }
}

int main(int argc, char** argv) {
  foo f;
  LET(f, foo_one, (3, 'g')); /* (3,'g') is an initializer */
  do_bar(f);
}

There are a few small requirements and caveats, e.g. LET performs dynamic memory allocation. Please see the main libsum page for further details.
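For comparison, this is roughly what a sum type looks like when written by hand in plain C, as a tag plus a union. It is only a sketch of the general technique, not what the libsum macros actually expand to. The shape type and its functions are hypothetical, and the constructors allocate on the heap only to mirror the caveat about LET above.

```c
#include <stdlib.h>

/* Hand-written tagged union. All names here are hypothetical. */
enum shape_tag { SHAPE_CIRCLE, SHAPE_RECT };

struct shape {
    enum shape_tag tag;            /* says which union member is live */
    union {
        struct { double radius; } circle;
        struct { double w, h; } rect;
    } u;
};

/* Constructors: allocate, set the tag, fill in the matching case.
 * They return NULL if allocation fails; the caller frees the result. */
struct shape *shape_circle(double radius) {
    struct shape *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->tag = SHAPE_CIRCLE;
    s->u.circle.radius = radius;
    return s;
}

struct shape *shape_rect(double w, double h) {
    struct shape *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->tag = SHAPE_RECT;
    s->u.rect.w = w;
    s->u.rect.h = h;
    return s;
}

/* "Pattern matching" by hand: switch on the tag, then touch only the
 * union member that belongs to that tag. */
double shape_area(const struct shape *s) {
    switch (s->tag) {
    case SHAPE_CIRCLE:
        return 3.141592653589793 * s->u.circle.radius * s->u.circle.radius;
    case SHAPE_RECT:
        return s->u.rect.w * s->u.rect.h;
    }
    return -1.0; /* not reached for a valid tag */
}
```

Nothing stops hand-written code from reading u.rect while the tag says SHAPE_CIRCLE, which is the kind of mistake a MATCH/AS style of deconstruction makes harder to write.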

License


My default license is the LGPL, but since these are macro libraries that's probably not an appropriate choice, given there is no binary that can be replaced at runtime (one of the requirements of the LGPL). I like the freedoms afforded by the LGPL though, so I'm open to alternate suggestions with similar terms. I will also consider the MIT license if there are no viable alternatives.