Tuesday, February 24, 2009

Sasa Reborn!

My .NET Sasa library fell by the wayside while I experimented with translating various functional idioms in my FP# library. Reading up on what a few other generic class libraries, like Mono Rocks, have been experimenting with spurred me to put those experiments to use and update Sasa. I significantly simplified a lot of the code, documented every class and method, and generalized as much as possible. The license is LGPL v2, and you can download the source via svn:
svn co https://sasa.svn.sourceforge.net/svnroot/sasa/tags/v0.8 sasa

Sasa Core v0.8


A set of useful extensions to core System classes and some useful classes for high assurance development.
  • Named tuple types: Pair, Triple, Quad.

  • Either types, representing one of several possible values. There are Either types for 2, 3, and 4 parameters, mirroring the Pair, Triple, and Quad structure. Tuples are "product" types, while Either is a "sum" type, and products and sums are duals. Since products are useful, I figured variously sized sum types might also find some uses. Time will tell.

  • Lazy type, for lazily computed values.

  • An immutable list.

  • Various Ruby-like extensions to core types, such as the generators int.UpTo and int.DownTo, plus string.IsNullOrEmpty, string.Slice, etc.

  • Useful extensions to IEnumerable.

  • "Zip" functions from Haskell for anonymous types and tuple types.

  • A NonNull type which decorates method parameters and ensures those parameters are not null; if NonNull is used pervasively, you can ensure that your program is free of NullReferenceExceptions.

  • An Option type indicating values which may be null. Unlike System.Nullable, this works for class types.

  • Function currying extensions, and extensions to lift multi-parameter functions to single-parameter tupled functions.

  • Some convenience extensions to IDictionary.
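
To make the product/sum distinction above concrete, here is a minimal sketch of a two-case sum type. It is not Sasa's actual API: the names Either.Left, Either.Right, Match, and the EitherDemo helper are assumptions made up for illustration. A Pair holds both values at once, whereas this Either holds exactly one of the two, and the caller has to handle both cases to get at it.

```csharp
using System;

// Hypothetical two-case sum type; names are illustrative, not Sasa's real API.
public sealed class Either<TLeft, TRight>
{
    private readonly bool isLeft;
    private readonly TLeft left;
    private readonly TRight right;

    private Either(bool isLeft, TLeft left, TRight right)
    {
        this.isLeft = isLeft;
        this.left = left;
        this.right = right;
    }

    public static Either<TLeft, TRight> Left(TLeft value)
    {
        return new Either<TLeft, TRight>(true, value, default(TRight));
    }

    public static Either<TLeft, TRight> Right(TRight value)
    {
        return new Either<TLeft, TRight>(false, default(TLeft), value);
    }

    // The only way to read the value: the caller must supply a handler for each case.
    public TResult Match<TResult>(Func<TLeft, TResult> onLeft, Func<TRight, TResult> onRight)
    {
        return isLeft ? onLeft(left) : onRight(right);
    }
}

public static class EitherDemo
{
    // Returns an error message on the Left, or the parsed value on the Right.
    public static Either<string, int> ParseInt(string text)
    {
        int value;
        return int.TryParse(text, out value)
            ? Either<string, int>.Right(value)
            : Either<string, int>.Left("not an integer: " + text);
    }

    public static string Describe(string text)
    {
        return ParseInt(text).Match(
            error => "error: " + error,
            value => "parsed " + value);
    }
}
```

Calling EitherDemo.Describe("42") takes the Right branch, while EitherDemo.Describe("abc") takes the Left branch and reports the error.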

Sasa.Linq


A stand-alone assembly for Linq development.
  • Default IQueryProvider and IQueryable implementations.

  • Generic ExpressionVisitor base class.

  • IdentityVisitor which provides default implementations for all NodeTypes and performs no modifications to the expression, just returning it as-is.

  • ErrorVisitor which throws NotSupportedException for all NodeTypes.
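
As a rough illustration of the visitor pattern these classes are built around, the sketch below uses System.Linq.Expressions.ExpressionVisitor, the base class the framework itself made public in .NET 4.0, rather than Sasa's own ExpressionVisitor, whose members may differ. The framework class already behaves like an identity visitor, returning each node unchanged unless a Visit method is overridden. AddToMultiplyVisitor and VisitorDemo are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;

// Rewrites every addition node into a multiplication; all other nodes pass through unchanged.
public sealed class AddToMultiplyVisitor : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        // Visit the operands first so nested additions are rewritten too.
        Expression left = Visit(node.Left);
        Expression right = Visit(node.Right);

        if (node.NodeType == ExpressionType.Add)
            return Expression.Multiply(left, right);

        // Update returns the same node when neither operand changed.
        return node.Update(left, node.Conversion, right);
    }
}

public static class VisitorDemo
{
    // Builds (a, b) => a + b, rewrites it to (a, b) => a * b, then compiles and runs it.
    public static int Run(int x, int y)
    {
        Expression<Func<int, int, int>> sum = (a, b) => a + b;
        var rewritten = (Expression<Func<int, int, int>>)new AddToMultiplyVisitor().Visit(sum);
        return rewritten.Compile()(x, y);
    }
}
```

Overriding a single Visit method and inheriting the rest is the convenience an identity-style base class provides. An error-style base class inverts the default, so any node type that is not explicitly handled fails loudly.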

Sasa.Serialization


A stand-alone assembly with serialization classes.
  • Provides a compact serializer which requires only ReflectionPermission, and not SecurityPermission like the System classes do; this serializer can therefore be used in medium trust environments. The serializer currently requires a little more discipline from the developer to use correctly, but its output is compact: the standard framework serializers typically produce output 100-200% larger.

  • An experimental unsafe, highly compact binary serializer.

Sasa.Net


A library providing missing functionality under System.Net.
  • A Pop3Client class.

  • MailMessage parsing.

Sasa.CodeContracts


Microsoft Research is developing a design by contract library which they hope to release with .NET 4.0. It's a fairly sophisticated piece of software that integrates with a static verification tool called Pex. The analysis tools can detect contract violations at compile-time, and even generate test cases for each violation.

Unfortunately, their license forbids commercial application of the pre-release library, even if you just want to utilize runtime contract checking.

Sasa.CodeContracts is a Microsoft API-compatible implementation of the CodeContracts library. This is only a runtime library, and does not provide the Pex integration with static analysis and automated test generation.

Precondition checking is enabled, but postconditions and object invariants require CIL rewriting, so they are not currently supported. In the future, I will look into using Mono.Cecil to rewrite the IL to support postconditions and invariants.
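
A minimal sketch of what runtime-only precondition checking amounts to is shown below. MiniContract and Account are hypothetical names for illustration; this is neither the Microsoft API nor Sasa.CodeContracts itself. A precondition is just a check executed on entry, so it needs no rewriting. A postcondition declared at the top of a method would have to run at every exit point, which is where IL rewriting comes in.

```csharp
using System;

// Hypothetical stand-in for a runtime-only contract helper.
public static class MiniContract
{
    // Throws immediately if the caller violated the stated precondition.
    public static void Requires(bool condition, string message)
    {
        if (!condition)
            throw new ArgumentException("Precondition failed: " + message);
    }
}

public static class Account
{
    public static decimal Withdraw(decimal balance, decimal amount)
    {
        // Preconditions run on entry, so plain method calls are enough.
        MiniContract.Requires(amount > 0, "amount > 0");
        MiniContract.Requires(amount <= balance, "amount <= balance");

        // A postcondition such as "result >= 0" would need to be checked at
        // each return; without a rewriter it has to be written out by hand.
        return balance - amount;
    }
}
```

Here Account.Withdraw(100m, 30m) returns normally, while Account.Withdraw(10m, 20m) throws an ArgumentException before any work is done.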

TODO for v1.0


There are a few items remaining before v1.0 is released, but the library is usable as-is. Notably missing is MIME parsing for MailMessage, which will be added for v1.0. Serialization will also gain improved safety, almost on par with standard framework serialization, and the compaction will be user-customizable for even more space savings in any given program.

Future Work


The Sasa API is fully documented with accompanying XML for code completion. Comments on the clarity of the API and documentation are welcome! Some tutorials on using these features safely are coming as well.

I'm dissatisfied with a few areas of the CLR and the approaches being pursued there, and I have some ideas I'd like to explore:
  • Current approaches to parallel and concurrent programming, even Microsoft's Parallel Extensions and the Concurrency and Coordination Runtime.

  • CLR security is far too coarse-grained and pretty much unusable.

  • Efficient async I/O is too difficult to reason about (though the CCR does make it easier).

  • In lieu of a Pex static analysis, there is the possibility of QuickCheck-like test suites derived from CodeContract annotations.

Keep an eye on this space for what I come up with.

Tuesday, February 3, 2009

The cost of type tests and casts in C#

A while back I ran some tests comparing the dispatching performance of runtime tests+casts against double dispatch. It turned out that runtime type tests and casting were noticeably faster than double dispatch, probably because they avoid more pipeline stalls.

Unfortunately, there is a "common wisdom" in the .NET world that an "is" test followed by an "as" cast performs two casts, and that one should instead perform the "as" cast and then check the result against null:
// prefer this form
string a = o as string;
if (a != null)
{
    Console.WriteLine(a);
}

// to this form:
if (o is string)
{
    string a = o as string;
    Console.WriteLine(a);
}

In fact, that's not the case, as any compiler worth its salt will coalesce the two tests into a single cast and branch operation. I took the tests from the dispatching comparison above and altered them to perform the cast-and-null check, then ran them again with the original is-then-as form. The is-then-as form was about 6% faster on every timing run.

There's obviously some optimization being done here, but the lesson is: don't try to outsmart the compiler. In general, just write code the safe way and let the compiler optimize it for you. It's safer to test and then cast within a delimited scope like an if-statement than to let the possibly null variable float around in the outer scope, where you might use it inadvertently later in the method or during refactoring.

If performance turns out to be an issue, profile before trying these sorts of low-level "optimizations", because you might be surprised at the results.
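
For anyone who wants to measure this themselves, here is a minimal sketch of such a comparison. It is not the original test harness: CastBenchmark, MakeItems, and the array-of-objects workload are assumptions chosen for illustration. Results will vary with the runtime, JIT, and hardware, so treat any numbers it prints as specific to your machine.

```csharp
using System;
using System.Diagnostics;

public static class CastBenchmark
{
    // Form 1: "as" cast followed by a null check.
    public static int CountAsThenNull(object[] items)
    {
        int count = 0;
        foreach (object o in items)
        {
            string s = o as string;
            if (s != null) count += s.Length;
        }
        return count;
    }

    // Form 2: "is" test followed by an "as" cast inside the guarded scope.
    public static int CountIsThenAs(object[] items)
    {
        int count = 0;
        foreach (object o in items)
        {
            if (o is string)
            {
                string s = o as string;
                count += s.Length;
            }
        }
        return count;
    }

    // Alternates strings and boxed ints so both outcomes of the type test occur.
    public static object[] MakeItems(int n)
    {
        object[] items = new object[n];
        for (int i = 0; i < n; i++)
            items[i] = (i % 2 == 0) ? (object)"ab" : (object)i;
        return items;
    }

    public static TimeSpan Time(Func<object[], int> test, object[] items, int iterations)
    {
        test(items); // warm up so JIT compilation is not part of the measurement
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) test(items);
        watch.Stop();
        return watch.Elapsed;
    }

    // Call this from your program's entry point to print both timings.
    public static void Run()
    {
        object[] items = MakeItems(100000);
        Console.WriteLine("as-then-null: " + Time(CountAsThenNull, items, 200));
        Console.WriteLine("is-then-as:   " + Time(CountIsThenAs, items, 200));
    }
}
```

Build in release mode and run it several times before drawing conclusions; a single run of a micro-benchmark like this is easily dominated by noise.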