Skip to main content

Immutable Sasa.Collections.Tree vs. System.Collections.Dictionary vs. C5 HashDictionary

I've previously posted about Sasa's hash-array mapped trie, but I never posted any benchmarks. I recently came across this post on Stackoverflow which provided a decent basic benchmark between .NET's default Dictionary<TKey, TValue>, the C5 collection's hash dictionary, F#'s immutable map, and .NET's new immutable collections.

I slightly modified the file to remove the bench against the F# map and the new immutable collections since I'm still using VS 2010, and I added a simple warmup phase to ensure the methods have all been JIT compiled and the GC run to avoid introducing noise:

static void Warmup()
{
    var x = Tree.Make<string, object>();
    var y = new C5.HashDictionary<string, object>();
    var z = new Dictionary<string, object>();
    z.Add("foo", "bar");
    for (var i = 0; i < 100; ++i)
    {
        x = x.Add("foo" + i, "bar");
        y.Add("foo" + i, "bar");
        z.Add("foo" + i, "bar");
        var tmp1 = x["foo" + i];
        var tmp2 = y["foo" + i];
        var tmp3 = z["foo" + i];
    }
    x = default(Tree<string, object>);
    y = null;
    z = null;
    GC.Collect();
}

The results are still somewhat representative. This is a sample of an average output, where "Imm" is Sasa's immutable HAMT:

# - 100
SCGD -          0 MS -         25 Ticks
C5   -          0 MS -        887 Ticks
Imm  -          0 MS -        387 Ticks

# - 1000
SCGD -          0 MS -        257 Ticks
C5   -          0 MS -        294 Ticks
Imm  -          0 MS -        368 Ticks

# - 10000
SCGD -          1 MS -       4084 Ticks
C5   -          1 MS -       5182 Ticks
Imm  -          1 MS -       5436 Ticks

# - 100000
SCGD -         28 MS -      85742 Ticks
C5   -         32 MS -      99280 Ticks
Imm  -         32 MS -      97720 Ticks

Observations:

  1. C5's standard deviation was somewhat wider than both Sasa's HAMT and SCGD, so it's performance seems slightly less predictable
  2. Sasa's immutable HAMT appears to perform within 5% of the mutable C5 collection at all collection sizes
  3. Sasa's immutable HAMT appears to perform within 15% of the mutable SCGD for large collections where the hash table with higher load factors
  4. Small collections requiring a small load factor clearly advantage the mutable SCGD by up to an order of magnitude, an advantage not shared by C5 for some reason (possibly they maintain a higher load factor)
  5. C5's terrible performance on very small collections of 100 items was consistent on every test run, again possibly because they maintain a high load factor before resizing
  6. Sasa's HAMT takes just as much time to load 1000 items as it takes to load 100 items; this was consistent across every test run, and it's not clear why

Finally, while not exactly apples-to-apples, Sasa's HAMT is easily 3-4× faster than F#'s map given the numbers cited in the above Stackoverflow post. F# still has an advantage for very small collections though. Sasa's HAMT also appears to be at least 2× faster than the new immutable collections.

Also keep in mind that this benchmark only tests lookup performance. F#'s map would have an advantage over Sasa's HAMT in load performance because the HAMT does not yet include a "bulk-load" operation, which the F# map does appear to support.

Comments

Popular posts from this blog

Easy Reverse Mode Automatic Differentiation in C#

Continuing from my last post on implementing forward-mode automatic differentiation (AD) using C# operator overloading , this is just a quick follow-up showing how easy reverse mode is to achieve, and why it's important. Why Reverse Mode Automatic Differentiation? As explained in the last post, the vector representation of forward-mode AD can compute the derivatives of all parameter simultaneously, but it does so with considerable space cost: each operation creates a vector computing the derivative of each parameter. So N parameters with M operations would allocation O(N*M) space. It turns out, this is unnecessary! Reverse mode AD allocates only O(N+M) space to compute the derivatives of N parameters across M operations. In general, forward mode AD is best suited to differentiating functions of type: R → R N That is, functions of 1 parameter that compute multiple outputs. Reverse mode AD is suited to the dual scenario: R N → R That is, functions of many parameters t...

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre...

Blue-eyed Islander Puzzle - an analysis

Many people find themselves stumped by the so-called Blue-Eyed Islanders puzzle . There is also much controversy over its supposed solution. I'm going to analyze the problem and the solution, and in the process, explain why the solution works. To begin, let's modify the problem slightly and say that there's only 1 blue-eyed islander. When the foreigner makes his pronouncement, the blue-eyed islander looks around and sees no other blue eyes, and being logical, correctly deduces that his own eyes must be blue in order for the foreigner's statement to make sense. The lone blue-eyed islander thus commits suicide the following day at noon. Now comes the tricky part, and the source of much confusion. Let's say there are 2 blue-eyed islanders, Mort and Bob. When the foreigner makes his pronouncement, Mort and Bob look around and see only each other. Mort and Bob thus both temporarily assume that the other will commit suicide the following day at noon. Imagine their chagrin...