Skip to main content

Prim's Minimum Spanning Tree

Taking a break from Sasa blog posts, I ran into a problem that would be nicely solved by extracting the minimum spanning tree. Turns out this is pretty simple to solve naively in C# using only IEnumerable<T>:

/// <summary>
/// Computes the minimum spanning tree.
/// </summary>
/// <param name="edges">The list of edges.</param>
/// <remarks>
/// Given a list of edges whose vertices are consecutive numbers starting
/// from 0, this extension method returns an enumerable sequence of the
/// edges describing the minimum spanning tree.
/// </remarks>
public static IEnumerable<Edge> MinSpanTree(this IEnumerable<Edge> edges)
{
    var span = edges.ToList();
    var count = 1 + span.Max(x => Math.Max(x.From, x.To));
    edges.Sort();
    var vnew = new BitArray(count);
        vnew[0] = true;
    for (var i = 0; i < count; ++i)
    {
        foreach (var x in edges)
        {
            if (vnew[x.From] ^ vnew[x.To])
            {
                yield return x;
                vnew[x.From] = vnew[x.To] = true;
                break;
            }
        }
    }
}

An Edge is a simple class that encapsulates two vertex numbers and a weight represented by a double. It's also IComparable, so we can sort on it using the weight.

Prim's algorithm selects the minimum weighted edge between a vertex we have not yet accepted, and a vertex that we have accepted. We use a bit array to track the vertex numbers we've accepted, and simply search weight-sorted edge list for the first entry where one vertex is accepted, and the other is not. This is an O(V2), where V is the number of vertices. We could also contract the list after each accepted edge, but that doesn't change the overall time complexity.

A simple adjacency matrix representation using a two-dimensional array of doubles is also easily definable:

/// <summary>
/// Computes the minimum spanning tree.
/// </summary>
/// <param name="edges"></param>
/// <remarks>
/// Given an adjacency matrix describing the weighted edges, where
/// vertices are the indices of the matrix, this extension method
/// returns an enumerable sequence of the edges describing the
/// minimum spanning tree.
/// 
/// Vertices that have no edges between them should have
/// <see cref="double.NaN"/> or <see
/// cref="double.PositiveInfinity"/> in the corresponding
/// matrix entry.
/// </remarks>
public static IEnumerable<Edge> MinSpanTree(this double[,] edges)
{
    if (edges.RowCount() != edges.ColumnCount())
        throw new ArgumentException("Matrix row count must equal column count.", "edges");
    // convert the adjacency matrix into a list of edges
    var count = edges.RowCount();
    var span = new List<Edge>(count);
    for (int i = count - 1; i >= 0; --i)
    {
        for (int j = 0; j < i; ++j)
        {
            var w = edges[i, j];
            if (!double.IsNaN(w) && w < double.PositiveInfinity)
                span.Add(new Edge(i, j, w));
        }
    }
    return span.MinimumSpanningTree();
}

I've included these implementations in Sasa.Numerics.dll for the v0.9.4 release.

Comments

Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre...

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD) , and decided to take a crack at distilling its essence in C#. Note that automatic differentiation (AD) is different than numerical differentiation . Math.NET already provides excellent support for numerical differentiation . C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries . Neither of these are suitable for learning how AD works. So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique. What is Automatic Differentiation? Simply put, automatic differentiation is a technique for calcu...

Easy Reverse Mode Automatic Differentiation in C#

Continuing from my last post on implementing forward-mode automatic differentiation (AD) using C# operator overloading , this is just a quick follow-up showing how easy reverse mode is to achieve, and why it's important. Why Reverse Mode Automatic Differentiation? As explained in the last post, the vector representation of forward-mode AD can compute the derivatives of all parameter simultaneously, but it does so with considerable space cost: each operation creates a vector computing the derivative of each parameter. So N parameters with M operations would allocation O(N*M) space. It turns out, this is unnecessary! Reverse mode AD allocates only O(N+M) space to compute the derivatives of N parameters across M operations. In general, forward mode AD is best suited to differentiating functions of type: R → R N That is, functions of 1 parameter that compute multiple outputs. Reverse mode AD is suited to the dual scenario: R N → R That is, functions of many parameters t...