Friday, October 19, 2012

AspQ - A JavaScript Event Queue for ASP.NET

A nearly universal problem in ASP.NET is handling multiple postbacks. There are various server-side and client-side solutions to deal with this, but these solutions can only be understood once you already have the domain knowledge to solve the problem yourself. You then end up repeating the pattern in every project.

Well, programming is about not repeating yourself and automating repetitive tasks. So I've released a reusable abstraction to handle multiple postbacks on the client-side. Because it's a client-side solution, it will only work for browsers with JavaScript enabled, but that's by far the most common case.

AspQ

AspQ is a small JavaScript object that hooks into the ASP.NET JS standard and AJAX runtimes. It basically queues all sync and async postback requests so they don't interfere with one another, and applies the updates in event order. I believe this option provides a superior user experience to simply preventing a submit until the previous postback has completed, because the user can still operate on the UI while they wait.

There have been quite a few incarnations of this idea in the wild, but I've found them all lacking. Either they didn't work for LinkButtons, or they didn't work with Master pages, or they didn't handle sync postbacks. AspQ should do them all.

So go ahead and download AspQ.js from the repository and save it in your site's script directory. You then have a choice of the following two methods to use it.

Method 1

Just include the JS in your pages:

<script type="text/javascript" src="/path/to/AspQ.js"></script>
Then place the following server-side code in the OnInit method of your System.Web.UI.Page:
override protected void OnInit(EventArgs e)
{
  ...
  // register form submission script
  this.ClientScript.RegisterOnSubmitStatement(GetType(),
    "PreventDuplicateSubmits", "return AspQ.submit(this);");
  ...
}

Method 2

This method requires only server-side changes, again in the OnInit method:

override protected void OnInit(EventArgs e)
{
  ...
  // include script site-wide
  var script = new HtmlGenericControl("script");
  script.Attributes.Add("type", "text/javascript");
  script.Attributes.Add("src", "/path/to/AspQ.js");
  this.Header.Controls.Add(script);

  // register form submission script
  this.ClientScript.RegisterOnSubmitStatement(GetType(),
    "PreventDuplicateSubmits", "return AspQ.submit(this);");
  ...
}

AspQ is simple to set up and seems adequate for most purposes. If you have a use case that it doesn't handle, please let me know!

It's released under the LGPL, which is my default OSS license, but I'm open to discussion on the issue since the LGPL may not be appropriate for a JavaScript library.

Edit: here's a demo showing a simple counter. Press the +/- buttons as many times and as fast as you like; the updates will all be applied, in the proper order.

Saturday, September 29, 2012

Can't Catch Exceptions When Invoking Methods Via Reflection in .NET 4

I just tried updating the Sasa build to .NET 4 and ran into a bizarre problem. Basically, the following code throws an exception when running under .NET 4 that isn't caught by a general exception handler:

static int ThrowError()
{
    throw new InvalidOperationException(); // debugger breaks here
}
public static void Main(string[] args)
{
    var ethrow = new Func<int>(ThrowError).Method;
    try
    {
        ethrow.Invoke(null, null);
    }
    catch (Exception)
    {
        // general exception handler doesn't work
    }
}

Turns out this is controlled by a Visual Studio debugger setting. Given the description there, whatever hooks VS has into the runtime when transitioning between native and managed code have changed their behaviour from .NET 3.5, so VS isn't aware of the general exception handler further up the stack and breaks immediately.

I can see this being handy if you're writing reflection-heavy code, as it breaks at the actual source of an exception instead of breaking at the dynamic method invocation as it would under .NET 3.5. It's just annoying when you are already handling it.
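
Incidentally, when invoking through MethodInfo.Invoke, the original exception reaches your handler wrapped in a TargetInvocationException. Independent of the debugger setting, a minimal sketch of unwrapping it looks like this:

try
{
    ethrow.Invoke(null, null);
}
catch (TargetInvocationException e)
{
    // e.InnerException is the original InvalidOperationException
    Console.WriteLine(e.InnerException.GetType().Name);
}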

Wednesday, August 29, 2012

Managed Data for .NET

Ensō is an interesting new language being developed by Alex Loh, William R. Cook, and Tijs van der Storm. The overarching goal is to significantly raise the level of abstraction, partly via declarative data models.

They recently published a paper on this subject for Onwards! 2012 titled Managed Data: Modular Strategies for Data Abstraction. Instead of programmers defining concrete classes, managed data requires the programmer to define a schema describing his data model, consisting of a description of the set of fields and field types. Actual implementations of this schema are provided by "data managers", which interpret the schema and add custom behaviour. This is conceptually similar to aspect-oriented programming, but with a safer, more principled foundation.

A data manager can implement any sort of field-like behaviour. The paper describes a few basic variants:

  • BasicRecord: implements a simple record with getters and setters.
  • LockableRecord: implements locking on a record, rendering it immutable.
  • InitRecord: implements field initialization on records.
  • ObserverRecord: implements the observer pattern, notifying listeners of any field changes.
  • DataflowRecord: registers field dependencies and recalculates dependent fields when the fields they depend on change.

Managed Data for .NET

The core idea of managed data requires two basic concepts, a declarative means of describing the schema, and a means of interpreting that schema to add behaviour. .NET interfaces are a means to specify simple declarative schemas completely divorced from implementations. The following interface can be seen as the IFoo schema containing an immutable integer field and a mutable string field:

// the schema for a data object
public interface IFoo
{
  int Bar { get; }
  string Fooz { get; set; }
}

Data managers then generate concrete instances of IFoo with the desired behaviour. To fit this into a typed framework, I had to reorganize the concepts a little from what appears in the paper:

// creates data instances with custom behaviour
public sealed class DataManager
{
  // create an instance of interface type T
  public T Create<T>();
}

I have a single DataManager type which analyzes the interface T and generates an instance with all the same properties as found in T. The DataManager constructor accepts an instance of ISchemaCompiler, which is where the actual magic happens:

public interface ISchemaCompiler
{
  // next compiler in the chain
  ISchemaCompiler Next { get; set; }
  // a new type is being defined
  void Type(TypeBuilder type);
  // a new property is being defined
  void Property(TypeBuilder type, PropertyBuilder property);
  // a new setter is being defined
  void Setter(PropertyBuilder prop, MethodBuilder setter,
              ILGenerator il);
  // a new getter is being defined
  void Getter(PropertyBuilder prop, MethodBuilder getter,
              ILGenerator il);
}

So DataManager creates a dynamic type implementing an interface, and it calls into the ISchemaCompiler chain while it's generating the various properties. The schema compilers can then output IL to customize the behaviour of the various property getters and setters.
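
To make the chaining concrete, here's a hedged sketch (my illustration, not code from the library) of a do-nothing compiler that simply forwards every callback to the next compiler in the chain; a real compiler would also emit IL through the ILGenerator it's handed:

using System.Reflection.Emit;

// pass-through compiler: forwards each callback to the next compiler
public sealed class NullCompiler : ISchemaCompiler
{
    public ISchemaCompiler Next { get; set; }
    public void Type(TypeBuilder type)
    {
        if (Next != null) Next.Type(type);
    }
    public void Property(TypeBuilder type, PropertyBuilder property)
    {
        if (Next != null) Next.Property(type, property);
    }
    public void Getter(PropertyBuilder prop, MethodBuilder getter, ILGenerator il)
    {
        // a real compiler would emit IL here, e.g. il.Emit(OpCodes.Ldarg_0), ...
        if (Next != null) Next.Getter(prop, getter, il);
    }
    public void Setter(PropertyBuilder prop, MethodBuilder setter, ILGenerator il)
    {
        if (Next != null) Next.Setter(prop, setter, il);
    }
}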

You'll note however that the IFoo schema has an immutable property "Bar". We can specify an initializer for this property using the Schema object that the DataManager uses:

var schema = new Schema();
schema.Type<IFoo>()
      .Default(x => x.Bar, x => 4);

This declares that the Bar property maps to a constant value of 4. It need not be a constant of course, since the initializer is an arbitrary delegate.
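
The initializer lambda runs when the instance is created, so a computed default might look like this (hypothetical example against the same Schema API):

// hypothetical: compute Bar at creation time instead of using a constant
schema.Type<IFoo>()
      .Default(x => x.Bar, x => DateTime.Now.Year);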

The following schema compilers are implemented and tested:

  • BasicRecord: implements the backing fields for the properties.
  • LockableRecord: unlike the paper's lockable record, this version actually calls Monitor.Enter and Monitor.Exit for use in concurrent scenarios.
  • NotifyChangedRecord: implements INotifyPropertyChanged on all properties.
  • ChangesOnlyRecord: only assigns the field if the value differs.

Developing programs with managed data consists of only defining interfaces describing your business model and allowing the DataManager to provide the instances. This is obviously also excellent for mocking and unit testing purposes, so it's a win all around.

Here's a simple test program that demonstrates the use of managed data via the composition of ChangesOnlyRecord, NotifyChangedRecord and BasicRecord:

var schema = new Schema();
schema.Type<IFoo>()
      .Default(x => x.Bar, x => 4);
// construct the data manager by composing schema compilers
var record = new BasicRecord();
var dm = new DataManager(schema, new ChangesOnlyRecord
{
    Record = record,
    Next = new NotifyChangedRecord { Next = record }
});
// create instance of IFoo
var y = dm.Create<IFoo>();
var inotify = y as INotifyPropertyChanged;
var bar = y.Bar;
var fooz = y.Fooz;
int count = 0;
Assert(bar == 4);
Assert(fooz == null);
// register notification for Fooz changes
inotify.PropertyChanged += (o, e) =>
{
    if (e.PropertyName == "Fooz")
    {
        fooz = y.Fooz;
        count++;
    }
};
// trigger change notification
y.Fooz = "Hello World!";
Assert(fooz == "Hello World!");
Assert(count == 1);
// no change notification since value unchanged
y.Fooz = "Hello World!";
Assert(count == 1);
// trigger second change notification
y.Fooz = "empty";
Assert(fooz == "empty");
Assert(count == 2);

Closing Thoughts

You can download the current implementation here, but note that it's still an alpha preview. I'll probably eventually integrate this with my Sasa framework under Sasa.Data, together with a few more elaborate data managers. For instance, a data manager that uses an SQL server as a backend. Say goodbye to NHibernate mapping files and LINQ attributes, and just let the data manager create and manage your tables!

Saturday, August 25, 2012

M3U.NET: Parsing and Output of .m3u files in .NET

I've been reorganizing my media library using the very cool MusicBrainz Picard, but of course all my m3u files broke. So I wrote the free M3U.NET library, and then wrote a utility called FixM3U that regenerates an M3U file by searching your music folder for the media files based on whatever extended M3U information is available:

> FixM3u.exe /order:title,artist foo.m3u bar.m3u ...

The M3U.NET library itself has a fairly simple interface:

// Parsing M3U files.
public static class M3u
{
  // Write a media list to an extended M3U file.
  public static string Write(IEnumerable<MediaFile> media);
  // Parse an M3U file.
  public static IEnumerable<MediaFile> Parse(
         string input,
         DirectiveOrder order);
  // Parse an M3U file.
  public static IEnumerable<MediaFile> Parse(
         IEnumerable<string> lines,
         DirectiveOrder order);
}

The 3 exported types are straightforward. A MediaFile just has a full path to the file itself and a list of directives supported by the extended M3U format:

// A media file description.
public sealed class MediaFile
{
    // The full absolute path to the file.
    public string Path { get; set; }
    // Extended M3U directives.
    public List<MediaDirective> Directives { get; set; }
}

The directives are represented in this library as key-value pairs:

// An extended M3U directive.
public struct MediaDirective
{
    // The directive name.
    public string Name { get; set; }
    // The directive value.
    public string Value { get; set; }
    // The separator delineating this field from the next.
    public char? Separator { get; set; }
}

The currently supported keys are "Artist", "Title" and "Length".

The M3U format is supposed to order directives as "length, artist - title", but iTunes seems to reverse the order of artist and title. I've thus made this configurable via a parsing parameter of type DirectiveOrder, and you can specify the ordering when parsing:

// The order of the title and artist directives.
public enum DirectiveOrder
{
    // Artist followed by title.
    ArtistTitle,
    // Title followed by artist.
    TitleArtist,
}
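
Putting it together, a hedged usage sketch of the API above (file paths are illustrative) reads a playlist whose extended directives are in iTunes order and writes it back out as extended M3U:

// parse an iTunes-ordered playlist and write it back out
var media = M3u.Parse(File.ReadAllText(@"C:\playlists\foo.m3u"),
                      DirectiveOrder.TitleArtist);
File.WriteAllText(@"C:\playlists\foo-fixed.m3u", M3u.Write(media));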

Monday, August 20, 2012

Delete Duplicate Files From the Command-line with .NET

Having run into a scenario where I had directories with many duplicate files, I just hacked up a simple command-line solution based on crypto signatures. It's the same idea used in source control systems like Git and Mercurial, basically the SHA-1 hash of a file's contents.

Sample usage:

DupDel.exe [target-directory]

The utility will recursively analyze any sub-directories under the target directory and build an index of all files based on their content. Once complete, duplicates are processed in an interactive manner where the user is presented with a choice of which duplicate to keep:

Keep which of the following duplicates:
1. \Some foo.txt
2. \bar\some other foo.doc
>

The types of files under the target directory are not important, so you can pass in directories to documents, music files, pictures, etc. My computer churned through 30 GB of data in about 5 minutes, so it's reasonably fast.
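
For the curious, here's a minimal sketch of the indexing idea (not the actual DupDel source): group every file under the target directory by the SHA-1 hash of its contents; any group with more than one member is a set of duplicates.

using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static class DupIndex
{
    // group files by the SHA-1 hash of their contents
    public static ILookup<string, string> ByContent(string dir)
    {
        using (var sha1 = SHA1.Create())
        {
            return Directory.EnumerateFiles(dir, "*", SearchOption.AllDirectories)
                            .ToLookup(file =>
                            {
                                using (var fs = File.OpenRead(file))
                                    return BitConverter.ToString(sha1.ComputeHash(fs));
                            });
        }
    }
}
// usage: DupIndex.ByContent(target).Where(g => g.Count() > 1) yields the duplicate sets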

Saturday, July 21, 2012

Simple, Extensible IoC in C#

I just committed the core of a simple dependency injection container to a standalone assembly, Sasa.IoC. The interface is pretty straightforward:

public static class Dependency
{
  // static, type-indexed operations
  public static T Resolve<T>();
  public static void Register<T>(Func<T> create);
  public static void Register<TInterface, TRegistrant>()
            where TRegistrant : TInterface, new();

  // dynamic, runtime type operations
  public static object Resolve(Type registrant);
  public static void Register(Type publicInterface, Type registrant,
                              params Type[] dependencies)
}

If you were ever curious about IoC, the Dependency class is only about 100 lines of code. You can even skip the dynamic operations and it's only ~50 lines of code. The dynamic operations then just use reflection to invoke the typed operations.

Dependency uses static generic fields, so resolution is pretty much just a field access + invoking a delegate. The reason for this speed and simplicity is that it's very light on features, like lifetime management, instance sharing, etc. It's really just the core for dependency injection.
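
To illustrate why that's all the machinery needed, here's a hedged sketch of the core idea (simplified, not the actual Sasa.IoC source): a nested static generic class holds one delegate per resolved type.

public static class Dependency
{
  // one static field per T, initialized on Register<T>
  static class Slot<T>
  {
    internal static Func<T> Create;
  }
  public static void Register<T>(Func<T> create)
  {
    Slot<T>.Create = create;
  }
  public static void Register<TInterface, TRegistrant>()
      where TRegistrant : TInterface, new()
  {
    Slot<TInterface>.Create = () => new TRegistrant();
  }
  public static T Resolve<T>()
  {
    return Slot<T>.Create();
  }
}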

Still, it gets you far because the constructor delegate is entirely user-specified. You can actually build features like lifetime management on top of this core by supplying an appropriate delegate to Register<T>.

For instance, singleton dependencies would look like:

IFoo singleton = null;
Dependency.Register<IFoo>(
() => singleton ?? (singleton = new Foo()));

HTTP request-scoped instances would look something like:

Dependency.Register<IFoo>(
  () => HttpContext.Current.Items["IFoo"] as IFoo
     ?? (IFoo)(HttpContext.Current.Items["IFoo"] = new Foo()));

A thread-local singleton would look something like:

public static class Local
{
  [ThreadStatic]
  internal static IFoo instance;
}
...
Dependency.Register<IFoo>(
() => Local.instance ?? (Local.instance = new Foo()));

Instance resolution with sharing is something like:

public static class Instances
{
  static readonly Dictionary<Type, object> cache =
       new Dictionary<Type, object>();
  internal static Func<T> Memoize<T>(Func<T> create)
  {
    return () =>
    {
      object value;
      if (!cache.TryGetValue(typeof(T), out value))
        cache[typeof(T)] = value = create();
      return (T)value;
    };
  }
}
...
Dependency.Register<IFoo>(Instances.Memoize<IFoo>(() => new Foo()));

This container doesn't handle cleanup though, so the thread-local example depends on the client to properly dispose of the thread-local IFoo instance. Autofac claims to handle disposal of all disposable instances, so I'm reading up a little on how that's done.

This approach seems to handle most common scenarios, but there are no doubt some limitations. Still, it's a good introduction for those curious about IoC implementation.

Tuesday, May 22, 2012

Hash Array Mapped Trie for C# - Feature Complete

I finally got around to finishing the immutable HAMT implementation I wrote about in my last post. The only missing features were tree merging and hash collision handling. Both features are now implemented with unit tests, and the whole branch has been merged back into "default".

It now also conforms to Sasa's standard collection semantics, namely the publicly exported type is a struct, so null reference errors are impossible, and it provides an atomic swap operation for concurrent use. Here's the API:

/// <summary>
/// An immutable hash-array mapped trie.
/// </summary>
/// <typeparam name="K">The type of keys.</typeparam>
/// <typeparam name="T">The type of values.</typeparam>
public struct Tree<K, T> : IEnumerable<KeyValuePair<K, T>>,
                           IAtomic<Tree<K, T>>
{
    /// <summary>
    /// The empty tree.
    /// </summary>
    public static Tree<K, T> Empty { get; }
    /// <summary>
    /// The number of elements in the tree.
    /// </summary>
    public int Count { get; }
    /// <summary>
    /// Find the value for the given key.
    /// </summary>
    /// <param name="key">The key to lookup.</param>
    /// <returns>
    /// The value corresponding to <paramref name="key"/>.
    /// </returns>
    /// <exception cref="KeyNotFoundException">
    /// Thrown if the key is not found in this tree.
    /// </exception>
    public T this[K key] { get; }
    /// <summary>
    /// Add the given key-value pair to the tree.
    /// </summary>
    /// <param name="key">The key.</param>
    /// <param name="value">The value for the given key.</param>
    /// <returns>A tree containing the key-value pair.</returns>
    public Tree<K, T> Add(K key, T value);
    /// <summary>
    /// Remove the element with the given key.
    /// </summary>
    /// <param name="key">The key to remove.</param>
    /// <returns>A tree without the value corresponding to
    /// <paramref name="key"/>.</returns>
    public Tree<K, T> Remove(K key);
    /// <summary>
    /// Merge two trees.
    /// </summary>
    /// <param name="other">The tree to merge with this one.</param>
    /// <returns>
    /// A tree merging the entries from <paramref name="other"/>.
    /// </returns>
    public Tree<K, T> Merge(Tree<K, T> other);
    /// <summary>
    /// Atomically set the slot.
    /// </summary>
    /// <param name="slot">The slot to set.</param>
    /// <returns>True if set atomically, false otherwise.</returns>
    public bool Set(ref Tree<K, T> slot);
}
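
A hedged usage sketch against the API above (illustrative only; every operation returns a new tree and leaves the originals untouched):

var a = Tree<string, int>.Empty.Add("one", 1).Add("two", 2);
var b = Tree<string, int>.Empty.Add("three", 3);

var merged = a.Merge(b);          // contains all three entries
var two = merged["two"];          // throws KeyNotFoundException if absent
var without = merged.Remove("one");

Console.WriteLine(a.Count);       // 2: 'a' is unchanged
Console.WriteLine(merged.Count);  // 3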

Wednesday, April 4, 2012

Immutable Hash Array Mapped Trie in C#

I just completed an implementation of an immutable hash array mapped trie (HAMT) in C#. The HAMT is an ingenious hash tree first described by Phil Bagwell. It's used in many different domains because of its time and space efficiency, although only some languages use the immutable variant. For instance, Clojure uses immutable HAMTs to implement its persistent hash maps, which are central to its approach to concurrency.

The linked implementation is pretty much the bare minimum supporting add, remove and lookup operations, so if you're interested in learning more about it, it's a good starting point. Many thanks also to krukow's fine article which helped me quickly grasp the bit-twiddling needed for the HAMT. The tree interface is basically this:

/// <summary>
/// An immutable hash-array mapped trie.
/// </summary>
/// <typeparam name="K">The type of keys.</typeparam>
/// <typeparam name="T">The type of values.</typeparam>
public class Tree<K, T> : IEnumerable<KeyValuePair<K, T>>
{
    /// <summary>
    /// The number of elements in the tree.
    /// </summary>
    public virtual int Count { get; }

    /// <summary>
    /// Find the value for the given key.
    /// </summary>
    /// <param name="key">The key to lookup.</param>
    /// <returns>
    /// The value corresponding to <paramref name="key"/>.
    /// </returns>
    /// <exception cref="KeyNotFoundException">
    /// Thrown if the key is not found in this tree.
    /// </exception>
    public T this[K key] { get; }

    /// <summary>
    /// Add the given key-value pair to the tree.
    /// </summary>
    /// <param name="key">The key.</param>
    /// <param name="value">The value for the given key.</param>
    /// <returns>A tree containing the key-value pair.</returns>
    public Tree<K, T> Add(K key, T value);

    /// <summary>
    /// Remove the element with the given key.
    /// </summary>
    /// <param name="key">The key to remove.</param>
    /// <returns>
    /// A tree without the value corresponding to
    /// <paramref name="key"/>.
    /// </returns>
    public Tree<K, T> Remove(K key);
}

No benchmarks yet, it's still early stages. The implementation is based on a few functions from my Sasa class library, primarily some bit-twiddling functions from Sasa.Binary.
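
For context, here's a hedged sketch of the standard HAMT bit-twiddling (my own illustration, not the Sasa.Binary code): each node keeps a 32-bit bitmap of occupied slots, and the child for a 5-bit hash chunk lives at the popcount of the bits below its position.

// the next 5-bit chunk of the hash at the given level
static int ChunkOf(int hash, int level)
{
    return (hash >> (5 * level)) & 0x1F;
}
// index into the node's compact child array for that chunk
static int SlotOf(uint bitmap, int chunk)
{
    uint below = bitmap & ((1u << chunk) - 1);
    return BitCount(below);
}
// classic parallel popcount
static int BitCount(uint x)
{
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    return (int)((((x + (x >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24);
}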

The whole implementation is literally 200 lines of code, excluding comments. The only deficiency of the current implementation is that it doesn't properly handle hash collisions; a linear chain on collisions would be a simple extension. I also need an efficient tree merge operation. I was initially implementing Okasaki's Patricia trees because of their efficient merge, but HAMTs are just so much better in every other way. If anyone has any pointers to efficient merging for HAMTs, I'd be much obliged!

State of Sasa v0.9.4

Sasa itself is currently undergoing an aggressive reorganization in preparation for the 0.9.4 release. A lot of the optional abstractions are moving from the Sasa core into their own assemblies, since many of them are relatively stand-alone. It currently stands as follows, with dependencies listed between []:

Production Ready

  • Sasa [standalone]: tuples, option types that work with both types and structs, string extensions, IEnumerable extensions, thread and null-safe event extensions, type-safe enum extensions, lightweight type-safe wrappers for some system classes, eg. WeakReference and Delegate, extensions for code generation and debugging, and generic number extensions. The goal is to provide only essential extensions to address deficiencies in the system class libraries.
  • Sasa.Binary [standalone]: low-level bit twiddling functions, endian conversions, and portable BinaryReader and BinaryWriter.
  • Sasa.Collections [Sasa, Sasa.Binary]: efficient immutable collections library, including purely functional stacks, queues, lists, and trees. Tree needs some more testing obviously, since it's a rather new addition.
  • Sasa.Mime [standalone]: a simple library encapsulating standard media types and file extensions. It also provides an interface for extending these associations at runtime.
  • Sasa.Statistics [standalone]: a few basic numerical calculations, like standard deviation, and Pierce's criterion used to remove outliers from a data set.
  • Sasa.Net [Sasa, Sasa.Collections]: MIME mail message parsing to System.Net.Mail.MailMessage (most libraries provide unfamiliar, custom mail and attachment classes), a POP3 client, and RFC822 header parsing.
  • Sasa.Contracts [Sasa]: I've used the runtime preconditions subset of the standard .NET contracts for years. I haven't gotten around to adding postconditions and invariants support to ilrewriter.
  • ilrewriter.exe [Sasa]: the IL rewriter currently only erases Sasa.TypeConstraint<T> from your code, which allows you to specify type constraints that C# normally disallows, ie. T : Delegate, or T : Enum.

Beta

These abstractions work, but haven't seen the production use or stress testing the above classes have.

  • Sasa.TM [Sasa]: software transactional memory, super-fast thread-local data (much faster than ThreadLocal<T>!).
  • Sasa.Reactive [Sasa]: building on Rx.NET, this provides Property<T> which is a mutable, reactive cell with a getter/setter. Any changes automatically propagate to observers. NamedProperty<T> inherits from Property<T> and further implements INotifyPropertyChanged and INotifyPropertyChanging.
  • Sasa.Parsing [Sasa]: implements a simple, extensible Pratt parser. Grammars are generic and can be extended via standard inheritance. The test suite is extensive, although I've only used this in private projects, not production code.
  • Sasa.Linq [standalone]: base classes for LINQ expression visitors and query providers. Not too uncommon these days, but I've had them in Sasa for many years.

Currently Broken

These assemblies are undergoing some major refactoring, and are currently rather broken.

  • Sasa.Dynamics [Sasa, Sasa.Collections]: blazingly fast, type-safe runtime reflection. This code underwent significant refactoring, and I recently realized that the patterns being used here could be abstracted even further by providing multiple dispatch for .NET. See the multiple-dispatch branch of the repository for the current status of that work. This should be complete for Sasa v0.9.4.
  • Sasa.Serialization [Sasa, Sasa.Dynamics]: a compact, fast serializer based on Sasa.Dynamics. Waiting on the completion of Sasa.Dynamics.

Deprecated

These assemblies are now deprecated, either because they saw little use, were overly complex, or better alternatives now exist.

  • Sasa.Concurrency [Sasa]: primarily an overly complex implementation of futures based on Alice ML semantics. In a future release, Sasa.Concurrency will strip out futures, absorb Sasa.TM, and also provide a deterministic concurrency library based on concurrent revision control, which I believe to be inherently superior to STM.

Toy

These assemblies are not really meant for serious use, primarily because they don't fit with standard .NET idioms.

  • Sasa.FP [Sasa, Sasa.Collections]: some less useful functional idioms, like delegate currying and tupling, binomial trees, trivial immutable sets, and either types.
  • Sasa.Arrow [Sasa]: a convenient interface for arrows. Definitely not idiomatic .NET!
  • Sasa.Linq.Expressions [Sasa]: extensions to compose LINQ expressions. Also provides some typed expression trees, as opposed to the standard untyped ones. In theory it should work, and the code all type checks, but pretty much 0 testing at the moment.

Thursday, March 22, 2012

Simplest Authentication in Lift

Lift has been an interesting experience so far, particularly since I'm learning Scala at the same time. Lift comes with quite a few built-in mechanisms to handle various features, such as authentication, authorization and role-based access control.

A lot of the documentation utilizes these built-ins to good effect, but because the core mechanisms are completely skipped, you have no idea where to start if you have to roll your own authentication. A suggestion for Lift documentation: cover the basic introduction first, then show how Lift builds on that foundation.

I present here the simplest possible authentication scheme for Lift, inspired by this page on the liftweb wiki:

object isLoggedIn extends SessionVar[Boolean](false)
...

// in Boot.scala
LiftRules.loggedInTest = Full(() => isLoggedIn.get)

That last line only needs to return a boolean. If you wish to include this with Lift's white-listed menu system, you merely need to add this sort of test:

val auth = If(() => !Authentication.user.isEmpty,
              () => RedirectResponse("/index"))
val entries = 
    Menu(Loc("Login", "index" :: Nil, "Login", Hidden)) :: 
    Menu(Loc("Some Page", "some-page" :: Nil, "Some-Page", auth)) ::
    Nil
SiteMap(entries:_*)

Any request other than /index that is not authenticated, ie. isLoggedIn.get returns false, will redirect to /index for login.

One caveat: since the authenticated flag is session-level data, you are vulnerable to CSRF attacks unless you utilize Lift's built-in CSRF protection, where input names are assigned GUIDs. This is the default, but since it is easy to circumvent this to support simple query forms and the like, it's worth mentioning.

Monday, March 12, 2012

Debugging Lift 2.4 with Eclipse

To continue my last post, launching a Lift program and debugging from Eclipse turns out to be straightforward.

The starting point was this stackoverflow thread which pointed out the existence of the RunJettyRun Eclipse plugin, which can launch a Jetty instance from within Eclipse configured for remote debugging. Here are the steps to get launching and debugging working seamlessly:

  1. Install RunJettyRun from within Eclipse the usual way, ie. menu Help > Install New Software, then copy-paste this link.
  2. Once installed, go to menu Run > Debug Configurations, and double-click Jetty Webapp. This will create a new configuration for this project.
  3. Click Apply to save this configuration, and you can now start debugging to your heart's content.

NOTE: running Jetty in SBT via ~container:start puts the web app in the root of the web server, ie. http://localhost:8080/, but this plugin defaults to http://localhost:8080/project_name. You can change this via the "Context" parameter in the debug configuration. This defaults to the project name, presumably so you can run/debug multiple web apps simultaneously.

Getting Started with Scala Web Development - Lift 2.4 and SBT 0.11.2

Anyone following this blog knows I do most of my development in C#, but I recently had an opportunity for a Java project, so I'm taking the Scala plunge with Lift. It's been a bit of a frustrating experience so far since all of the documentation on any Java web framework assumes prior experience with Java servlets or other Java web frameworks.

Just a note, I'm not intending to bash anything, but I will point out some serious flaws in the tools or documentation that I encountered which seriously soured me on these tools. This is not a reason to get defensive, but it's an opportunity to see how accessible these tools are to someone with little bias or background in this environment.

In this whole endeavour, it was most frustrating trying to find a coherent explanation of the directory structures used by various build tools, JSPs and WARs. I finally managed to find a good intro for Lift that didn't assume I already knew how files are organized, and that started me on the right path. I soon after found a concise review of servlet directory structures written for the minimalist web4j framework which filled in the rest.

Of the IDEs I tried, IntelliJ IDEA and Eclipse both support Maven Lift project templates out of the box. However, Maven is not itself a dependency that they resolve and install for you. Combined with the fact that Maven is no longer the recommended method of building Lift programs, I saw little choice but to give that up as a bad job [1]. There is no working SBT support within these IDEs that I could see. IDEA had an SBT plugin, but it never worked for me.

So I set about playing with the recommended Simple Build Tool (SBT), but of course, the version shipped with Lift 2.4 is way out of date (0.7.7 IIRC, while the latest is 0.11.2). As a result, the terminology has changed somewhat, as the article I linked above mentions. I saw no reason to learn something already outdated, so my first project was to update the lift_blank build to SBT 0.11.2. This started the 13 page journey through SBT's getting started guide, much of which you have to understand to grok lift_blank's trivial build.

Fortunately, SBT is pretty cool. It can automatically download and register dependencies using Apache Ivy, and launch your web programs in Jetty for testing purposes. However, the intro and the default setup could definitely use a little work (at least for Lift documentation). For instance, to even make SBT usable in Lift scenarios, you need the xsbt-web-plugin, which itself requires you to define a plugins.sbt file in the appropriate location. What plugins are and where exactly plugins.sbt should go is on page 11 of the SBT getting started guide, and the need for that web plugin was buried in another page somewhere I accidentally found via Google after much searching on "not found" errors for "seq(webSettings :_*)".

The Final Solution

I present here the final toolset and configuration I settled on, combined with the updated and simplified build script for lift_blank which enabled me to finally work with code in a semi-usable environment:

  1. Download Eclipse Helios, and install it as you prefer. Helios is one version behind, but it's needed for Scala-IDE.
  2. Install the Scala-IDE Eclipse plugin.
  3. Install Scala 2.9.1. I'm not convinced this step is necessary, but it's what I did.
  4. Download Lift 2.4.
  5. Unpack and go into lift-lift_24_sbt-f911f30\scala_29\lift_blank.
  6. Delete project\build directory, and sbt-launcher.jar, sbt, sbt.bat. The only remaining sbt-specific file is project\build.properties.
  7. Open project\build.properties in a text editor, and replace "sbt.version=0.7.7" with "sbt.version=0.11.2", and "def.scala.version=2.7.7" with "def.scala.version=2.9.1".
  8. Download sbt-launch.jar from this page and place it in the lift_blank root directory.
  9. Create an sbt.bat in the lift_blank root directory file with the following contents:
    set SCRIPT_DIR=%~dp0
    java -Xmx512M -jar "%SCRIPT_DIR%sbt-launch.jar" %*
    This simply runs the sbt-launch.jar you just downloaded.
  10. Run sbt.bat. You will get an error, but this will create the ~/.sbt directory where global SBT settings are stored.
  11. Go to your root user directory, ie. in Windows 7, this is the start menu > [Your User Name]. You should see a .sbt folder there. Enter it and create a folder called "plugins".
  12. Under the above plugins folder, create a file called "plugins.sbt" with the following contents:
    addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.0.0")
    
    libraryDependencies <+= sbtVersion(v => "com.github.siasia" %% "xsbt-web-plugin" % (v+"-0.2.11"))
    This simply adds Eclipse and web app support to SBT. You could also put this plugins folder under lift_blank\project, but I just made it a global setting so I never have to fiddle with it again.
  13. Return to folder lift_blank, and create a file called "build.sbt" with the following contents:
    name := "lift_blank"
    
    version := "1.0"
    
    scalaVersion := "2.9.1"
    
    seq(webSettings :_*)
    
    libraryDependencies ++= Seq(
        "net.liftweb" %% "lift-webkit" % "2.4" % "compile->default",
        "org.mortbay.jetty" % "jetty" % "6.1.26" % "container,test",
        "junit" % "junit" % "4.7" % "test",
        "ch.qos.logback" % "logback-classic" % "0.9.26",
        "org.scala-tools.testing" %% "specs" % "1.6.9" % "test")
    This is the simplified and updated build file for lift.
  14. Now execute sbt.bat and you should get an SBT command prompt. Type "compile" and watch SBT download all the dependencies listed above in build.sbt, and compile the source code.
  15. Now type "eclipse" at SBT's command prompt, and this will generate Eclipse project files.
  16. Launch Eclipse, and go to File > Import, select General > Existing Projects into Workspace, then click Next. Navigate to and select your lift_blank folder and click Finish.
  17. Go to menu Project > Properties, and select the Builders option on the left hand side.
  18. Click New, select Program, click OK. This will bring up an "Edit Configuration" dialog.
  19. In the "Name" field, type "SBT Builder".
  20. Under the "Location" section, click Browse Workspace, and select sbt.bat.
  21. Under the "Working Directory" section, click Browse Workspace, and select lift_blank, ie. your project name.
  22. Under the "Arguments" section, type in compile, and click OK.

You now have a working, up to date SBT build and a valid Eclipse project. Happy hacking!

You can start Jetty by executing sbt ~container:start.

Suggestions

There is no way it should be this convoluted to get a working build and a working IDE. This solution isn't even complete, as I'm still working on getting the tests to run and Jetty to launch, and I hold out little hope for debugging. It also seems like any change to the build, like adding a new dependency, requires regenerating the Eclipse files and restarting Eclipse. Any suggestions here would be much appreciated! An SBT plugin for Eclipse would be most welcome. :-)

Some suggestions for Lift documentation: a complete intro to directory structures would only cost a few paragraphs, and would make your project accessible even to beginners. A short review of the build tool would also be helpful.

Suggestions for SBT:

  1. Remove 90% of the documentation in the getting started guide. In particular, getting started generally doesn't involve recursive builds, contexts, key dependencies, scopes and how it all works at the Scala level, ie. immutable sequences, etc. I think the getting started guide should be focused on directory layout, settings, simple library dependencies, common commands, and batch/interactive mode, all using the simplified .sbt build file. This shouldn't take more than 2 pages, and anything else should be in an advanced guide.
  2. I'm still learning Scala, but I still stumble over libraryDependencies. I often get confused over exactly which column version numbers go in. It would help tremendously if there were some named parameter syntax that would make this obvious, ie. something like:
    dependency(
      package: "net.liftweb",
      jar: "lift-webkit",
      version: "2.4",
      into: "compile->default")

[1] As an aside, I'd just like to mention that IntelliJ was particularly annoying because I couldn't seem to open any file that wasn't already part of a project. Why can't I just open any old text file? Drag and drop certainly didn't work, and adding a folder to a project that already exists in the project directory also seemed impossible. I haven't used Eclipse as much yet, but I will no doubt find similar irritations. One instant irritation is the lack of installer for Windows, though I can appreciate how that might be a low priority for Eclipse devs.

Sunday, March 4, 2012

Oh C#, why must you make life so difficult?

Ran into a problem with C#'s implicit conversions, which don't seem to support generic types:

class Foo<T>
{
    public T Value { get; set; }
    public static implicit operator Foo<T>(T value)
    {
        return new Foo<T> { Value = value };
    }
}
static class Program
{
    static void Main(string[] args)
    {
        // this is fine:
        Foo<IEnumerable<int>> x = new int[0];

        // this is not fine:
        Foo<IEnumerable<int>> y = Enumerable.Empty<int>();
        //Error 2: Cannot implicitly convert type 'IEnumerable<int>'
        //to 'Foo<IEnumerable<int>>'. An explicit conversion
        //exists (are you missing a cast?)
    }
}

So basically, you can't implicitly convert nested generic types, but implicit array conversions work just fine.
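
One hedged workaround is to skip the implicit operator at those call sites and expose a small factory method, which works regardless of the source expression's type:

class Foo<T>
{
    public T Value { get; set; }
    // explicit factory: no implicit conversion rules involved
    public static Foo<T> Of(T value) { return new Foo<T> { Value = value }; }
}
...
Foo<IEnumerable<int>> y = Foo<IEnumerable<int>>.Of(Enumerable.Empty<int>());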

Tuesday, February 21, 2012

Reusable Ad-Hoc Extensions for .NET

I posted awhile ago about a pattern for ad-hoc extensions in .NET using generics. Unfortunately, like every "design pattern", you had to manually ensure that your abstraction properly implements the pattern. There was no way to have the compiler enforce it, like conforming to an interface.

It's common wisdom that "design patterns" are simply a crutch for languages with insufficient abstractive power. Fortunately, .NET's multicast delegates provide the abstractive power we need to eliminate the design pattern for ad-hoc extensions:

/// <summary>
/// Dispatch cases to handlers.
/// </summary>
/// <typeparam name="T">The type of the handler.</typeparam>
public static class Pattern<T>
{
    static Dispatcher<T> dispatch;
    static Action<T, object> any;

    delegate void Dispatcher<T>(T func, object value,
                                  Type type, ref bool found);

    /// <summary>
    /// Register a case handler.
    /// </summary>
    /// <typeparam name="T0">The argument type.</typeparam>
    /// <param name="match">Expression dispatching to handler.</param>
    public static void Case<T0>(Expression<Action<T, T0>> match)
    {
        var call = match.Body as MethodCallExpression;
        var handler = Delegate.CreateDelegate(typeof(Action<T, T0>),
                                              null, call.Method)
                   as Action<T, T0>;
        dispatch += (T x, object o, Type type, ref bool found) =>
        {
            // if type matches exactly, then dispatch to handler
            if (typeof(T0) == type)
            {
                found = true;   
                handler(x, (T0)o);
            }
        };
    }
    /// <summary>
    /// Catch-all case.
    /// </summary>
    /// <param name="match">Expression dispatching to handler.</param>
    public static void Any(Expression<Action<T, object>> match)
    {
        var call = match.Body as MethodCallExpression;
        var handler = Delegate.CreateDelegate(typeof(Action<T,object>),
                                              null, call.Method)
                   as Action<T, object>;
        any += handler;
    }

    /// <summary>
    /// Dispatch to a handler for <typeparamref name="T0"/>.
    /// </summary>
    /// <typeparam name="T0">The value type.</typeparam>
    /// <param name="value">The value to dispatch.</param>
    /// <param name="func">The dispatcher.</param>
    public static void Match<T0>(T0 value, T func)
    {
        bool found = false;
        dispatch(func, value, value.GetType(), ref found);
        if (!found)
        {
            if (any == null) throw new KeyNotFoundException(
                                       "Unknown type.");
            else any(func, value);
        }
    }
}

The abstraction would be used like this:

interface IFoo
{
    void Bar(int i);
    void Foo(char c);
    void Any(object o);
}
class xFoo : IFoo
{
    public void Bar(int i)
    {
        Console.WriteLine("Int: {0}", i);
    }
    public void Foo(char c)
    {
        Console.WriteLine("Char: {0}", c);
    }
    public void Any(object o)
    {
        Console.WriteLine("Any: {0}", o);
    }
}
static void Main(string[] args)
{
    Pattern<IFoo>.Case<int>((x, i) => x.Bar(i));
    Pattern<IFoo>.Case<char>((x, i) => x.Foo(i));

    Pattern<IFoo>.Match(9, new xFoo());
    Pattern<IFoo>.Match('v', new xFoo());
    try
    {
        Pattern<IFoo>.Match(3.4, new xFoo());
    }
    catch (KeyNotFoundException)
    {
        Console.WriteLine("Not found.");
    }
    Pattern<IFoo>.Any((x, o) => x.Any(o));
    Pattern<IFoo>.Match(3.4, new xFoo());
    // prints:
    // Int: 9
    // Char: v
    // Not found.
    // Any: 3.4
}

Unlike the previous pattern for ad-hoc extensions, dispatching is always precise in that it dispatches to the handler for the value's dynamic type. The previous solution dispatched only on the static type. This can also be a downside, but you could easily extend the Match method to test on subtypes as well.

The other downside of this solution is that it's not quite as fast since all the type tests are run on each dispatch, where the previous solution cached the specific delegate in a static generic field. This caching can be added to the above class as well. Then, you can have the best of both worlds if you happen to know that the static type is the same as the dynamic type.
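
Here's a hedged sketch of that caching idea (illustrative, not part of the class above): when you know the static type matches the dynamic type, the handler can live in a static generic field, so dispatch is a field read plus a delegate invocation.

public static class FastPattern<T>
{
    // one cached handler per static argument type
    static class Cache<T0>
    {
        internal static Action<T, T0> Handler;
    }
    public static void Case<T0>(Action<T, T0> handler)
    {
        Cache<T0>.Handler = handler;
    }
    public static void Match<T0>(T0 value, T func)
    {
        var h = Cache<T0>.Handler;
        if (h == null) throw new KeyNotFoundException("Unknown type.");
        h(func, value);
    }
}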

Saturday, February 4, 2012

Why Sealed Classes Should Be Allowed In Type Constraints

One of my older posts on Stackoverflow listed some of what I consider to be flaws of C# and/or the .NET runtime. A recent reply to my post posed a good question about one of those flaws, which was that sealed classes should be allowed as type constraints. That seems like a sensible restriction for C# at first, but there are legitimate programs that it disallows.

I figured others would have run into this problem at some point, but a quick Google search didn't turn up much, so I will document the actual problem with this rule. Consider the following interface:

interface IFoo<T>
{
    void Bar<U>(U bar) where U : T;
}

The important part to notice here is the type constraint on the method, U : T. This means whatever T we specify for IFoo<T>, we should be able to list as a type constraint on the method Bar. Of course, if T is a sealed class, we cannot do this:

class Foo : IFoo<string>
{
    public void Bar<U>(U bar)
      where U : string //ERROR: string is sealed!
    {
    }
}

In this case, there's a workaround by allowing the compiler to infer the constraint by making the method private and visible only when coerced as an interface:

class Foo : IFoo<string>
{
    void IFoo<string>.Bar<U>(U bar)
    {
    }
}

But this means that you cannot call Bar on a Foo, you need to first cast it to an IFoo<string>, a completely unnecessary step.
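
For example, the call site ends up looking like this:

var foo = new Foo();
// foo.Bar("hello");              // does not compile: Bar is only visible via the interface
((IFoo<string>)foo).Bar("hello"); // works, after an otherwise unnecessary cast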

In principle, every type constraint that the compiler can infer implicitly, we should be able to specify explicitly. This is clearly not the case here, and there is no reason for it. It's purely an aesthetic restriction, not a correctness restriction, that the C# compiler devs took extra effort to implement.

And that is why we should allow sealed classes as type constraints.

Thursday, February 2, 2012

Diff for IEnumerable<T>

I've just added a simple diff algorithm under Sasa.Linq. The signature is as follows:

/// <summary>
/// Compute the set of differences between two sequences.
/// </summary>
/// <typeparam name="T">The type of sequence items.</typeparam>
/// <param name="original">The original sequence.</param>
/// <param name="updated">The updated sequence to compare to.</param>
/// <returns>
/// The smallest sequence of changes to transform
/// <paramref name="original"/> into <paramref name="updated"/>.
/// </returns>
public static IEnumerable<Change<T>> Difference<T>(
    this IEnumerable<T> original,
    IEnumerable<T> updated);
/// <summary>
/// Compute the set of differences between two sequences.
/// </summary>
/// <typeparam name="T">The type of sequence items.</typeparam>
/// <param name="original">The original sequence.</param>
/// <param name="updated">The updated sequence to compare to.</param>
/// <param name="eq">The equality comparer to use.</param>
/// <returns>The smallest sequence of changes to transform
/// <paramref name="original"/> into <paramref name="updated"/>.
/// </returns>
public static IEnumerable<Change<T>> Difference<T>(
    this IEnumerable<T> original,
    IEnumerable<T> updated,
    IEqualityComparer<T> eq);

The extension methods depend only on the following enum and struct:

/// <summary>
/// Describes the type of change that was made.
/// </summary>
public enum ChangeType
{
    /// <summary>
    /// An item was added at the given position.
    /// </summary>
    Add,
    /// <summary>
    /// An item was removed at the given position.
    /// </summary>
    Remove,
}
/// <summary>
/// Describes a change to a collection.
/// </summary>
/// <typeparam name="T">The collection item type.</typeparam>
public struct Change<T>
{
    /// <summary>
    /// The change made at the given position. 
    /// </summary> 
    public ChangeType ChangeType { get; internal set; } 
    /// <summary> 
    /// The set of values added or removed from the given position. 
    /// </summary> 
    public IEnumerable<T> Values { get; internal set; } 
    /// <summary> 
    /// The position in the sequence where the change took place. 
    /// </summary> 
    public int Position { get; internal set; }
} 

This is a simple and general interface with which you can perform all sorts of computations on the differences between two sequences. The code as provided will work out of the box for any type T that implements equality. Some simple examples:

Console.WriteLine( "miller".Difference("myers").Format("\r\n") );
// prints out: 
// +1:y 
// -1:i,l,l 
// +6:s 

var original = new int[] { 2, 5, 99 }; 
var updated = new int[] { 2, 4, 4, 8 }; 
Console.WriteLine( original.Difference(updated).Format("\r\n") ); 
// prints out: 
// +1:4,4,8 
// -1:5,99 

"Format" is a simple extension method also under Sasa.Linq with generates a formatted string when given an IEnumerable.

At the moment, I simply implemented the naive algorithm that takes N*M space and time. I plan to eventually implement some linear space optimizations, as described in An O(ND) Difference Algorithm and Its Variations.

There are many applications for a general difference algorithm like this. Consider a reactive property of type IEnumerable, such as the list of items backing a drop-down in a user interface. If the UI is remote, as you find in X11 or a web browser, sending the entire list over and over again is bandwidth-intensive, and trashes the latency of the UI. It's much more efficient to just send the changes, which can be accomplished by taking the diff of the original and the new list.
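
As a hedged sketch of consuming the change list (based on the example output above, which suggests positions index into the original sequence, additions are inserted before that position, and removals skip items starting there), the updated sequence can be rebuilt from the original like so:

static IEnumerable<T> Apply<T>(IList<T> original, IEnumerable<Change<T>> changes)
{
    var adds = changes.Where(c => c.ChangeType == ChangeType.Add)
                      .ToLookup(c => c.Position);
    var removed = new HashSet<int>(
        changes.Where(c => c.ChangeType == ChangeType.Remove)
               .SelectMany(c => Enumerable.Range(c.Position, c.Values.Count())));
    for (var i = 0; i <= original.Count; ++i)
    {
        foreach (var add in adds[i])          // additions anchored at position i
            foreach (var v in add.Values)
                yield return v;
        if (i < original.Count && !removed.Contains(i))
            yield return original[i];         // keep items that weren't removed
    }
}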

Sunday, January 8, 2012

Clavis - A Web Security Microframework

Web programming has been a pretty big deal for over 10 years now, but in some ways the tools web developers use haven't really progressed that much, particularly when it comes to security. For instance, CSRF and clickjacking are instances of the Confused Deputy problem, a security problem known since at least 1988.

Since these are both instances of the same underlying problem, in principle they should have the same or very similar solutions. However, the current solutions to these pervasive vulnerabilities are designed to solve only the specific problem at hand, and not the general problem of Confused Deputies. This means that if another HTML or Flash enhancement comes along that introduces another Confused Deputy, these solutions will not necessarily prevent exploitation of that vulnerability.

However, if we solve the underlying Confused Deputy problem, then that solution will address all present and future Confused Deputies, assuming any new features don't violate the constraints the solution specifies. Fortunately, the solution to the Confused Deputy has been known since it was first recognized: capabilities.

Capabilities

Capabilities are in fact a very simple and general framework for representing and reasoning about the flow of authority in a program. You've used capabilities many times without even knowing it. For instance, if you've ever received an activation e-mail with a long, unguessable URL, or if you've ever received a link to a Google Docs document.

There was no login requirement to load these links, no account information you had to input or password you had to provide. These types of links just bring you to the resource you want to access and whatever loads in your browser lets you know what you can do. Sometimes you can only read that resource, and sometimes there are some fields via which you can update the resource you're viewing, or possibly update other resources. These types of URLs are capabilities.

The Confused Deputy problem inherently relies on references to resources being forgeable, like a file path, and the permissions for that resource being separate from the reference itself. The latter criterion is called "ambient authority" in capability parlance.

Capabilities solve the Confused Deputy because references are inherently unforgeable and include the permissions on that object. So if you have a capability, you have necessary and sufficient permission to operate on that object, period, full stop. I won't go into capabilities and Confused Deputies any further since there are plenty of online references describing them, but if there's enough interest I will write something up for a future post.

The quintessential capability web framework is Waterken, and its capabilities are called "web-keys".

Capabilities are not a panacea though. In the past, I found the Waterken programming model somewhat unintuitive. In my experience, most programming proceeds by designing a model that reflects the problem domain, which you then compose with models that cover other, often orthogonal, problem domains.

Example problem domains are data persistence, web request handling, and the usual models a programmer might have to write himself, like an order processing system. Most of these models are orthogonal, so each problem a developer needs to tackle can often be tackled independently, with a little glue to compose them into a final program.

Part of the problem with Waterken-style capabilities is that the data model of a program is more tightly coupled to the authorities as viewed by the user. These often take the form of capability patterns like "facets", which are used for attenuation, revocation and delegation of authority.

However, in traditional programming, security is most often modelled separately like every other concern, and evolves independently. This orthogonal thinking doesn't work with capabilities as they are currently provided. In Waterken, you have to compose the capability model and your data model in your head before you can even write any code, and this is why it's a little more difficult than necessary in my opinion. Combined with the fact that Waterken already has a built-in notion of persistence, it's not exactly a drop-in solution.

However, it's undeniable that capabilities provide a natural framework for preventing Confused Deputies, so a good web framework will probably look a bit like capabilities.

Continuations

In my opinion, the biggest advance on the server-side was the development of continuations to model web interactions. If used correctly, they provide a natural framework for specifying and reasoning about the state needed by a request, and how that request can flow into other requests.

Every web program is unequivocally continuation-based, simply due to the stateless nature of HTTP and the various browser features, eg. back buttons. However, there have been many extensions to HTTP to try and mimic statefulness. Most of these extensions are still expressible as abstractions on top of continuations, although when these abstractions are used to carry credentials of various types, as is often done with cookies, a web program is susceptible to Confused Deputies [1].

Continuations are particularly advantageous in a statically typed language. A web framework in a typed language can define continuation abstractions specifying the static types of the parameters the continuation needs to satisfy a request. The framework itself can then guarantee that any requests for that resource have precisely the right types, and even the right names. This eliminates quite a bit of duplicate sanity checking code.

A good web framework will make the continuations inherent to web development somewhat obvious to maximize simplicity and flexibility.

Summary of Problems

Fundamentally, the problems with typical web frameworks are simple:

  1. Pervasive statefulness, ie. sessions and cookies.
  2. No standard, reliable way to protect parameters from tampering.
  3. Authorization context is ambient, leading to confused deputies like CSRF and clickjacking.
  4. No way to know at a glance which parameters require sanity checking, and which can be relied on.

Clavis

Taking the above lessons to heart, I present here a simplistic web security microframework. It trivially integrates with ASP.NET, but does not inherently depend on the page model. System.Web.Page already naturally satisfies Clavis' requirements though, so you can get right to developing your program.

What Clavis provides, addressing some of the failings of typical approaches:

  1. A session-free, ambient authority-free state model via continuations.
  2. A standard parameter protection semantics that prevents tampering, ie. parameters are unforgeable by default.
  3. Authorization context is explicit as a parameter. Combined with unforgeability, this automatically prevents Confused Deputies like CSRF and clickjacking.
  4. A parameter's forgeability is specified in the type and enforced by the compiler, so the need for sanity checking is obvious at a glance.

State Model - Continuations

The core concept is that of a set of interfaces defining continuations with various parameters required to load the page:

public interface IContinuation : ...
{
}
public interface IContinuation<T> : ...
{
}
public interface IContinuation<T0, T1> : ...
{
}
public interface IContinuation<T0, T1, T2> : ...
{
}
public interface IContinuation<T0, T1, T2, T3> : ...
{
}
...

A page class you define will then simply declare that it implements a certain continuation interface, and specify the types of the parameters the page requires to load:

public class Foo : System.Web.Page, IContinuation<User, SalesOrder>
{
  ...
}

The requirements on IContinuation are already satisfied by System.Web.Page, so there is nothing to implement, and a page can implement multiple continuation types. Implementing the IContinuation interface also gets you access to extension methods that allow you to name and parse parameters:

public class Foo : System.Web.Page, IContinuation<User, SalesOrder>
{
  User user;
  SalesOrder order;
  protected void Page_Load(object sender, EventArgs e)
  {
    // parse the first continuation argument into a User
    if (!this.TryParse0(out user, userId => LoadUser(userId)))
      throw new ArgumentException("Invalid request!");

    // parse the second continuation argument into a SalesOrder
    if (!this.TryParse1(out order, orderId => LoadOrder(orderId)))
      throw new ArgumentException("Invalid request!");
  }
}

You can generate URLs from a continuation based on the page's fully qualified type name:

var link = Continuation.ToUrl<Foo, User, SalesOrder>(
                        user.AsParam(), order.AsParam());

If you just want to redirect to a continuation:

Continuation.Display<Foo, User, SalesOrder>(
             user.AsParam(), order.AsParam());

For simplicity, Clavis requires that the fully qualified names of the continuations map to the URL path. So a continuation class Company.Warehouse.Foo will map to ~/Company/Warehouse/Foo/. If the continuation is a System.Web.Page, you can simply place it under that directory and name it Default.aspx, and IIS will load it.
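
For example, a page declared like this (the Company.Warehouse namespace is just for illustration) would be served from ~/Company/Warehouse/Foo/Default.aspx:

namespace Company.Warehouse
{
  // the fully qualified name Company.Warehouse.Foo maps to the
  // URL path ~/Company/Warehouse/Foo/
  public class Foo : System.Web.Page, IContinuation<User, SalesOrder>
  {
    ...
  }
}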

You can also customize the name used for a parameter if you really want to, although you must then keep the name used when generating the URL and the name used when parsing the parameter in sync.

Unforgeable Parameters

Each of a continuation's arguments is unforgeable, meaning that a URL constructed from a program is guaranteed to be free from tampering. A URL constructed for Foo above might look something like this:

http://host.com/Foo/?!User=123&!SalesOrder=456&clavis=6dc8ca2b

Every query parameter prefixed with ! is guaranteed to be tamper proof, which means they can be used as unforgeable references. The unforgeability is guaranteed by the "clavis" query parameter, which is an HMAC of the URL path and the parameters prefixed by !.
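
As a rough sketch of the idea -- not Clavis' actual implementation -- the MAC could be computed along these lines; the canonical string layout, the choice of HMAC-SHA256, and the truncation to match the short values shown above are all assumptions:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static class MacSketch
{
  // MAC the URL path together with the !-prefixed parameters, so changing
  // any protected parameter invalidates the "clavis" value
  public static string Mac(byte[] key, string path,
                           IEnumerable<KeyValuePair<string, string>> protectedParams)
  {
    var canonical = path + "|" + string.Join("|",
        protectedParams.Select(p => "!" + p.Key + "=" + p.Value).ToArray());
    using (var hmac = new HMACSHA256(key))
    {
      var hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(canonical));
      // truncated purely to mirror the short values in the example URLs
      return BitConverter.ToString(hash, 0, 4).Replace("-", "").ToLowerInvariant();
    }
  }
}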

By default, continuation arguments generate tamper-proof query parameters, with no further work on the developer's part. The developer simply needs to provide the functions to convert a type to and from a string that designates that type, ie. a UserId, a SalesOrderId, etc.

The query parameters are named according to the type of object that parameter is supposed to represent, as specified in the IContinuation type definition. You should avoid using "int" or "string" since those aren't very meaningful. Foo from above generated "User" and "SalesOrder" parameters, but if I were instead to write a class like this:

public class Meaningless : System.Web.Page, IContinuation<int, string>
{
  ...
}

you'd end up with a URL that isn't very meaningful:

http://host.com/Meaningless/?!int=123&!string=456&clavis=0a1a15d0

So instead of specifying that your continuation accepts integers or strings that represent an object in your data model, just specify the data model objects in your continuation as I did with Foo above.

Generating a URL for Foo from above is probably a little more complicated than I showed before, since you will often need to specify a string conversion:

var link = Continuation.ToUrl<Foo, User, SalesOrder>(
                        user.AsParam(u => u.UserId),
                        order.AsParam(o => o.OrderId));

Loading the parameter on the receiving end depends on the position in the continuation argument list. The first parameter is loaded via TryParse0, the second via TryParse1, and so on.

In general, the AsParam() extension methods are used to generate URL parameters with string conversions, and the TryParseX() extension methods are used to load data model objects from the continuation arguments via conversion from string. You could also add your own extension methods for AsParam() so you don't have to specify the string conversion every time. It's not quite so easy for TryParseX, since you need to provide an overload for each possible argument index.
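
For example, the following sketch shows such an overload; the return type Param<User> is a hypothetical name standing in for whatever AsParam actually returns, so adjust it to the real Clavis type:

public static class UserParams
{
  // Param<User> is a stand-in for AsParam's real return type; the point
  // is simply that the UserId string conversion is written in one place
  public static Param<User> AsParam(this User user)
  {
    return user.AsParam(u => u.UserId);
  }
}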

Explicit Authorization

Since all state is intended to be passed via continuation arguments, this includes any user identification or other authorization information, like a UserId:

http://host.com/Foo/?!User=123&!SalesOrder=456&clavis=6dc8ca2b

Because the User parameter is unforgeable, this means the explicit authorization information needed to operate on SalesOrder 456 is guaranteed to be reliable. No attacker will be able to generate the correct "clavis" parameter to forge such a request.

With plain continuations and unforgeable references discussed so far, all URLs are straightforward capabilities. This means if someone ever accidentally finds a legitimate URL, they can access the content at that URL.

Consider being on the above page for Foo when it contains an external link. If you click that link, the server receiving the request will see the above URL in its entirety via the Referer header, as will any HTTP caches along the way if the connection isn't encrypted. It's a hideous security leak. There are plenty of techniques to mitigate this problem, but it's a general symptom of capabilities. For instance, capability URLs are also susceptible to shoulder-surfing attacks.

Clavis addresses this by providing a mechanism for tying its intrinsic capability URLs to "sessions". A session is basically just like an authentication cookie you would use on a typical website, except it need not carry any meaningful credentials. It's merely another random token that is included in the MAC so the resulting URL is unique and tied to the lifetime of that token, and all such capability "leaks" are automatically plugged so long as the token itself is not also leaked.

The URL generated for a session-limited continuation will look like:

http://host.com/Foo/?!User=123&!SalesOrder=456&clavis-auth=faee8ce2

Notice that the "clavis" parameter is now called "clavis-auth", so Clavis knows that this request should include a session token in the MAC.

Clavis is agnostic about how this token is generated and stored on the server; technically, you don't even have to store it at all. The token can be anything, and you can even just use ASP.NET's FormsAuthentication cookies, or the ASP.NET session id, for the token. This configuration happens on application load:

// initialize Clavis with the MAC key and a function returning the
// current session token (here, the ASP.NET session id)
Continuation.Init(Convert.FromBase64String(clavisKey), () =>
{
    var session = HttpContext.Current.Session;
    return session == null ? "" : session.SessionID;
});

"clavisKey" in the above is a 64-bit random byte array used as the private key for generating the MAC. Storing it as a parameter in the web.config file would do for most purposes. If you ever change this key, all previous URLs MAC'd with that key will break, but sometimes that's what you need. For instance, if this key is ever disclosed, you want to change it immediately to eliminate the threat.

Sanity Checked Parameters

By default, continuation arguments are unforgeable, so they don't require any sanity checking. This is unnecessarily restrictive however, since there are many HTTP patterns where we want the client to provide or modify some URL parameters. Ideally, we could easily identify such parameters so we can tell at a glance which ones need sanity checking.

In Clavis, this is achieved via Data<T>. A continuation whose argument type is wrapped in Data<T> will generate a forgeable query parameter, not an unforgeable one:

public class Foo2 : System.Web.Page,
                    IContinuation<User, Data<SalesOrder>>
{
  ...
}

This indicates that the second argument of type SalesOrder is pure data, and not unforgeable, so it requires strict sanity checking before being used. Foo2 will generate a URL that looks like:

http://host.com/Foo2/?!User=123&SalesOrder=456&clavis=8def944e

The SalesOrder parameter is no longer prefixed by !, and so it is not included in the MAC that prevents tampering. This is mostly useful in GET submissions like search forms, and allows a developer to see at a glance which arguments require sanity checking.
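
For instance, a search-style page might look roughly like the following. I'm assuming the TryParseX methods work the same way for Data-wrapped arguments, and IsVisibleTo is a hypothetical access check, so treat this as a sketch rather than verbatim Clavis usage:

public class Foo2 : System.Web.Page,
                    IContinuation<User, Data<SalesOrder>>
{
  User user;
  SalesOrder order;
  protected void Page_Load(object sender, EventArgs e)
  {
    // the User argument is MAC-protected, so it can be trusted as-is
    if (!this.TryParse0(out user, userId => LoadUser(userId)))
      throw new ArgumentException("Invalid request!");

    // the SalesOrder argument is forgeable (Data<T>), so sanity check it:
    // verify the order exists and that this user is allowed to see it
    if (!this.TryParse1(out order, orderId => LoadOrder(orderId))
        || order == null || !order.IsVisibleTo(user))
      throw new ArgumentException("Invalid request!");
  }
}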

The Data<T> type itself is a fairly straightforward wrapper:

public struct Data<T> : IRef<T>, IPureData, IEquatable<Data<T>>
{
  /// <summary>
  /// The encapsulated value.
  /// </summary>
  public T Value { get; set; }
  ...
}

Using Clavis

You can obtain the Clavis binaries here. The Clavis API docs are here.

Simply link the Clavis and Sasa assemblies into your project, and put "using Clavis;" at the top of every file that's implementing or using continuations. I'm slowly migrating an old ASP.NET app to Clavis, demonstrating that it can be used incrementally in existing programs.

A few simple rules when using Clavis will maintain strong security at this layer of the stack:

  1. All state should preferably be transmitted via continuation parameters, ie. no use of the session or cookies. If you're careful, you can store credentials in the session token used by Clavis and remain safe, but only do this if you understand the serious dangers.
  2. Forgeable parameters should be used with extreme care, ie. Data<T>. In particular, don't use them to carry the authorization context or you are vulnerable to Confused Deputies, ie. a userId used to make access decisions should never be specified via Data<User>. Sanitize these parameters thoroughly.
  3. Use frameworks like Linq2Sql to ensure you're free of code injection vulnerabilities.

If you never use cookies or session state, your program will also automatically be RESTful. You still need to exercise care when sending content to clients to prevent XSS, but this happens at another layer that Clavis can't really help with.

Future Work

Clavis is a fairly simplistic approach to solving the issues I raised in this post, but this simplicity makes it reliable and easy to integrate with the existing ASP.NET stack. There are a few small limitations to Clavis as currently devised, some of which will be addressed shortly:

  1. The way type names are mapped to URLs might be changed to exploit ASP.NET's new and more flexible URL routing framework.
  2. There are only continuation types for up to 4 parameters. I intend to at least double that.
  3. Clavis currently depends minimally on the last stable release of Sasa. I will either eliminate this dependency, or update it when the new Sasa version is released shortly.

I welcome any comments or suggestions.

Footnotes

[1] Credentials in cookies are bundled separately from the URL to the resource being operated on, and URLs are typically forgeable -- classic Confused Deputy.