<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2744072865491516720</id><updated>2012-02-02T21:10:52.233-05:00</updated><category term='expression problem'/><category term='LINQ'/><category term='reflection'/><category term='CLR'/><category term='Sasa'/><category term='STM'/><category term='C'/><category term='security'/><category term='EDSL'/><category term='mobile code'/><category term='benchmarks'/><category term='concurrency'/><category term='libraries'/><category term='type theory'/><category term='object oriented programming'/><category term='C#'/><category term='reactive programming'/><category term='pattern matching'/><category term='low-level programming'/><category term='tagless interpreters'/><category term='Ruby'/><category term='software'/><category term='Clavis'/><category term='functional programming'/><category term='NHibernate'/><category term='relational programming'/><category term='logic puzzles'/><category term='web programming'/><category term='virtual machines'/><category term='code generation'/><category term='Utilities'/><title type='text'>Higher Logics</title><subtitle type='html'>Where programming meets science.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>65</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-4426367783481304796</id><published>2012-02-02T21:10:00.000-05:00</published><updated>2012-02-02T21:10:52.245-05:00</updated><title type='text'>Diff for IEnumerable&lt;T&gt;</title><content type='html'>&lt;p&gt;I've just added a &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/c7cb8efa469e/Sasa/Linq/EnumerableDifference.cs"&gt;simple diff algorithm under Sasa.Linq&lt;/a&gt;. The signature is as follows:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Compute the set of differences between two sequences.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of sequence items.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;/// &amp;lt;param name="original"&amp;gt;The original sequence.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="updated"&amp;gt;The updated sequence to compare to.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;returns&amp;gt;&lt;br /&gt;/// The smallest sequence of changes to transform&lt;br /&gt;/// &amp;lt;paramref name="original"/&amp;gt; into &amp;lt;paramref name="updated"/&amp;gt;.&lt;br /&gt;/// &amp;lt;/returns&amp;gt;&lt;br /&gt;public static IEnumerable&amp;lt;Change&amp;lt;T&amp;gt;&amp;gt; Difference&amp;lt;T&amp;gt;(&lt;br /&gt;  this IEnumerable&amp;lt;T&amp;gt; original,&lt;br /&gt;  IEnumerable&amp;lt;T&amp;gt;      updated);&lt;br /&gt;/// 
&amp;lt;summary&amp;gt;&lt;br /&gt;/// Compute the set of differences between two sequences.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of sequence items.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;/// &amp;lt;param name="original"&amp;gt;The original sequence.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="updated"&amp;gt;The updated sequence to compare to.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="eq"&amp;gt;The equality comparer to use.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;returns&amp;gt;The smallest sequence of changes to transform&lt;br /&gt;/// &amp;lt;paramref name="original"/&amp;gt; into &amp;lt;paramref name="updated"/&amp;gt;.&lt;br /&gt;/// &amp;lt;/returns&amp;gt;&lt;br /&gt;public static IEnumerable&amp;lt;Change&amp;lt;T&amp;gt;&amp;gt; Difference&amp;lt;T&amp;gt;(&lt;br /&gt;  this IEnumerable&amp;lt;T&amp;gt;  original,&lt;br /&gt;  IEnumerable&amp;lt;T&amp;gt;       updated,&lt;br /&gt;  IEqualityComparer&amp;lt;T&amp;gt; eq);&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;The extension methods depend only on the following enum and struct:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Describes the type of change that was made.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;public enum ChangeType&lt;br /&gt;{&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// An item was added at the given position.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    Add,&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// An item was removed at the given position.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    Remove,&lt;br /&gt;}&lt;br /&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Describes a change to a collection.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The collection item type.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public struct Change&amp;lt;T&amp;gt;&lt;br 
/&gt;{&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// The change made at the given position.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public ChangeType ChangeType { get; internal set; }&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// The set of values added or removed from the given position.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public IEnumerable&amp;lt;T&amp;gt; Values { get; internal set; }&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// The position in the sequence where the change took place.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public int Position { get; internal set; }&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;This is a simple and general interface with which you can perform all sorts of computations on the differences between two sequences. The code as provided will work out of the box for any type T that implements equality. Some simple examples:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;Console.WriteLine( "miller".Difference("myers").Format("\r\n") );&lt;br /&gt;// prints out:&lt;br /&gt;// +1:y&lt;br /&gt;// -1:i,l,l&lt;br /&gt;// +6:s&lt;br /&gt;&lt;br /&gt;var original = new int[] { 2, 5, 99 };&lt;br /&gt;var updated = new int[] { 2, 4, 4, 8 };&lt;br /&gt;Console.WriteLine( original.Difference(updated).Format("\r\n") );&lt;br /&gt;// prints out:&lt;br /&gt;// +1:4,4,8&lt;br /&gt;// -1:5,99&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;"Format" is a simple extension method, also under Sasa.Linq, which generates a formatted string when given an IEnumerable.&lt;/p&gt;&lt;p&gt;At the moment, I've simply implemented the naive algorithm, which takes N*M space and time. I plan to eventually implement some linear space optimizations, as described in &lt;a href="http://www.xmailserver.org/diff2.pdf"&gt;An O(ND) Difference Algorithm and Its Variations&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;There are many applications for a general difference algorithm like this. 
Consider a reactive property of type IEnumerable&amp;lt;T&amp;gt;, such as one backing a drop-down in a user interface. If the UI is remote, as with X11 or a web browser, sending the entire list over and over again is bandwidth-intensive, and trashes the latency of the UI. It's much more efficient to just send the changes, which can be accomplished by taking the diff of the original and the new list.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-4426367783481304796?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/4426367783481304796/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=4426367783481304796' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4426367783481304796'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4426367783481304796'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2012/02/diff-for-ienumerable.html' title='Diff for IEnumerable&amp;lt;T&amp;gt;'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-5626928517844207399</id><published>2012-01-08T09:32:00.000-05:00</published><updated>2012-01-09T21:22:45.553-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Clavis'/><category 
scheme='http://www.blogger.com/atom/ns#' term='web programming'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>Clavis - A Web Security Microframework</title><content type='html'>&lt;p&gt;Web programming has been a pretty big deal for over 10 years now, but in some ways the tools web developers use haven't really progressed that much, particularly when it comes to security. For instance, &lt;a href="http://en.wikipedia.org/wiki/Cross-site_request_forgery"&gt;CSRF&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Clickjacking"&gt;clickjacking&lt;/a&gt; are instances of the &lt;a href="http://en.wikipedia.org/wiki/Confused_deputy_problem"&gt;Confused Deputy&lt;/a&gt; problem, a security problem known since at least 1988.&lt;/p&gt;&lt;p&gt;Since these are both instances of the same underlying problem, in principle they should have the same or very similar solutions. However, the current solutions to these pervasive vulnerabilities are designed to solve only the specific problem at hand, and not the general problem of Confused Deputies. This means that if another HTML or Flash enhancement comes along that introduces another Confused Deputy, these solutions will not necessarily prevent exploitation of that vulnerability.&lt;/p&gt;&lt;p&gt;However, if we solve the underlying Confused Deputy problem, then that solution will address &lt;em&gt;all present and future Confused Deputies&lt;/em&gt;, assuming any new features don't violate the constraints the solution specifies. 
Fortunately, the solution to the Confused Deputy has been known since it was first recognized: &lt;a href="http://en.wikipedia.org/wiki/Object-capability_model"&gt;capabilities&lt;/a&gt;.&lt;/p&gt;&lt;h1&gt;Capabilities&lt;/h1&gt;&lt;p&gt;Capabilities are in fact a very simple and general framework for representing and reasoning about the flow of authority in a program. You've used capabilities many times without even knowing it. For instance, you've used one if you've ever received an activation e-mail with a long, unguessable URL, or a link to a Google Docs document.&lt;/p&gt;&lt;p&gt;There was no login requirement to load these links, no account information you had to input or password you had to provide. These types of links just bring you to the resource you want to access and whatever loads in your browser lets you know what you can do. Sometimes you can only read that resource, and sometimes there are some fields via which you can update the resource you're viewing, or possibly update other resources. These types of URLs are capabilities.&lt;/p&gt;&lt;p&gt;The Confused Deputy problem inherently relies on references to resources being forgeable, like a file path, and the permissions for that resource being separate from the reference itself. The latter criterion is called "ambient authority" in capability parlance.&lt;/p&gt;&lt;p&gt;Capabilities solve the Confused Deputy because capability references are inherently unforgeable and include the permissions on the object they designate. So if you have a capability, you have necessary and sufficient permission to operate on that object, period, full stop. 
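&lt;/p&gt;&lt;p&gt;As a rough sketch of the idea (every name here is hypothetical and not taken from Waterken or any real library), a capability can be modelled as an unguessable token that both designates a resource and carries the permission on it:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;using System;&lt;br /&gt;using System.Collections.Generic;&lt;br /&gt;using System.Security.Cryptography;&lt;br /&gt;&lt;br /&gt;// Hypothetical illustration; not part of Waterken or any real library.&lt;br /&gt;public sealed class Capability&lt;br /&gt;{&lt;br /&gt;  public string Resource; // what the token designates&lt;br /&gt;  public bool CanWrite;   // the permission travels with the reference&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public sealed class CapabilityTable&lt;br /&gt;{&lt;br /&gt;  readonly Dictionary&amp;lt;string, Capability&amp;gt; caps =&lt;br /&gt;    new Dictionary&amp;lt;string, Capability&amp;gt;();&lt;br /&gt;&lt;br /&gt;  // Mint an unguessable 128-bit token bound to a resource and a permission.&lt;br /&gt;  public string Grant(string resource, bool canWrite)&lt;br /&gt;  {&lt;br /&gt;    var bytes = new byte[16];&lt;br /&gt;    using (var rng = RandomNumberGenerator.Create())&lt;br /&gt;      rng.GetBytes(bytes);&lt;br /&gt;    var token = BitConverter.ToString(bytes).Replace("-", "");&lt;br /&gt;    caps[token] = new Capability { Resource = resource, CanWrite = canWrite };&lt;br /&gt;    return token;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  // Holding the token is necessary and sufficient: no user identity is consulted.&lt;br /&gt;  // Returns null for an unknown (e.g. guessed) token.&lt;br /&gt;  public Capability Lookup(string token)&lt;br /&gt;  {&lt;br /&gt;    Capability cap;&lt;br /&gt;    return caps.TryGetValue(token, out cap) ? cap : null;&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;In this sketch a guessed token simply fails the lookup, and the permission is read from the capability itself rather than from who is asking. 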
I won't go into capabilities and Confused Deputies any further since there are plenty of online references describing them, but if there's enough interest I will write something up for a future post.&lt;/p&gt;&lt;p&gt;The quintessential capability web framework is &lt;a href="http://waterken.sourceforge.net/"&gt;Waterken&lt;/a&gt;, and its capabilities are called "web-keys".&lt;/p&gt;&lt;p&gt;Capabilities are not a panacea though. In the past, I found the Waterken programming model somewhat unintuitive. In my experience, most programming proceeds by designing a model that reflects the problem domain, which you then compose with models that cover other, often orthogonal, problem domains.&lt;/p&gt;&lt;p&gt;Example problem domains are data persistence, web request handling, and the usual models a programmer might have to write himself, like an order processing system. Most of these models are orthogonal, so each problem a developer needs to tackle can often be tackled independently, with a little glue to compose them into a final program.&lt;/p&gt;&lt;p&gt;Part of the problem with Waterken-style capabilities is that the data model of a program is more tightly coupled to the authorities as viewed by the user. These often take the form of capability patterns like &lt;a href="http://wiki.erights.org/wiki/Walnut/Secure_Distributed_Computing/Capability_Patterns#Facets"&gt;"facets"&lt;/a&gt;, which are used for attenuation, revocation and delegation of authority.&lt;/p&gt;&lt;p&gt;However, in traditional programming, security is most often modelled separately like every other concern, and evolves independently. This orthogonal thinking doesn't work with capabilities as they are currently provided. In Waterken, you have to compose the capability model and your data model &lt;em&gt;in your head&lt;/em&gt; before you can even write any code, and this is why it's a little more difficult than necessary in my opinion. 
Combine that with the fact that Waterken already has a built-in notion of persistence, and it's not exactly a drop-in solution.&lt;/p&gt;&lt;p&gt;However, it's undeniable that capabilities provide a natural framework for preventing Confused Deputies, so a good web framework will probably look a bit like capabilities.&lt;/p&gt;&lt;h1&gt;Continuations&lt;/h1&gt;&lt;p&gt;In my opinion, the biggest advance on the server side was the development of &lt;a href="http://en.wikipedia.org/wiki/Continuation#In_Web_development"&gt;continuations to model web interactions&lt;/a&gt;. If used correctly, they provide a natural framework for specifying and reasoning about the state needed by a request, and how that request can flow into other requests.&lt;/p&gt;&lt;p&gt;Every web program is unequivocally continuation-based, simply due to the stateless nature of HTTP and the various browser features, eg. back buttons. However, there have been many extensions to HTTP to try and mimic statefulness. Most of these extensions are still expressible as abstractions on top of continuations, although when these abstractions are used to carry credentials of various types, as is often done with cookies, a web program is susceptible to Confused Deputies [1].&lt;/p&gt;&lt;p&gt;Continuations are particularly advantageous in a statically typed language. A web framework in a typed language can define continuation abstractions specifying the static types of the parameters the continuation needs to satisfy a request. The framework itself can then guarantee that any requests for that resource have precisely the right types, and even the right names. 
This eliminates quite a bit of duplicate sanity checking code.&lt;/p&gt;&lt;p&gt;A good web framework will make the continuations inherent to web development somewhat obvious to maximize simplicity and flexibility.&lt;/p&gt;&lt;h1&gt;Summary of Problems&lt;/h1&gt;&lt;p&gt;Fundamentally, the problems with typical web frameworks are simple:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Pervasive statefulness, ie. sessions and cookies.&lt;/li&gt;&lt;li&gt;No standard, reliable way to protect parameters from tampering.&lt;/li&gt;&lt;li&gt;Authorization context is ambient, leading to confused deputies like CSRF and clickjacking.&lt;/li&gt;&lt;li&gt;No way to know at a glance which parameters require sanity checking, and which can be relied on.&lt;/li&gt;&lt;/ol&gt;&lt;h1&gt;Clavis&lt;/h1&gt;&lt;p&gt;Taking the above lessons to heart, I present here a simplistic web security microframework. It trivially integrates with ASP.NET, but does not inherently depend on the page model. System.Web.Page already naturally satisfies Clavis' requirements though, so you can get right to developing your program.&lt;/p&gt;&lt;p&gt;What Clavis provides, addressing some of the failings of typical approaches:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;A session-free, ambient authority-free state model via continuations.&lt;/li&gt;&lt;li&gt;A standard parameter protection semantics that prevents tampering, ie. parameters are unforgeable by default.&lt;/li&gt;&lt;li&gt;Authorization context is explicit as a parameter. 
Combined with unforgeability, this automatically prevents Confused Deputies like CSRF and clickjacking.&lt;/li&gt;&lt;li&gt;A parameter's forgeability is specified in the type and enforced by the compiler, so the need for sanity checking is obvious at a glance.&lt;/li&gt;&lt;/ol&gt;&lt;h2&gt;State Model - Continuations&lt;/h2&gt;&lt;p&gt;The core concept is that of a set of interfaces defining continuations with various parameters required to load the page:&lt;pre class="brush:csharp"&gt;public interface IContinuation : ...&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;public interface IContinuation&amp;lt;T&amp;gt; : ...&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;public interface IContinuation&amp;lt;T0, T1&amp;gt; : ...&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;public interface IContinuation&amp;lt;T0, T1, T2&amp;gt; : ...&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;public interface IContinuation&amp;lt;T0, T1, T2, T3&amp;gt; : ...&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;...&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;A page class you define will then simply declare that it implements a certain continuation interface, and specify the types of the parameters the page requires to load:&lt;pre class="brush:csharp"&gt;public class Foo : System.Web.Page, IContinuation&amp;lt;User, SalesOrder&amp;gt;&lt;br /&gt;{&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;The requirements on IContinuation are already satisfied by System.Web.Page, so there is nothing to implement. 
A page can also implement multiple continuation types. Although there is nothing to implement, the IContinuation interface gets you access to extension methods that allow you to name and parse parameters:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public class Foo : System.Web.Page, IContinuation&amp;lt;User, SalesOrder&amp;gt;&lt;br /&gt;{&lt;br /&gt;  User user;&lt;br /&gt;  SalesOrder order;&lt;br /&gt;  protected void Page_Load(object sender, EventArgs e)&lt;br /&gt;  {&lt;br /&gt;    if (!this.TryParse0(out user, userId =&amp;gt; LoadUser(userId)))&lt;br /&gt;      throw new ArgumentException("Invalid request!");&lt;br /&gt;&lt;br /&gt;    if (!this.TryParse1(out order, orderId =&amp;gt; LoadOrder(orderId)))&lt;br /&gt;      throw new ArgumentException("Invalid request!");&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;You can generate URLs from a continuation based on the page's fully qualified type name:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;var link = Continuation.ToUrl&amp;lt;Foo, User, SalesOrder&amp;gt;(&lt;br /&gt;                        user.AsParam(), order.AsParam());&lt;/pre&gt;&lt;p&gt;If you just want to redirect to a continuation:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;Continuation.Display&amp;lt;Foo, User, SalesOrder&amp;gt;(&lt;br /&gt;             user.AsParam(), order.AsParam());&lt;/pre&gt;&lt;p&gt;For simplicity, Clavis requires that the fully qualified names of the continuations map to the URL path. So a continuation class Company.Warehouse.Foo will map to ~/Company/Warehouse/Foo/. 
If the continuation is a System.Web.Page, you can simply place it under that directory and name it Default.aspx, and IIS will load it.&lt;/p&gt;&lt;p&gt;You can also customize the name used for a parameter if you really want to, although you must then keep the name used when generating the URL and the name used when parsing the parameter in sync.&lt;/p&gt;&lt;h2&gt;Unforgeable Parameters&lt;/h2&gt;&lt;p&gt;Each of a continuation's arguments is unforgeable, meaning that a URL constructed by the program is guaranteed to be free from tampering. A URL constructed for Foo above might look something like this:&lt;/p&gt;&lt;pre&gt;http://host.com/Foo/?!User=123&amp;!SalesOrder=456&amp;clavis=6dc8ca2b&lt;/pre&gt;&lt;p&gt;Every query parameter prefixed with ! is guaranteed to be tamper-proof, which means it can be used as an unforgeable reference. The unforgeability is guaranteed by the "clavis" query parameter, which is an HMAC of the URL path and the parameters prefixed by !.&lt;/p&gt;&lt;p&gt;By default, continuation arguments generate tamper-proof query parameters, with no further work on the developer's part. The developer simply needs to provide the functions to convert a type to and from a string that designates that type, ie. a UserId, a SalesOrderId, etc.&lt;/p&gt;&lt;p&gt;The query parameters are named according to the type of object that parameter is supposed to represent as specified in the IContinuation type definition. You should avoid using "int" or "string" since those aren't very meaningful. 
Foo from above generated "User" and "SalesOrder" parameters, but if you were instead to write a class like this:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public class Meaningless : System.Web.Page, IContinuation&amp;lt;int, string&amp;gt;&lt;br /&gt;{&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;you'd end up with a URL that isn't very meaningful:&lt;/p&gt;&lt;pre&gt;http://host.com/Meaningless/?!int=123&amp;!string=456&amp;clavis=0a1a15d0&lt;/pre&gt;&lt;p&gt;So instead of specifying that your continuation accepts integers or strings that represent an object in your data model, just specify the data model objects in your continuation as I did with Foo above.&lt;/p&gt;&lt;p&gt;Generating a URL for Foo from above is probably a little more complicated than I showed before, since you will often need to specify a string conversion:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;var link = Continuation.ToUrl&amp;lt;Foo, User, SalesOrder&amp;gt;(&lt;br /&gt;                        user.AsParam(u =&amp;gt; u.UserId),&lt;br /&gt;                        order.AsParam(o =&amp;gt; o.OrderId));&lt;/pre&gt;&lt;p&gt;Loading the parameter on the receiving end depends on the position in the continuation argument list. The first parameter is loaded via TryParse0, the second via TryParse1, and so on.&lt;/p&gt;&lt;p&gt;In general, the AsParam() extension methods are used to generate URL parameters with string conversions, and the TryParseX() extension methods are used to load data model objects from the continuation arguments via conversion from string. You could also add your own extension methods for AsParam() so you don't have to specify the string conversion every time. 
It's not quite so easy for TryParseX since you need to provide an overload for each possible argument index.&lt;/p&gt;&lt;h2&gt;Explicit Authorization&lt;/h2&gt;&lt;p&gt;Since all state is intended to be passed via continuation arguments, this includes any user identification or other authorization information, like a UserId:&lt;/p&gt;&lt;pre&gt;http://host.com/Foo/?!User=123&amp;!SalesOrder=456&amp;clavis=6dc8ca2b&lt;/pre&gt;&lt;p&gt;Because the User parameter is unforgeable, this means the explicit authorization information needed to operate on SalesOrder 456 is guaranteed to be reliable. No attacker will be able to generate the correct "clavis" parameter to forge such a request.&lt;/p&gt;&lt;p&gt;With plain continuations and unforgeable references discussed so far, all URLs are straightforward capabilities. This means if someone ever accidentally finds a legitimate URL, they can access the content at that URL.&lt;/p&gt;&lt;p&gt;Consider being on the above page for Foo, and it contains an external link. If you click that link, the server receiving the request will get the above URL in its entirety, as do any HTTP caches along the way if the connection isn't encrypted. It's a hideous security leak. There are plenty of techniques to mitigate this problem, but it's a general symptom of capabilities. For instance, capability URLs are also susceptible to shoulder-surfing attacks.&lt;/p&gt;&lt;p&gt;Clavis addresses this by providing a mechanism for tying its intrinsic capability URLs to "sessions". A session is basically just like an authentication cookie you would use on a typical website, except it need not carry any meaningful credentials. 
It's merely another random token that is included in the MAC so the resulting URL is unique and tied to the lifetime of that token, and all such capability "leaks" are automatically plugged so long as the token is not also leaked.&lt;/p&gt;&lt;p&gt;The URL generated for a session-limited continuation will look like:&lt;/p&gt;&lt;pre&gt;http://host.com/Foo/?!User=123&amp;!SalesOrder=456&amp;clavis-auth=faee8ce2&lt;/pre&gt;&lt;p&gt;Notice that the "clavis" parameter is now called "clavis-auth", so Clavis knows that this request should include a session token in the MAC.&lt;/p&gt;&lt;p&gt;Clavis is agnostic about how this token is generated and stored on the server. You don't even have to store it at all, technically. The token can be anything, and you can even just use ASP.NET's FormsAuthentication cookies, or the ASP.NET session id for the token. This configuration happens on application load:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;Continuation.Init(Convert.FromBase64String(clavisKey), () =&amp;gt;&lt;br /&gt;{&lt;br /&gt;    var x = HttpContext.Current.Session;&lt;br /&gt;    return x == null ? "" : x.SessionID;&lt;br /&gt;});&lt;/pre&gt;&lt;p&gt;"clavisKey" in the above is the Base64 encoding of a 64-bit random byte array used as the private key for generating the MAC. Storing it as a parameter in the web.config file would do for most purposes. If you ever change this key, all previous URLs MAC'd with that key will break, but sometimes that's what you need. For instance, if this key is ever disclosed, you want to change it immediately to eliminate the threat.&lt;/p&gt;&lt;h2&gt;Sanity Checked Parameters&lt;/h2&gt;&lt;p&gt;By default, continuation arguments are unforgeable, so they don't require any sanity checking. This is unnecessarily restrictive, however, since there are many HTTP patterns where we want the &lt;em&gt;client&lt;/em&gt; to provide or modify some URL parameters. 
Ideally, we could easily identify such parameters so we can tell at a glance which ones need sanity checking.&lt;/p&gt;&lt;p&gt;In Clavis, this is achieved via Data&amp;lt;T&amp;gt;. A continuation whose argument type is wrapped in Data&amp;lt;T&amp;gt; will generate a &lt;em&gt;forgeable&lt;/em&gt; query parameter, not an unforgeable one:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public class Foo2 : System.Web.Page,&lt;br /&gt;                    IContinuation&amp;lt;User, Data&amp;lt;SalesOrder&amp;gt;&amp;gt;&lt;br /&gt;{&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;This indicates that the second argument of type SalesOrder is pure data, and not unforgeable, and so it requires strict sanity checking before being used. Foo2 will generate a URL that looks like:&lt;/p&gt;&lt;pre&gt;http://host.com/Foo2/?!User=123&amp;SalesOrder=456&amp;clavis=8def944e&lt;/pre&gt;&lt;p&gt;The SalesOrder parameter is no longer prefixed by !, and so it is not included in the MAC that prevents tampering. This is mostly useful in GET submissions like search forms, and allows a developer to see at a glance which arguments require sanity checking.&lt;/p&gt;&lt;p&gt;The Data&amp;lt;T&amp;gt; type itself is a fairly straightforward wrapper:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public struct Data&amp;lt;T&amp;gt; : IRef&amp;lt;T&amp;gt;, IPureData, IEquatable&amp;lt;Data&amp;lt;T&amp;gt;&amp;gt;&lt;br /&gt;{&lt;br /&gt;  /// &amp;lt;summary&amp;gt;&lt;br /&gt;  /// The encapsulated value.&lt;br /&gt;  /// &amp;lt;/summary&amp;gt;&lt;br /&gt;  public T Value { get; set; }&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;h1&gt;Using Clavis&lt;/h1&gt;&lt;p&gt;You can obtain the &lt;a href="http://higherlogics.net/software/clavis/"&gt;Clavis binaries here&lt;/a&gt;. 
The Clavis &lt;a href="http://higherlogics.net/software/clavis/doc/"&gt;API docs are here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Simply link the Clavis and Sasa assemblies into your project, and put "using Clavis;" at the top of every file that's implementing or using continuations. I'm slowly migrating an old ASP.NET app to Clavis, demonstrating that it can be used incrementally in existing programs.&lt;/p&gt;&lt;p&gt;A few simple rules when using Clavis will maintain strong security at this layer of the stack:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;All state should preferably be transmitted via continuation parameters, ie. no use of the session or cookies. If you're careful, you can store credentials in the session token used by Clavis and remain safe, but only do this if you understand the serious dangers.&lt;/li&gt;&lt;li&gt;Forgeable parameters should be used with extreme care, ie. Data&amp;lt;T&amp;gt;. In particular, don't use them to carry the authorization context or you are vulnerable to Confused Deputies, ie. a userId used to make access decisions should never be specified via Data&amp;lt;User&amp;gt;. Sanitize these parameters thoroughly.&lt;/li&gt;&lt;li&gt;Use frameworks like Linq2Sql to ensure you're free of code injection vulnerabilities.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;If you never use cookies or session state, your program will also automatically be RESTful. You still need to exercise care when sending content to clients to prevent XSS, but this happens at another layer that Clavis can't really help with.&lt;/p&gt;&lt;h1&gt;Future Work&lt;/h1&gt;&lt;p&gt;Clavis is a fairly simplistic approach to solving the issues I raised in this post, but this simplicity makes it reliable and easy to integrate with the existing ASP.NET stack. 
There are a few small limitations to Clavis as currently devised, some of which will be addressed shortly:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The way type names are mapped to URLs might be changed to exploit ASP.NET's new and more flexible URL routing framework.&lt;/li&gt;&lt;li&gt;There are only continuation types for up to 4 parameters. I intend to at least double that.&lt;/li&gt;&lt;li&gt;Clavis currently depends minimally on the last stable release of &lt;a href="https://sourceforge.net/projects/sasa/"&gt;Sasa&lt;/a&gt;. I will either eliminate this dependency, or update it when the new Sasa version is released shortly.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;I welcome any comments or suggestions.&lt;/p&gt;&lt;h2&gt;Footnotes&lt;/h2&gt;&lt;p&gt;[1] Credentials in cookies are bundled separately from the URL to the resource being operated on, and URLs are typically forgeable -- classic Confused Deputy.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-5626928517844207399?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/5626928517844207399/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=5626928517844207399' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5626928517844207399'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5626928517844207399'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2012/01/clavis-web-security-microframework.html' title='Clavis - A Web Security Microframework'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8337334537840042974</id><published>2011-12-23T17:18:00.000-05:00</published><updated>2011-12-23T14:19:53.725-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='STM'/><category scheme='http://www.blogger.com/atom/ns#' term='reactive programming'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><title type='text'>Software Transactional Memory in Pure C#</title><content type='html'>Concurrent programming is a very difficult problem to tackle. The fundamental issue is that manual locking is not &lt;em&gt;composable&lt;/em&gt;, which is to say that if you have two concurrent programs P0 and P1 free of deadlocks, livelocks and other concurrency hazards, and you try to compose P0 and P1 to create a program P2, P2 may not be free of concurrency hazards. For instance, if P0 and P1 take two locks in different orders, then P2 will deadlock. Needless to say, this is a serious problem because composition is the cornerstone of all programming.&lt;p&gt;I've been toying with some ideas for &lt;a href="http://en.wikipedia.org/wiki/Software_transactional_memory"&gt;software transactional memory (STM)&lt;/a&gt; in C# ever since I started playing with &lt;a href="http://en.wikipedia.org/wiki/Functional_reactive_programming"&gt;FRP&lt;/a&gt; and reactive programming in general. 
The problem in all of these domains is fundamentally about how to handle concurrent updates to shared state, and how to reconcile multiple, possibly conflicting updates to said state.&lt;/p&gt;&lt;p&gt;Rx.NET handles concurrency essentially by removing the identity inherent to shared state. An IObservable&amp;lt;T&amp;gt; is actually a collection of &lt;em&gt;all&lt;/em&gt; values pushed to that observable in some undefined order. If you were to create an IObservable that retains only the "last" pushed value, and thus now retains an identity, you then have the same problems as above, namely that this update must always be consistent with other updates at any given instant in time. For instance:&lt;/p&gt;&lt;pre class="brush: csharp"&gt;var plusOne = intObservable.Select(i =&amp;gt; i+1);&lt;/pre&gt;&lt;p&gt;At every instant in the program's execution, &lt;code&gt;plusOne&lt;/code&gt; should always observably equal &lt;code&gt;intObservable + 1&lt;/code&gt;, and the ability to observe a violation of this constraint is known in reactive literature as a 'glitch'.&lt;/p&gt;&lt;p&gt;Similarly, in database programming where transactions rule, this is known as a 'dirty read'. Essentially, an update to &lt;code&gt;intObservable&lt;/code&gt; is executing in a transaction, but other transactions are able to view those changes before that transaction has committed.&lt;/p&gt;&lt;p&gt;Generally speaking, glitches and dirty reads are undesirable, because they require the developer to manually synchronize state, which defeats the whole purpose of going with FRP or transactions to begin with. From what I've seen so far, Rx.NET gets around this by not providing abstractions that expose identity in this way. 
The programs you write must work with collections of values, and the program must specify the ordering via Observable.OrderBy.&lt;/p&gt;&lt;p&gt;When I added the &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/9ac61e30fa69/Sasa.Reactive/Property.cs"&gt;Property&amp;lt;T&amp;gt;&lt;/a&gt; IObservable to Sasa, I added &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/6ceb4b81b5b7/Sasa.Reactive"&gt;a limited form of transactions&lt;/a&gt; to prevent glitches, because a property has identity. This implementation uses a global 'clock', which is really just a global uint64 counter to properly sequence updates and prevent glitches.&lt;/p&gt;&lt;h1&gt;Overview&lt;/h1&gt;&lt;p&gt;I'm going to focus here on implementing STM directly, but to keep it simple, I've gone with the simplest STM that is expressible using .NET primitives. The resulting STM probably won't scale well, but it does a good job of ensuring &lt;em&gt;concurrency safety&lt;/em&gt; for arbitrary composition.&lt;/p&gt;&lt;p&gt;The STM I committed &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/9ac61e30fa69/Sasa.TM"&gt;to Sasa&lt;/a&gt; is a very simple, perhaps even simplistic, STM employing encounter-time locking with deadlock detection on transactional variables. Any read or write acquires the lock on a transactional variable. Whenever two transactions would block to wait for each other, the transaction that is not already blocked is aborted and retried.&lt;/p&gt;&lt;p&gt;This design has advantages and disadvantages. The main disadvantage is limited concurrency, even when reads and writes would not conflict. Two transactions that only read a transactional variable Y would still block each other, despite the fact that concurrent reads can't cause problems. Furthermore, the use of encounter-time locking means that locks can be held for a long time. 
Finally, the naive deadlock detection combined with encounter-time locking means that some programs will have higher abort rates than they would in other STMs.&lt;/p&gt;&lt;p&gt;There are significant advantages to this approach though. For one, a transaction doesn't require elaborate read/write/undo logs. In fact, this STM requires only a single allocation for the transaction object itself at transaction start. By contrast, most other STM designs require at least one allocation for every object that is read or written. These allocation costs are generally amortized, but they still add up.&lt;/p&gt;&lt;p&gt;The STM is also conceptually simple at 450 lines of code, including elaborate comments (127 lines counting only semicolons). This STM consists of only 3 classes and 1 exception, and uses only System.Threading.Monitor for locking. This means that the STM isn't really fair, but it's rather simple to replace standard locks with &lt;a href="http://higherlogics.blogspot.com/2011/10/most-general-concurrent-api.html"&gt;a fair locking protocol&lt;/a&gt; once the core STM algorithm is understood.&lt;/p&gt;&lt;p&gt;There is also preliminary support for integration with System.Transactions.&lt;/p&gt;&lt;h1&gt;Transactional Programming&lt;/h1&gt;&lt;p&gt;Any sort of transactional programming requires a transaction:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public sealed class MemoryTransaction : IEnlistmentNotification,&lt;br /&gt;                                        IDisposable&lt;br /&gt;{&lt;br /&gt;  public static MemoryTransaction Begin();&lt;br /&gt;  public static void Run(Action body);&lt;br /&gt;  public static MemoryTransaction Current { get; }&lt;br /&gt;  public void Complete();&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;This class is closely modeled on the design of &lt;a href="http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope.aspx"&gt;TransactionScope&lt;/a&gt; from System.Transactions. 
Programs will generally concern themselves mostly with transactional variables, which in Sasa.TM are called Transacted&amp;lt;T&amp;gt;:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public class Transacted&amp;lt;T&amp;gt; : Participant, IRef&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  public T Value { get; set; }&lt;br /&gt;  public void Write(T value, MemoryTransaction transaction);&lt;br /&gt;  public T Read(MemoryTransaction transaction);&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;Any reads and writes to Transacted&amp;lt;T&amp;gt; occur within the lifetime of a MemoryTransaction, and the set of all such reads and writes is committed atomically. A simple program demonstrating the use of these abstractions:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;Transacted&amp;lt;int&amp;gt; accountBalance = new Transacted&amp;lt;int&amp;gt;();&lt;br /&gt;MemoryTransaction.Run(() =&amp;gt;&lt;br /&gt;{&lt;br /&gt;  accountBalance.Value += 100;&lt;br /&gt;});&lt;br /&gt;&lt;/pre&gt;&lt;p&gt;&lt;a href="http://higherlogics.net/sasa/docs-head/Contents/4/438.html"&gt;MemoryTransaction.Run&lt;/a&gt; will handle all the commits, rollbacks and retries for you. You can do this manually as well if you catch &lt;a href="http://higherlogics.net/sasa/docs-head/Contents/4/426.html"&gt;RetryException&lt;/a&gt;, and call the Complete and Dispose methods on the transaction yourself, but for most purposes the Run method suffices. You can nest calls to Run as many times as you like, but only one top-level transaction will ever be created.&lt;/p&gt;&lt;p&gt;No matter how many concurrent threads are executing the above code, the balance will always be updated atomically, and you can compose the above program with any other transactional program, and the result will also be free of concurrency hazards. 
The one caveat is that you should not cause non-transactional side-effects from within a transaction.&lt;/p&gt;&lt;p&gt;Please refer to the &lt;a href="http://higherlogics.net/sasa/docs-head/"&gt;API docs under Sasa.TM&lt;/a&gt; for further details.&lt;/p&gt;&lt;h1&gt;Internals&lt;/h1&gt;&lt;p&gt;The internals of this STM design are pretty straightforward. Structurally, it looks something like this:&lt;/p&gt;&lt;pre style="line-height:0.8; background-color: white;"&gt;+---------------+   +---------------+    +----------------+&lt;br /&gt;|Tx0            |   |Transacted0    |    |Transacted1     |&lt;br /&gt;|  participants----&gt;|  value = 2    | +-&gt;|  value = true  |&lt;br /&gt;|  waitingFor   |   |  undo  = 1    | |  |  undo  = false |&lt;br /&gt;+-----|---------+   |  next-----------+  |  next  = null  |&lt;br /&gt;      |       ^     |  owner        |    |  owner         |&lt;br /&gt;      |       |     +---|-----------+    +---|------------+&lt;br /&gt;      |       |         |                    |&lt;br /&gt;      |       +------------------------------+&lt;br /&gt;      |&lt;br /&gt;      +--------------------+&lt;br /&gt;                           |&lt;br /&gt;                           v&lt;br /&gt;+------------------+     +------------------+&lt;br /&gt;|Tx1               |     |Transacted2       |&lt;br /&gt;|  participants---------&gt;|  value = null    |&lt;br /&gt;|  waitingFor=null |     |  undo  = "Foo"   |&lt;br /&gt;+------------------+&lt;-------owner           |&lt;br /&gt;                         +------------------+&lt;/pre&gt;&lt;p&gt;There's quite a bit going on here, so here are some quick highlights:&lt;ul&gt;&lt;li&gt;Each Transacted&amp;lt;T&amp;gt; is a member of a linked list rooted in the "participants" field of the MemoryTransaction. 
The list consists of the Transacted&amp;lt;T&amp;gt; instances that have been read or written during the current transaction.&lt;/li&gt;&lt;li&gt;Each Transacted&amp;lt;T&amp;gt; points to the current transaction that owns its lock.&lt;/li&gt;&lt;li&gt;Each MemoryTransaction that attempts to acquire a lock on a Transacted&amp;lt;T&amp;gt; stores the Transacted&amp;lt;T&amp;gt; in a local field called "waitingFor".&lt;/li&gt;&lt;li&gt;Transacted&amp;lt;T&amp;gt; stores the original value before any changes are made, so we can roll back if the transaction aborts.&lt;/li&gt;&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;From the above graph, we can see that there are two running transactions, Tx0 and Tx1, and that Tx0 has read or written Transacted0 and Transacted1, and it has tried to read/write Transacted2. However, Tx1 currently owns the lock on Transacted2, so Tx0 is effectively blocked waiting for Tx1 to complete.&lt;/p&gt;&lt;p&gt;This dependency graph is acyclic so there is no deadlock. If Tx1 were to then try to acquire the lock on Transacted0 or Transacted1, we would create a cycle in the waits-for graph, and we would have to abort one of the transactions.&lt;/p&gt;&lt;p&gt;On commit, the transaction's participant list is walked: each element is unlinked, its undo field is cleared, and its lock is released. The next transaction blocked on any of the participants then acquires the lock it's been waiting for, sets the owner field, and proceeds.&lt;/p&gt;&lt;p&gt;Rollback is much the same, except the Transacted&amp;lt;T&amp;gt;'s value field is first overwritten with the value from the undo field.&lt;/p&gt;&lt;h1&gt;Future Work&lt;/h1&gt;&lt;h2&gt;Fair STM&lt;/h2&gt;&lt;p&gt;To those that have read my previous posts, note that the structure of the MemoryTransaction is exactly the structure of &lt;a href="http://higherlogics.blogspot.com/2011/10/most-general-concurrent-api.html"&gt;MetaThread from a previous post&lt;/a&gt;. 
By simply adding a WaitHandle to MemoryTransaction with a FIFO locking protocol, we have a fair STM.&lt;/p&gt;&lt;h3&gt;Lock Stealing&lt;/h3&gt;&lt;p&gt;STM research so far has shown that most transactions are short enough that they can execute in a single timeslice, and throughput suffers if a thread is descheduled while it's holding locks. This would only be exacerbated in an encounter-time locking design like I've described here, since locks are held for longer.&lt;/p&gt;&lt;p&gt;Instead of blocking on a variable that is already owned, we can instead steal the lock under certain conditions. For instance, if Tx0 and Tx1 are merely reading from a variable, they can repeatedly steal the lock from each other without concern.&lt;/p&gt;&lt;p&gt;A transaction that writes a variable that has only been locked for reading, can steal that lock too, but if the original owner tries to read the variable again, it must abort.&lt;/p&gt;&lt;p&gt;If Tx0 and Tx1 both try to write the same variable, blocking is unavoidable.&lt;/p&gt;&lt;p&gt;Obviously, all of these performance improvements impact the simplicity of the original design, so I'm leaving them for future work if the need arises.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8337334537840042974?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8337334537840042974/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8337334537840042974' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8337334537840042974'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8337334537840042974'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/09/software-transactional-memory-in-pure-c.html' title='Software Transactional Memory in Pure C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-778233070128924309</id><published>2011-11-12T16:57:00.001-05:00</published><updated>2011-11-13T12:21:09.152-05:00</updated><title type='text'>Type Unification Forbidden - More C#/CLR Irritations</title><content type='html'>&lt;p&gt;I've written about quite a few irritations of C#/CLR, including &lt;a href="http://higherlogics.blogspot.com/2010/05/asymmetries-in-cil_17.html"&gt;asymmetries in CIL&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/2010/05/cil-verification-and-safety.html"&gt;oddly unverifiable CIL instructions&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/2010/12/sasa-v093-released.html"&gt;certain type constraints are forbidden for no reason&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/2011/08/open-instance-delegate-for-generic.html"&gt;delegate creation bugs&lt;/a&gt;, the lack of &lt;a href="http://higherlogics.blogspot.com/2009/10/abstracting-over-type-constructors.html"&gt;higher-kinded types&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/2011/09/iobservable-and-delegate-equality.html"&gt;equality asymmetries between events and IObservable&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/2007/05/gexl-lives.html"&gt;generics/type parameter problems&lt;/a&gt;, &lt;a 
href="http://stackoverflow.com/questions/7692820/c-sharp-fixed-inline-arrays"&gt;lack of usable control over object layouts&lt;/a&gt;, and just &lt;a href="http://higherlogics.blogspot.com/2007/05/whats-wrong-with-net.html"&gt;overall limitations of the CLR VM&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;I've just run smack into yet another annoying problem: &lt;a href="http://stackoverflow.com/questions/7664790/why-does-the-c-sharp-compiler-complain-that-types-may-unify-when-they-derive-f"&gt;type parameter unification is forbidden&lt;/a&gt;. &lt;a href="http://connect.microsoft.com/VisualStudio/feedback/details/91817/allow-to-implement-same-generic-interface-for-more-that-one-type-parameter-in-generic-class-under-some-conditions"&gt;The Microsoft Connect bug&lt;/a&gt; filed in &lt;em&gt;2004&lt;/em&gt; was closed as &lt;em&gt;By Design&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;Here's a simple code fragment demonstrating the problem:&lt;pre class="brush:csharp"&gt;class ObserveTwo&amp;lt;T0, T1&amp;gt; : IObservable&amp;lt;T0&amp;gt;, IObservable&amp;lt;T1&amp;gt;&lt;br /&gt;{&lt;br /&gt;}&lt;/pre&gt;This will fail with the error:&lt;pre&gt;'ObserveTwo&amp;lt;T0,T1&amp;gt;' cannot implement both 'System.IObservable&amp;lt;T0&amp;gt;' and 'System.IObservable&amp;lt;T1&amp;gt;' because they may unify for some type parameter substitutions&lt;/pre&gt;This is frankly nonsense. What's the problem if T0 and T1 unify? If T0=T1, ObserveTwo just implements IObservable&amp;lt;T0&amp;gt;, and the methods are all unified as well [2]. If T0!=T1, then the usual semantics apply.&lt;/p&gt;&lt;p&gt;The MS Connect bug implies that there's a type safety issue, but there's no problem that I can see [1].&lt;/p&gt;&lt;p&gt;This is incredibly frustrating, because interfaces are the only way to design certain extension methods (mixin-style). 
For instance, suppose we have a set of simple interfaces:&lt;pre class="brush:csharp"&gt;// marks a class as containing a parseable parameter of type T&lt;br /&gt;public interface IParseable&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;   HttpRequest Request { get; }&lt;br /&gt;}&lt;br /&gt;// continuation parameters are parseable&lt;br /&gt;public interface IContinuation&amp;lt;T&amp;gt; : IParseable&amp;lt;T&amp;gt; { }&lt;br /&gt;public interface IContinuation&amp;lt;T0, T1&amp;gt; : IParseable&amp;lt;T0&amp;gt;, IParseable&amp;lt;T1&amp;gt; { }&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;This is a perfectly sensible set of type definitions, and there is no ambiguity or type safety problem here, even if a class were to implement IContinuation&amp;lt;int, int&amp;gt;.&lt;p&gt;If anyone has any suggestions for workarounds, I'm all ears. Options that don't work:&lt;ol&gt;&lt;li&gt;Can't turn IParseable into a struct/class with an implicit conversion, because implicit conversions don't work on interfaces.&lt;/li&gt;&lt;li&gt;Can't define MxN overloads of the extension methods defined on IParseable, where N=number of IContinuation type definitions, and M=number of type parameters (although this actually has a shot of working, the code duplication is just ludicrous).&lt;/li&gt;&lt;/ol&gt;&lt;/p&gt;&lt;h1&gt;A Partial Solution&lt;/h1&gt;&lt;p&gt;The best solution I have involves N overloads of the extension methods by implementing interfaces that designate type parameter positions:&lt;pre class="brush:csharp"&gt;// type param at index 0&lt;br /&gt;public interface IParam0&amp;lt;T&amp;gt; { }&lt;br /&gt;// type param at index 1&lt;br /&gt;public interface IParam1&amp;lt;T&amp;gt; { }&lt;br /&gt;// type param at index 2&lt;br /&gt;public interface IParam2&amp;lt;T&amp;gt; { }&lt;br /&gt;...&lt;br /&gt;// continuations  have indexed type parameters&lt;br /&gt;public interface IContinuation&amp;lt;T&amp;gt; : IParam0&amp;lt;T&amp;gt; { }&lt;br /&gt;public interface 
IContinuation&amp;lt;T0, T1&amp;gt; : IParam0&amp;lt;T0&amp;gt;, IParam1&amp;lt;T1&amp;gt; { }&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;Now instead of defining extension methods on IParseable, I define an overload per IParamX interface (N overloads, 1 per type parameter).&lt;/p&gt;&lt;p&gt;This necessarily causes an ambiguity when two type parameters unify, since the compiler won't know whether you wanted to call the extension on IParam0&amp;lt;int&amp;gt; or IParam2&amp;lt;int&amp;gt;, but this ambiguity happens at the &lt;em&gt;call site/at type instantiation&lt;/em&gt;, instead of &lt;em&gt;type definition&lt;/em&gt;. This is a much better default [1], because you can actually do something about it by disambiguating manually. I've eliminated ambiguity entirely in my library by appending the type parameter index to the extension method name.&lt;/p&gt;&lt;p&gt;Of course, none of this boilerplate would have been necessary if type unification were supported.&lt;/p&gt;&lt;p&gt;[1] If the CLR supported type &lt;em&gt;inequalities&lt;/em&gt;, instead of just type equalities, then we could just forbid this at the point ObserveTwo is created.&lt;/p&gt;&lt;p&gt;[2] EDIT 2011-11-13T11:45 AM: there's been some confusion about this statement. Doug McClean rightly points out that there's no guarantee that the two implementations are confluent, so unifying is not an "automatic" thing, so please don't take this to mean that I'm saying the compiler should be able to automatically "merge" methods somehow, or magically know which implementation to dispatch to based on context. This is undecidable in general.&lt;/p&gt;&lt;p&gt;Also, even if the interfaces require no implementation, the error still occurs, despite reduction being Church-Rosser. Finally, plenty of nice properties are violated on the CLR, so I'm not sure why also sacrificing confluence is such a big deal here. 
A sensible dynamic semantics, like always dispatching to the implementation for T0 in case of type parameter unification, restores most of the desirable properties without sacrificing expressiveness, and without going whole-hog and ensuring confluence via type inequalities [1].&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-778233070128924309?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/778233070128924309/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=778233070128924309' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/778233070128924309'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/778233070128924309'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/11/type-unification-forbidden-more-cclr.html' title='Type Unification Forbidden - More C#/CLR Irritations'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8465584751386909415</id><published>2011-11-01T16:34:00.000-04:00</published><updated>2011-11-12T11:24:40.758-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pattern matching'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category 
scheme='http://www.blogger.com/atom/ns#' term='EDSL'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='object oriented programming'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Ad-hoc Extensions in .NET</title><content type='html'>&lt;p&gt;A &lt;a href="http://lambda-the-ultimate.org/node/4389#comment-68003"&gt;recent discussion on LtU&lt;/a&gt; brought up a common limitation of modern languages and runtimes. Consider some set of abstractions A, B, ..., Z that we wish supported some custom operation Foo(). Languages like C# give only two straightforward possibilities: inheritance and runtime type tests and casts.&lt;/p&gt;&lt;p&gt;Inheritance is straightforward: we simply inherit from each A, B, ..., Z to make A_Foo, B_Foo, ..., Z_Foo, with a custom Foo() method.&lt;/p&gt;&lt;p&gt;Unfortunately, inheritance is sometimes forbidden in C#, and furthermore, it sometimes prevents us from integrating with existing code. For instance, say we want to integrate with callback-style code, where the existing code hands our program an object of type A. We can't call Foo() on this A, because it's not an A_Foo, which is what we really wanted.&lt;/p&gt;&lt;p&gt;This leaves us with the undesirable option of using a large set of runtime type tests. Now type tests and casts are &lt;a href="http://higherlogics.blogspot.com/2008/10/vtable-dispatching-vs-runtime-tests-and.html"&gt;actually pretty fast&lt;/a&gt;, faster than virtual dispatch in fact, but the danger is in missing a case and thus triggering a runtime exception, a danger inheritance does not have.&lt;/p&gt;&lt;p&gt;Furthermore, we'd have to repeat the large set of runtime type tests for each operation we want to add to A, B, ... 
Z, which means every time we want to add an operation we have to do a large copy-paste, and every time we add a new class we have to update every place we test types, and the compiler doesn't warn us when we're missing a case.&lt;/p&gt;&lt;h1&gt;A Solution&lt;/h1&gt;&lt;p&gt;Fortunately, there's a straightforward way to address at least part of this problem. We can collect all the type tests we have to do in one place, and we can ensure that we have properly handled all cases that we are aware of. To do this, we combine the standard visitor pattern with the CLR's open instance delegates and its unique design for generics.&lt;/p&gt;&lt;p&gt;Here are the types:&lt;pre class="brush:csharp"&gt;// the interface encapsulating the operation to implement, ie. Foo()&lt;br /&gt;public interface IOperation&lt;br /&gt;{&lt;br /&gt;  void A(A obj);&lt;br /&gt;  void B(B obj);&lt;br /&gt;  ...&lt;br /&gt;  void Z(Z obj);&lt;br /&gt;  void Unknown(object obj);&lt;br /&gt;}&lt;br /&gt;// the static dispatcher where runtime tests occur&lt;br /&gt;public static class Dispatcher&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  static Action&amp;lt;IOperation, T&amp;gt; cached;&lt;br /&gt;  public static Action&amp;lt;IOperation, T&amp;gt; Dispatch&lt;br /&gt;  {&lt;br /&gt;    get { return cached ?? (cached = Load()); }&lt;br /&gt;  }&lt;br /&gt;  static Action&amp;lt;IOperation, T&amp;gt; Load()&lt;br /&gt;  {&lt;br /&gt;    var type = typeof(T);&lt;br /&gt;    var mName = type == typeof(A) ? "A":&lt;br /&gt;                type == typeof(B) ? "B":&lt;br /&gt;                        ...&lt;br /&gt;                                  : "Unknown";&lt;br /&gt;    // look up the IOperation handler selected by name above&lt;br /&gt;    var method = typeof(IOperation).GetMethod(mName);&lt;br /&gt;    return Delegate.CreateDelegate(typeof(Action&amp;lt;IOperation, T&amp;gt;), null, method)&lt;br /&gt;             as Action&amp;lt;IOperation, T&amp;gt;;&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;If you want to extend A, B, ... 
Z, with any operation, you simply create a class that implements IOperation and the compiler will ensure you handle all known cases. If you want to add a class to the set of handled cases, you simply extend IOperation and add a case to the runtime type tests (or if you use a naming convention, you can just use the class name as the method name).&lt;/p&gt;&lt;p&gt;The dynamic type test runs once per type T, and then a delegate that directly calls into the IOperation is cached, so the type tests are not run again.&lt;/p&gt;&lt;p&gt;If you try to dispatch on an unknown type, IOperation.Unknown is invoked. I could have made this a generic method, but &lt;a href="http://higherlogics.blogspot.com/2011/08/open-instance-delegate-for-generic.html"&gt;the CLR currently has a bug creating open instance delegates to generic interface methods&lt;/a&gt;, so that would require some code generation to do properly.&lt;/p&gt;&lt;p&gt;There is a caveat though: if any of the cases are subtypes of each other, and you dispatch on the static base type, it will dispatch to the base type's handler and not the subtype's handler. For instance, if A:B, and you call Dispatcher&amp;lt;B&amp;gt;.Dispatch(operation, new A()), IOperation.B is called, not IOperation.A. This can be handled in various ways, but it's beyond the scope of this article. 
I may post a follow-up discussing the various options.&lt;/p&gt;&lt;h1&gt;Extensions&lt;/h1&gt;&lt;h2&gt;Type Constraints on T&lt;/h2&gt;&lt;p&gt;If all types to handle are subtypes of type X, then it's simple to constrain the type parameter T on Dispatcher&amp;lt;T&amp;gt;:&lt;pre class="brush: csharp"&gt;public static class Dispatcher&amp;lt;T&amp;gt;&lt;br /&gt;  where T : X&lt;br /&gt;{&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;/p&gt;&lt;h2&gt;Operation Return Type&lt;/h2&gt;&lt;p&gt;You could extend IOperation with a return type, but this requires propagating the return type parameter into Dispatcher and the cached delegate type, and thus requires more runtime state for more static fields. This is because the CLR &lt;a href="http://higherlogics.blogspot.com/2007/05/gexl-lives.html"&gt;doesn't support GADTs&lt;/a&gt;. Following &lt;a href="http://lambda-the-ultimate.org/node/2232"&gt;this paper&lt;/a&gt;, I prefer to keep the return type encapsulated in the IOperation implementation and expose it as a public property for the caller to extract the value:&lt;pre class="brush:csharp"&gt;public sealed class Foo : IOperation&lt;br /&gt;{&lt;br /&gt;  // the return value&lt;br /&gt;  public Bar ReturnValue { get; set; }&lt;br /&gt;  public void A(A obj) { ... }&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;This requires fewer type parameters, and keeps the implementation simple.&lt;/p&gt;&lt;h2&gt;Per-Class Overrides&lt;/h2&gt;&lt;p&gt;Suppose some of the classes you are handling are your own, or are already aware of your code such that they have already implemented your custom operations. 
In that case, you can specify a companion interface to indicate this, and modify the dispatch code to invoke the interface instead:&lt;pre class="brush:csharp"&gt;public interface ICompanion&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  // your custom operations that are implemented via subclassing&lt;br /&gt;  void Foo(IOperation operation, T self);&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;The dispatch code is then modified like so:&lt;pre class="brush:csharp"&gt;static Action&amp;lt;IOperation, T&amp;gt; Load()&lt;br /&gt;{&lt;br /&gt;  var type = typeof(T);&lt;br /&gt;  // if companion interface is implemented, then dispatch directly into it&lt;br /&gt;  if (typeof(ICompanion&amp;lt;T&amp;gt;).IsAssignableFrom(type))&lt;br /&gt;  {&lt;br /&gt;    var method = type.GetMethod("Foo");&lt;br /&gt;    return Delegate.CreateDelegate(typeof(Action&amp;lt;IOperation, T&amp;gt;), null, method)&lt;br /&gt;             as Action&amp;lt;IOperation, T&amp;gt;;&lt;br /&gt;  }&lt;br /&gt;  // the usual dispatch code&lt;br /&gt;  ...&lt;/pre&gt;&lt;h1&gt;Useful Applications&lt;/h1&gt;&lt;p&gt;This pattern simulates the usefulness of type classes in Haskell, because these are essentially ad-hoc extensions added after the fact. I use this exact pattern in Sasa.Dynamics to implement safe type case and reflection patterns over the CLR primitive types, ie. 
ints, strings, delegates, etc., with the per-class override extension described above via the IReflected interface.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8465584751386909415?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8465584751386909415/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8465584751386909415' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8465584751386909415'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8465584751386909415'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/11/ad-hoc-extensions-in-net.html' title='Ad-hoc Extensions in .NET'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-5616344071297453153</id><published>2011-10-24T17:06:00.000-04:00</published><updated>2011-10-24T17:08:23.613-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='low-level programming'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='STM'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>The Most General 
Concurrent API</title><content type='html'>Concurrent programming is hard, and many abstractions have been developed over the years to properly manage concurrent resources. The .NET base class libraries have mutexes, spinlocks, semaphores, compare-and-set (CAS) instructions, and more.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;The Problem&lt;/h1&gt;Unfortunately, many of these standard abstractions are rather opaque, so predicting their behaviour is difficult, and enforcing a particular thread schedule is nearly impossible. For example, consider the issue of "fairness". Many abstractions to deal with threads are not fair, which is to say that they are not guaranteed to release threads in FIFO order, so a thread could &lt;a href="http://www.devx.com/tips/Tip/13398"&gt;starve&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Furthermore, these opaque abstractions do not allow the application to easily specify domain-specific scheduling behaviour. Suppose threads T0, T1, and T2 are blocked on a resource X, in that order. X becomes available, and according to FIFO order, T0 should acquire control next. However, a domain-specific requirement may state that T2 should be processed first, but there is no way to check the wait queue for the presence of T2 when using any of .NET's thread abstractions. We'd have to build some auxiliary data structures to log the fact that T2 is waiting for X, and force T0 and T1 to relinquish their control voluntarily, but these data structures already exist as the locking queue, so we're duplicating work. Furthermore, both T0 and T1 must run before T2 to relinquish control, so we're wasting valuable CPU time. It's rather wasteful all around.&lt;br /&gt;&lt;br /&gt;What I want to cover here is a pattern consisting of a few simple primitives which forms a general concurrent API. Using this API, we can easily implement all of the above threading abstractions, while still allowing arbitrary scheduling policies. 
I devised this approach while developing a software transactional memory (STM) library for C#, since STM requires deadlock detection and is rather sensitive to scheduling policies. None of this is achievable using the standard concurrent abstractions on the CLR, without &lt;a href="http://msdn.microsoft.com/en-us/magazine/cc163618.aspx"&gt;modifying the VM&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The paper entitled &lt;a href="http://arxiv.org/abs/1109.2638"&gt;Light-weight Locks&lt;/a&gt;&amp;nbsp;really crystallized the path I was on, and gelled the final components I was missing into an elegant, general solution. The solution I present here is merely one of the abstractions found in that paper.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;The Solution&lt;/h1&gt;The fundamental primitives needed for a more flexible concurrent API are few: &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.interlocked.compareexchange.aspx"&gt;Interlocked.CompareExchange&lt;/a&gt;, &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.autoresetevent.aspx"&gt;AutoResetEvent&lt;/a&gt;, and &lt;a href="http://msdn.microsoft.com/en-us/library/system.threadstaticattribute.aspx"&gt;ThreadStaticAttribute&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The approach is simple, and consists of only three fundamental concepts:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;A lock-free list is associated with each resource being accessed concurrently. If structured as I describe below, managing this list requires &lt;em&gt;no memory allocation&lt;/em&gt;.&lt;/li&gt;&lt;li&gt;An AutoResetEvent is associated with each Thread via thread-static data. 
Each time the thread must block, it calls &lt;a href="http://msdn.microsoft.com/en-us/library/58195swd.aspx"&gt;WaitOne&lt;/a&gt; on its own AutoResetEvent after pushing itself onto the resource's lock-free list.&lt;/li&gt;&lt;li&gt;A thread releasing its control of a resource walks the list of waiting threads, and wakes up the ones that should assume control next by calling &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.eventwaithandle.set.aspx"&gt;Set&lt;/a&gt; on that Thread's AutoResetEvent.&lt;/li&gt;&lt;/ol&gt;AutoResetEvent is fundamentally thread-safe, and serves as a combined signal and mutex, and the only other point of contention is the lock-free list. The list requires only one CAS to push a waiter on the front, and occasionally a CAS to pop the front element [1].&lt;br /&gt;&lt;br /&gt;Using only the above, we can construct a mutex abstraction, like Monitor, that allows exclusive access to a resource, but permits arbitrary scheduling policy, such as FIFO, LIFO, or something custom (like priorities that don't suffer from inversion). Even more, this requires &lt;em&gt;no allocation&lt;/em&gt; at runtime. 
Here's all we need [2]:&lt;br /&gt;&lt;pre class="brush:csharp"&gt;// the thread-static data&lt;br /&gt;class MetaThread&lt;br /&gt;{&lt;br /&gt;  // the kernel signal+mutex to wait on&lt;br /&gt;  public AutoResetEvent Signal { get; private set; }&lt;br /&gt;  // the next MetaThread in whatever list this&lt;br /&gt;  // MetaThread happens to be on at the time;&lt;br /&gt;  // settable because Locked&amp;lt;T&amp;gt; links waiters together&lt;br /&gt;  public MetaThread Next { get; set; }&lt;br /&gt;  // [ThreadStatic] is only valid on fields, not properties&lt;br /&gt;  [ThreadStatic]&lt;br /&gt;  static MetaThread current;&lt;br /&gt;  public static MetaThread Current { get { return current; } }&lt;br /&gt;}&lt;br /&gt;// some resource T that requires mutually&lt;br /&gt;// exclusive access&lt;br /&gt;class Locked&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  T value;&lt;br /&gt;  MetaThread owner;&lt;br /&gt;  MetaThread blocked;&lt;br /&gt;}&lt;/pre&gt;What makes this work is the property that a thread can only be on one blocked list at any given time. This means we need only one piece of thread-static data to track which list it's on, and one AutoResetEvent to block and unblock the thread. 
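&lt;br /&gt;&lt;br /&gt;As footnote [2] notes, MetaThread.Current must be lazily initialized, since a [ThreadStatic] field starts out null on every thread. A minimal sketch of one way to do that follows; the private backing field named current and the initially unsignalled event are illustrative assumptions, not taken from the STM library:&lt;br /&gt;&lt;pre class="brush:csharp"&gt;using System;&lt;br /&gt;using System.Threading;&lt;br /&gt;&lt;br /&gt;// illustrative sketch only&lt;br /&gt;sealed class MetaThread&lt;br /&gt;{&lt;br /&gt;  public AutoResetEvent Signal { get; private set; }&lt;br /&gt;  public MetaThread Next { get; set; }&lt;br /&gt;&lt;br /&gt;  // assumed backing field: one slot per thread, null until first use&lt;br /&gt;  [ThreadStatic]&lt;br /&gt;  static MetaThread current;&lt;br /&gt;&lt;br /&gt;  public static MetaThread Current&lt;br /&gt;  {&lt;br /&gt;    get&lt;br /&gt;    {&lt;br /&gt;      // no synchronization needed: only the owning thread sees this slot&lt;br /&gt;      if (current == null)&lt;br /&gt;        current = new MetaThread { Signal = new AutoResetEvent(false) };&lt;br /&gt;      return current;&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;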
Now suppose Locked has two public operations: Enter and Exit, which mimic the behaviour of Monitor.Enter and Monitor.Exit [3]:&lt;br /&gt;&lt;pre class="brush:csharp"&gt;class Locked&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  ...&lt;br /&gt;  public void Enter()&lt;br /&gt;  {&lt;br /&gt;    var thread = MetaThread.Current;&lt;br /&gt;    // push current thread onto wait list&lt;br /&gt;    do thread.Next = blocked;&lt;br /&gt;    while (thread.Next != Interlocked.CompareExchange(ref blocked, thread, thread.Next));&lt;br /&gt;    // if owner is null, then try to acquire lock; if that fails, block&lt;br /&gt;    if (owner != null || null != Interlocked.CompareExchange(ref owner, thread, null))&lt;br /&gt;    {&lt;br /&gt;      // wait on this thread's own signal until Exit hands us the lock&lt;br /&gt;      thread.Signal.WaitOne();&lt;br /&gt;    }&lt;br /&gt;    Unblock(thread); // remove from blocked list&lt;br /&gt;  }&lt;br /&gt;  public void Exit()&lt;br /&gt;  {&lt;br /&gt;    // retry if no candidate acquired the lock, but the&lt;br /&gt;    // blocked list has since become not empty&lt;br /&gt;    MetaThread next;&lt;br /&gt;    do {&lt;br /&gt;      next = SelectNext();&lt;br /&gt;      Interlocked.Exchange(ref owner, next);&lt;br /&gt;    } while (next == null &amp;amp;&amp;amp; blocked != null);&lt;br /&gt;    if (next != null) next.Signal.Set(); // resume the thread&lt;br /&gt;  }&lt;br /&gt;  protected virtual MetaThread SelectNext()&lt;br /&gt;  {&lt;br /&gt;    ...&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;The SelectNext method can provide a variety of domain-specific selection behaviour, such as LIFO, FIFO, etc. to dequeue the next lock owner. All that's required is that the thread releasing the resource select the candidate from the list and assign the lock to it, and then signal it to continue via AutoResetEvent. 
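&lt;br /&gt;&lt;br /&gt;As one illustration of a selection policy, here is a minimal FIFO sketch. It assumes waiters are only ever pushed on the front of the list, as in Enter above, so the oldest waiter is the tail. The Waiter type and the SelectOldest name are hypothetical stand-ins, and the walk ignores concurrent removals, which a real SelectNext would have to handle:&lt;br /&gt;&lt;pre class="brush:csharp"&gt;using System.Threading;&lt;br /&gt;&lt;br /&gt;// hypothetical stand-in for MetaThread&lt;br /&gt;sealed class Waiter&lt;br /&gt;{&lt;br /&gt;  public readonly AutoResetEvent Signal = new AutoResetEvent(false);&lt;br /&gt;  public Waiter Next;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static class FifoPolicy&lt;br /&gt;{&lt;br /&gt;  // returns the oldest waiter (the tail), or null if the list is empty&lt;br /&gt;  public static Waiter SelectOldest(Waiter head)&lt;br /&gt;  {&lt;br /&gt;    var oldest = head;&lt;br /&gt;    while (oldest != null &amp;amp;&amp;amp; oldest.Next != null)&lt;br /&gt;      oldest = oldest.Next;&lt;br /&gt;    return oldest;&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;A LIFO policy would simply return the head instead, trading fairness for a shorter walk.&lt;br /&gt;&lt;br /&gt;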
Unblock is not shown for brevity, but it's not too complicated.&lt;br /&gt;&lt;br /&gt;The paper I linked to above uses this pattern to implement all the common concurrency abstractions, using very compact structures (2 bytes per mutex, 4 bytes per rw-lock, 4 bytes per condition variable).&lt;br /&gt;&lt;br /&gt;I use it in my STM library to dynamically detect deadlocks and abort transactions, and I will soon extend this to implement reader-writer lock semantics similar to a pessimistic version of &lt;a href="http://dl.acm.org/citation.cfm?id=1810531"&gt;TLRW&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Regardless, associating a kernel AutoResetEvent, or a pthread signal+mutex on Linux, with each thread permits far more flexible and compact concurrency primitives. I recommend this "MetaThread" API be exposed in any language's standard threading library.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;The Future&lt;/h1&gt;This approach can also inform the design of concurrency abstractions in kernels. Consider a reader-writer lock where a writer currently has exclusive access to the resource, and a list of N readers is waiting for the release of the lock. The writer must then invoke AutoResetEvent.Set N times, once for each waiting reader. That's N user-kernel transitions, which can be quite expensive for large N [4].&lt;br /&gt;&lt;br /&gt;The ideal solution would be to group a number of individual AutoResetEvents so we only need to make one user-kernel transition to signal the whole group, sort of like a multicast for &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.eventwaithandle.aspx"&gt;EventWaitHandles&lt;/a&gt;. This would provide efficient single-thread and multi-thread suspend/resume semantics.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Footnotes&lt;/h3&gt;[1] A CAS may be needed to remove elements further in the list if more than one thread can operate on a resource at a time.&lt;br /&gt;[2] I'm skipping a few steps to simplify the presentation. 
MetaThread.Current must actually be lazily-initialized. If I were starting a new programming language, this structure would be part of the standard Thread abstraction. Furthermore, when implementing abstractions which allow multiple threads through, you will probably need more synchronization. For instance, a reader-writer lock requires an atomically incremented/decremented read counter. See the paper linked above for more details.&lt;br /&gt;[3] I'm again simplifying here for presentation, so I apologize for any errors. This code is adapted from my STM library, so I haven't tested this version thoroughly, but it's at least derived from correct code. There are many optimizations to avoid redundant CAS operations, but this is the simplest presentation.&lt;br /&gt;[4] N here is the number of threads, which in general are probably less than a hundred, even for heavily threaded workloads. Still, user-kernel transitions can cost upwards of 1,000 cycles, so wasting 100,000 cycles is nothing to scoff at.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-5616344071297453153?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/5616344071297453153/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=5616344071297453153' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5616344071297453153'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5616344071297453153'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/10/most-general-concurrent-api.html' title='The Most General Concurrent API'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-1468936455101350696</id><published>2011-09-28T00:07:00.000-04:00</published><updated>2011-09-28T10:36:52.223-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>ThreadScoped&lt;T&gt; The Next Generation</title><content type='html'>&lt;p&gt;Jeroen Frijters &lt;a href="http://weblog.ikvm.net/CommentView.aspx?guid=1c7e28aa-c938-408b-856c-dfae456303e4"&gt;helpfully pointed out&lt;/a&gt; that the CLR imposes a hard limit on the nesting depth of generics, which is 99 for .NET 4 based on my tests. &lt;a href="http://higherlogics.blogspot.com/2011/09/faster-thread-local-data-with.html"&gt;My previous implementation of ThreadScoped&amp;lt;T&amp;gt;&lt;/a&gt; was thus limited to 99 instances. Not very useful!&lt;/p&gt;&lt;p&gt;The solution is actually quite simple, which I briefly outlined on Jeroen's blog: add more type parameters and use a simple base-99 counting scheme to generate new instances. Each additional type parameter thus increases the number of permutations 99-fold. 
One type index parameter yields 99&lt;sup&gt;1&lt;/sup&gt; instances, two type index parameters yield 99&lt;sup&gt;2&lt;/sup&gt; instances, three type index parameters yield 99&lt;sup&gt;3&lt;/sup&gt; instances, and so on.&lt;/p&gt;&lt;p&gt;No one in the foreseeable future will require more than 99&lt;sup&gt;3&lt;/sup&gt; = 970,299, which is almost a million thread-local variables, so &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/ab260ccd19d1/Sasa.TM/ThreadScoped.cs"&gt;I've added two more type index parameters to make Ref&amp;lt;T0, T1, T2&amp;gt;&lt;/a&gt;. The instance allocation function is now:&lt;pre class="brush:csharp"&gt;internal override ThreadScoped&amp;lt;T&amp;gt; Allocate()&lt;br /&gt;{&lt;br /&gt;    // If 'next' is null, we are at the end of the list of free refs,&lt;br /&gt;    // so allocate a new one and enqueue it, then return 'this'&lt;br /&gt;    var x = next;&lt;br /&gt;    if (x != null) return this;&lt;br /&gt;    // The CLR has some fundamental limits on generic nesting depths, so we circumvent&lt;br /&gt;    // this by using additional generic parameters, and nesting them via counting.&lt;br /&gt;    x = Interlocked.CompareExchange(ref next, CreateNext(), null);&lt;br /&gt;    // atomic swap failure doesn't matter, since the caller of Acquire()&lt;br /&gt;    // accesses whatever instance is at this.next&lt;br /&gt;    return this;&lt;br /&gt;}&lt;/pre&gt;and CreateNext is:&lt;pre class="brush:csharp"&gt;ThreadScoped&amp;lt;T&amp;gt; CreateNext()&lt;br /&gt;{&lt;br /&gt;    var x = allocCount + 1;&lt;br /&gt;    if (x % (99 * 99) == 0) return new Ref&amp;lt;T, T, Ref&amp;lt;T2&amp;gt;&amp;gt; { allocCount = x };&lt;br /&gt;    if (x % 99 == 0)        return new Ref&amp;lt;T, Ref&amp;lt;T1&amp;gt;, T2&amp;gt; { allocCount = x };&lt;br /&gt;    return new Ref&amp;lt;Ref&amp;lt;T0&amp;gt;, T1, T2&amp;gt; { allocCount = x };&lt;br /&gt;}&lt;/pre&gt;This is simple &lt;a href="http://www.google.ca/search?q=number+base+aritmethic"&gt;base-99 
arithmetic&lt;/a&gt;. Anyone familiar with arithmetic should recognize the pattern here: when we get to certain multiples of 99, we reset the previous digits and carry the 1 to the next slot. Normally, humans deal in base-10, so a carry happens at 10&lt;sup&gt;1&lt;/sup&gt;, 10&lt;sup&gt;2&lt;/sup&gt;, 10&lt;sup&gt;3&lt;/sup&gt;, and so on.&lt;/p&gt;&lt;p&gt;In this case, we are dealing with base-99, so carries happen at 99&lt;sup&gt;1&lt;/sup&gt;, 99&lt;sup&gt;2&lt;/sup&gt; and 99&lt;sup&gt;3&lt;/sup&gt;, and the "carry" operation consists of nesting a generic type parameter. Simple!&lt;/p&gt;&lt;p&gt;This scheme is also trivially extensible to as many additional parameters as is needed, so if someone somewhere really does need more than a million fast thread-local variables, I have you covered.&lt;/p&gt;&lt;p&gt;These changes don't seem to have impacted the performance of ThreadScoped&amp;lt;T&amp;gt;, so &lt;a href="http://higherlogics.blogspot.com/2011/09/faster-thread-local-data-with.html"&gt;I'm still over 250% faster&lt;/a&gt; than the &lt;a href="http://msdn.microsoft.com/en-us/library/dd642243.aspx"&gt;ThreadLocal&amp;lt;T&amp;gt; provided in .NET 4's base class libraries.&lt;/a&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-1468936455101350696?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/1468936455101350696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=1468936455101350696' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1468936455101350696'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1468936455101350696'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/09/threadscoped-next-generation.html' title='ThreadScoped&amp;lt;T&amp;gt; The Next Generation'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-6568941780573448507</id><published>2011-09-27T00:18:00.000-04:00</published><updated>2011-09-28T10:13:57.439-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='benchmarks'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Faster Thread-Local Data with ThreadScoped&lt;T&gt;</title><content type='html'>&lt;p&gt;Awhile ago, &lt;a href="http://weblog.ikvm.net/CommentView.aspx?guid=1c7e28aa-c938-408b-856c-dfae456303e4"&gt;Jeroen Frijters blogged&lt;/a&gt; about .NET 4's new ThreadLocal&amp;lt;T&amp;gt;, and how it was implemented. It was based on a neat generics trick exploiting the fact that the CLR does not erase types. 
Indexing a generic type X with a type T means that X&amp;lt;string&amp;gt; has static and thread-local fields distinct from those of, say, X&amp;lt;int&amp;gt;.&lt;/p&gt;&lt;p&gt;I've actually &lt;a href="http://higherlogics.blogspot.com/2010/12/sasa-v093-released.html"&gt;used this trick before to implement&lt;/a&gt; a type of safe reflection.&lt;/p&gt;&lt;p&gt;Jeroen's sample implementation exploiting this trick for thread-local instance variables seemed far too complicated, however, and it suffered from a number of limitations that he notes in his post. If Microsoft's implementation was even more complicated, as was claimed, I wasn't confident it would perform very well at all. I decided to do my own implementation, without conforming to ThreadLocal&amp;lt;T&amp;gt;'s interface, since I have very specific applications in mind.&lt;/p&gt;&lt;h1&gt;ThreadScoped&amp;lt;T&amp;gt;&lt;/h1&gt;&lt;p&gt;As I suspected, &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/fe7946510903/Sasa.TM/ThreadScoped.cs"&gt;my implementation was considerably simpler&lt;/a&gt;, coming in at fewer than 90 lines of code, and it doesn't suffer from the problems that Jeroen identifies, save for the fact that generated types are not reclaimed until the AppDomain exits. However, ThreadScoped instances are aggressively reused, so I don't anticipate this being a huge problem.&lt;/p&gt;&lt;p&gt;Conceptually, the implementation is quite simple. 
We have four views to keep in mind here:&lt;ol&gt;&lt;li&gt;Thread-global static data: ordinary static fields that are thus visible to all threads.&lt;/li&gt;&lt;li&gt;Thread-global instance data: ordinary object fields visible to all threads with access to that instance.&lt;/li&gt;&lt;li&gt;Thread-local static data: static fields marked with [ThreadStatic], giving a thread-specific view of that field.&lt;/li&gt;&lt;li&gt;Thread-local instance data: not provided natively by the CLR, but simulated by ThreadLocal&amp;lt;T&amp;gt; and ThreadScoped&amp;lt;T&amp;gt; via thread-local static data + generics.&lt;/li&gt;&lt;/ol&gt;Our thread-global static data consists of a lock-free linked list of unallocated ThreadScoped instances. Our thread-global instance data is a version number, and a 'next' pointer that points to the next available ThreadScoped&amp;lt;T&amp;gt;. This is used when freeing an instance by pushing it on the global list. All access to the global and instance data goes through &lt;a href="http://msdn.microsoft.com/en-us/library/system.threading.interlocked.compareexchange.aspx"&gt;Interlocked.CompareExchange&lt;/a&gt;, so all updates are lock-free.&lt;/p&gt;&lt;p&gt;Now comes the tricky part: the private type ThreadScoped&amp;lt;T&amp;gt;.Ref&amp;lt;TIndex&amp;gt; has two thread-static fields, one for the value T and one for a version number (more on this later):&lt;pre class="brush:csharp"&gt;sealed class Ref&amp;lt;TIndex&amp;gt; : ThreadScoped&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  [ThreadStatic]&lt;br /&gt;  static T scoped;     // the unique thread-local slot&lt;br /&gt;  [ThreadStatic]&lt;br /&gt;  static uint version; // the version number of the thread-local data&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;The type parameter TIndex is not actually used in a static or instance field; it's merely used here as a &lt;a href="http://www.google.ca/search?q=phantom+type"&gt;"phantom type"&lt;/a&gt;. 
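&lt;/p&gt;&lt;p&gt;To see the trick in isolation, here is a small sketch in which the type parameter serves only to select a distinct thread-static slot. Slot, First, Second, and Demo are hypothetical names used purely for illustration:&lt;pre class="brush:csharp"&gt;using System;&lt;br /&gt;&lt;br /&gt;// TIndex is never stored; each closed type gets its own thread-static field&lt;br /&gt;static class Slot&amp;lt;TIndex&amp;gt;&lt;br /&gt;{&lt;br /&gt;  [ThreadStatic]&lt;br /&gt;  public static int Value;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// empty marker types used only as indices&lt;br /&gt;sealed class First { }&lt;br /&gt;sealed class Second { }&lt;br /&gt;&lt;br /&gt;static class Demo&lt;br /&gt;{&lt;br /&gt;  public static bool SlotsAreDistinct()&lt;br /&gt;  {&lt;br /&gt;    Slot&amp;lt;First&amp;gt;.Value = 1;&lt;br /&gt;    Slot&amp;lt;Second&amp;gt;.Value = 2;&lt;br /&gt;    // the two writes above hit different fields&lt;br /&gt;    return Slot&amp;lt;First&amp;gt;.Value == 1 &amp;amp;&amp;amp; Slot&amp;lt;Second&amp;gt;.Value == 2;&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;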
Basically, if we keep substituting new types for TIndex, we'll keep forcing the CLR to generate new thread-local static fields for us that we can use to simulate thread-local instance fields!&lt;/p&gt;&lt;p&gt;This is done in the Ref&amp;lt;TIndex&amp;gt;.Allocate() method. The global instance list always contains at least one unallocated instance. Whenever we try to allocate a new ThreadScoped&amp;lt;T&amp;gt;, we check whether we're down to the last instance in the list. If so, this last instance will generate a Ref&amp;lt;Ref&amp;lt;TIndex&amp;gt;&amp;gt; and enqueue that on the end:&lt;pre class="brush:csharp"&gt;&lt;br /&gt;internal override ThreadScoped&amp;lt;T&amp;gt; Allocate()&lt;br /&gt;{&lt;br /&gt;    // if 'next' is null, we are at the end of the list of free refs,&lt;br /&gt;    // so allocate a new one and enqueue it, then return 'this'&lt;br /&gt;    var x = next;&lt;br /&gt;    if (x != null) return this;&lt;br /&gt;    x = Interlocked.CompareExchange(ref next, new Ref&amp;lt;Ref&amp;lt;TIndex&amp;gt;&amp;gt;(), null);&lt;br /&gt;    // atomic swap failure doesn't matter, since the caller of Acquire()&lt;br /&gt;    // accesses whatever instance is at this.next&lt;br /&gt;    return this;&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;ThreadScoped.Create then pops the front of the list to obtain the next-to-last instance, and we're back to only having one unallocated instance.&lt;/p&gt;&lt;p&gt;There's an important invariant here: there is always at least one item in the global list, and the last item on the global list is always the most deeply nested generic type that has been generated so far for ThreadScoped&amp;lt;T&amp;gt;. This means when we get to the last remaining unallocated instance, we can always safely generate a new instance without interfering with other threads.&lt;/p&gt;&lt;p&gt;The version numbers also bear some explanation. Basically, even if Thread0 disposes of a ThreadScoped instance, Thread1 may have data sitting in that instance. 
Suppose that the ThreadScoped instance is then pushed on the free list, and then allocated again. If Thread1 participates in that second computation, it will already find data sitting in its supposedly fresh thread-local instance from the last computation where it was supposed to have been disposed!&lt;/p&gt;&lt;p&gt;Obviously this is not what we want, but while Thread0 is disposing of the instance, it can't access Thread1's thread-local fields to clear them. This is the purpose of the instance and the thread-local version numbers. During dispose, we increment the instance's version number. If the thread-local version number and the instance version number don't match, it means the instance was disposed in the intervening time, so we should reinitialize it before proceeding:&lt;/p&gt;&lt;pre class="brush:csharp"&gt;public override T Value&lt;br /&gt;{&lt;br /&gt;    get { return current == version ? scoped : Update(); }&lt;br /&gt;    set { scoped = value; }&lt;br /&gt;}&lt;br /&gt;T Update()&lt;br /&gt;{&lt;br /&gt;    if (next != this) throw new ObjectDisposedException("Transacted&amp;lt;T&amp;gt; has been disposed.");&lt;br /&gt;    version = current;&lt;br /&gt;    return scoped = default(T);&lt;br /&gt;}&lt;/pre&gt;&lt;p&gt;And that's it! There are a few other minor details related to book keeping that aren't really important. I think that's pretty much as simple as you can get, and since this results in effectively a direct pointer to thread-local data, it should be quite fast. There are also no intrinsic allocation limits, as it will just keep allocating or reusing instances as needed.&lt;/p&gt;&lt;h1&gt;Benchmarks&lt;/h1&gt;&lt;p&gt;&lt;a href="https://higherlogics.sourcerepo.com/hg/higherlogics_benchmarks/rev/1e75a9fd032f"&gt;Here is the source for the benchmarks I used&lt;/a&gt;. The main loop runs 10,000,000 iterations of a simple calculation that performs one read and one write to the thread-local instance per iteration. 
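&lt;/p&gt;&lt;p&gt;The linked source is the authoritative version. As a rough sketch of the shape of such a loop, assuming a Stopwatch-based timer and using ThreadLocal&amp;lt;int&amp;gt; as the variable under test (the Bench and IterationsPerSecond names are hypothetical):&lt;pre class="brush:csharp"&gt;using System;&lt;br /&gt;using System.Diagnostics;&lt;br /&gt;using System.Threading;&lt;br /&gt;&lt;br /&gt;static class Bench&lt;br /&gt;{&lt;br /&gt;  // times a loop doing one thread-local read and one write per iteration&lt;br /&gt;  public static double IterationsPerSecond(int iterations)&lt;br /&gt;  {&lt;br /&gt;    using (var local = new ThreadLocal&amp;lt;int&amp;gt;())&lt;br /&gt;    {&lt;br /&gt;      var clock = Stopwatch.StartNew();&lt;br /&gt;      for (int i = 0; i &amp;lt; iterations; ++i)&lt;br /&gt;        local.Value = local.Value + 1;&lt;br /&gt;      clock.Stop();&lt;br /&gt;      return iterations / clock.Elapsed.TotalSeconds;&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;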
I ran this test 20 times in a loop, logging the number of iterations per second on each run. Then I used &lt;a href="http://higherlogics.blogspot.com/2010/05/peirces-criterion-command-line-tool.html"&gt;Peirce's Criterion&lt;/a&gt; to filter out the outliers, and used the resulting data set.&lt;/p&gt;&lt;p&gt;The numbers are iterations/second; benchmarks were run on .NET 4 and Windows 7. The results are fairly compelling:&lt;/p&gt;&lt;table border="0" cellspacing="0" cols="4" frame="VOID" rules="NONE"&gt;	&lt;colgroup&gt;&lt;col width="118"&gt;&lt;/col&gt;&lt;col width="118"&gt;&lt;/col&gt;&lt;col width="131"&gt;&lt;/col&gt;&lt;col width="104"&gt;&lt;/col&gt;&lt;/colgroup&gt;	&lt;tbody&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18" width="118"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="LEFT" width="118"&gt;&lt;b&gt;&lt;i&gt;&lt;u&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/u&gt;&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;			&lt;td align="LEFT" width="131"&gt;&lt;b&gt;&lt;i&gt;&lt;u&gt;ThreadScoped&amp;lt;T&amp;gt;&lt;/u&gt;&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;			&lt;td align="LEFT" width="104"&gt;&lt;b&gt;&lt;i&gt;&lt;u&gt;ThreadStatic&lt;/u&gt;&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5022000"&gt;5022000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16260000"&gt;16260000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="22396000"&gt;22396000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5042000"&gt;5042000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="14378000"&gt;14378000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="18484000"&gt;18484000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="4972000"&gt;4972000&lt;/td&gt;			&lt;td 
align="RIGHT" sdnum="4105;" sdval="16514000"&gt;16514000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="20790000"&gt;20790000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5002000"&gt;5002000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="15722000"&gt;15722000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="22470000"&gt;22470000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5070000"&gt;5070000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="14244000"&gt;14244000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="19860000"&gt;19860000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5098000"&gt;5098000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="13946000"&gt;13946000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16570000"&gt;16570000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5076000"&gt;5076000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16474000"&gt;16474000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="22246000"&gt;22246000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5102000"&gt;5102000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="13994000"&gt;13994000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="20532000"&gt;20532000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5012000"&gt;5012000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="13494000"&gt;13494000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" 
sdval="14482000"&gt;14482000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5108000"&gt;5108000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16090000"&gt;16090000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="21598000"&gt;21598000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5074000"&gt;5074000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16778000"&gt;16778000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="20000000"&gt;20000000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5104000"&gt;5104000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="15408000"&gt;15408000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="21620000"&gt;21620000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5076000"&gt;5076000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="12762000"&gt;12762000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16312000"&gt;16312000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5054000"&gt;5054000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16792000"&gt;16792000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="21008000"&gt;21008000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5064000"&gt;5064000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16380000"&gt;16380000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="21856000"&gt;21856000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br 
/&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5108000"&gt;5108000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16154000"&gt;16154000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="15910000"&gt;15910000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5066000"&gt;5066000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16750000"&gt;16750000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="22988000"&gt;22988000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5108000"&gt;5108000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="16076000"&gt;16076000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="18778000"&gt;18778000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5012000"&gt;5012000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="15986000"&gt;15986000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="23094000"&gt;23094000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="LEFT"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="LEFT"&gt;&lt;br /&gt;&lt;/td&gt;			&lt;td align="LEFT"&gt;&lt;br /&gt;&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;b&gt;&lt;i&gt;MIN&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="4972000"&gt;4972000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="12762000"&gt;12762000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="14482000"&gt;14482000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;b&gt;&lt;i&gt;MAX&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="5108000"&gt;5108000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" 
sdval="16792000"&gt;16792000&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;" sdval="23094000"&gt;23094000&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;b&gt;&lt;i&gt;AVG&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;0;0" sdval="5061578"&gt;5061579&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;0;0" sdval="15484315"&gt;15484316&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;0;0" sdval="20052315"&gt;20052316&lt;/td&gt;		&lt;/tr&gt;&lt;tr&gt;			&lt;td align="LEFT" height="18"&gt;&lt;b&gt;&lt;i&gt;% Overhead relative&lt;br /&gt;to ThreadStatic&lt;/i&gt;&lt;/b&gt;&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;0;0" sdval="296"&gt;296%&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;0;0" sdval="29"&gt;29%&lt;/td&gt;			&lt;td align="RIGHT" sdnum="4105;0;0" sdval="0"&gt;0%&lt;/td&gt;		&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p&gt;Microsoft's implementation takes a real beating in this simple benchmark, with almost 300% overhead over raw ThreadStatic fields. I'm not sure what could be going on in there to cause such a significant slowdown.&lt;/p&gt;&lt;p&gt;By contrast, ThreadScoped has a modest 30% overhead, which is far more reasonable. I suspect this overhead is due to two factors: 1. virtual dispatch overhead because ThreadScoped.Value is a virtual property, and 2. the encapsulated instance for Ref &amp;lt;TIndex&amp;gt; may require a bit of pointer-chasing to resolve the right thread-local static field. I can't think of a way to eliminate this overhead, so this is as good as it's going to get for now.&lt;/p&gt;&lt;p&gt;I will note that I've had a few programs where ThreadLocal&amp;lt;T&amp;gt; did not perform as badly as demonstrated above, so it may be that reads or writes are more penalized. 
However, no benchmark or program I tested had ThreadLocal outperforming ThreadScoped, so I can say with confidence that ThreadScoped is much faster than ThreadLocal.&lt;/p&gt;&lt;p&gt;Jeroen's post also implied that he had tested a ThreadLocal version that used arrays for the backing store, and that it was faster. I also implemented a version of ThreadScoped using arrays, but haven't tested it enough. It did seem quite a bit faster with some code I had been playing with, but there were many uncontrolled variables, so I can't reach any conclusion with any confidence. The array-backed version is committed to the Sasa repository for the adventurous, though. There are serious limitations to using arrays as a backing store, however: you're stuck with a fixed number of instances defined at compile-time, and allocating large enough arrays to avoid falling back on slow thread-local data wastes quite a bit of memory.&lt;/p&gt;&lt;p&gt;Still, I will run some benchmarks at some future date, so stay tuned! 
As for ThreadScoped&amp;lt;T&amp;gt;, it will probably be in the next &lt;a href="http://sourceforge.net/projects/sasa/"&gt;Sasa&lt;/a&gt; release coming in a month or two, or you can just grab it from the Mercurial repo.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-6568941780573448507?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/6568941780573448507/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=6568941780573448507' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6568941780573448507'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6568941780573448507'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/09/faster-thread-local-data-with.html' title='Faster Thread-Local Data with ThreadScoped&amp;lt;T&amp;gt;'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-1441982466138693174</id><published>2011-09-25T11:46:00.000-04:00</published><updated>2011-09-26T02:14:12.819-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='LINQ'/><category 
scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Idioms in C# with LINQ</title><content type='html'>&lt;p&gt;There's a &lt;a href="http://tomasp.net/blog/idioms-in-linq.aspx"&gt;great post&lt;/a&gt; on implementing &lt;a href="http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf"&gt;idioms&lt;/a&gt; with LINQ, and the example application was to implement formlets, as &lt;a href="http://www.websharper.com/"&gt;WebSharper&lt;/a&gt; does for F#. Tomas's post is well written, so if you're unclear on the above concepts I recommend reading it first before proceeding with this article.&lt;/p&gt;&lt;p&gt;The claim in that post is that idioms can only be encoded via LINQ's 'join' operators. While this is strictly true if you stick to all the LINQ rules, LINQ queries are just naive syntactic transforms, so you don't have to follow the rules. You can exploit this to hijack the signatures of the SelectMany overloads to yield idiom signatures. It's not all sunshine and roses though, as there are consequences.&lt;/p&gt;&lt;h1&gt;Overview&lt;/h1&gt;&lt;p&gt;&lt;a href="http://msdn.microsoft.com/library/bb397926.aspx"&gt;LINQ&lt;/a&gt; is a standard set of methods one can implement that the C# compiler can use to provide "query patterns". This query:&lt;pre class="brush:csharp"&gt;var foo = from x in SomeFoo&lt;br /&gt;          from y in x.Values&lt;br /&gt;          select y;&lt;/pre&gt;is translated by the C# compiler to:&lt;pre class="brush:csharp"&gt;var foo = SomeFoo.SelectMany(x =&gt; x.Values, (x, y) =&gt; y);&lt;/pre&gt;This is a purely syntactic transformation, meaning that the C# compiler simply takes the text from above, and naively translates each 'from', 'where', 'select', etc. into calls to instance or extension methods, SelectMany, Where, Select, etc. 
Type inference must then be able to infer the types used in your query, and everything must type check.&lt;/p&gt;&lt;p&gt;The fact that we're dealing with a purely syntactic transform means that we can be sneaky and alter the signatures of these LINQ functions and the C# compiler would be none the wiser. The resulting calls to the LINQ methods would still need to compile, but we can ensure that they only compile following the rules we want, in this case, the rules of idioms.&lt;/p&gt;&lt;h1&gt;LINQ Methods&lt;/h1&gt;&lt;p&gt;The core LINQ methods are as follows, using Formlet&amp;lt;T&amp;gt; as the LINQ type:&lt;pre class="brush: csharp"&gt;Formlet&amp;lt;R&amp;gt; Select&amp;lt;T, R&amp;gt;(this Formlet&amp;lt;T&amp;gt; f, Func&amp;lt;T, R&amp;gt; selector);&lt;br /&gt;Formlet&amp;lt;R&amp;gt; SelectMany&amp;lt;T, R&amp;gt;(this Formlet&amp;lt;T&amp;gt; f,&lt;br /&gt;                                 Func&amp;lt;T, Formlet&amp;lt;R&amp;gt;&amp;gt; collector);&lt;br /&gt;Formlet&amp;lt;R&amp;gt; SelectMany&amp;lt;T, U, R&amp;gt;(this Formlet&amp;lt;T&amp;gt; f,&lt;br /&gt;                                    Func&amp;lt;T, Formlet&amp;lt;U&amp;gt;&amp;gt; collector,&lt;br /&gt;                                    Func&amp;lt;T, U, R&amp;gt; selector);&lt;/pre&gt;The problematic methods for idioms are the two SelectMany overloads, specifically the parameter I've called 'collector'. You can see that the LINQ type is unwrapped and the value extracted on each SelectMany, and passed to the rest of the query. Accessing the previous values like this is forbidden in idioms.&lt;/p&gt;&lt;p&gt;Fortunately, the SelectMany overloads don't &lt;em&gt;have&lt;/em&gt; to have these exact signatures; they need only have a similar &lt;em&gt;structure&lt;/em&gt;. You must have two SelectMany overloads with one and two delegate parameters, and the first delegate parameter must return your LINQ type, in this case Formlet&amp;lt;T&amp;gt;, as this allows you to chain query clauses one after another. 
You can also modify the second delegate parameter in various ways, but I haven't found much use for that myself.&lt;/p&gt;&lt;p&gt;To implement idioms, we will simply alter the first delegate parameter so instead of unwrapping the value encapsulated by the Formlet&amp;lt;T&amp;gt;, we simply pass the Formlet&amp;lt;T&amp;gt; itself:&lt;pre class="brush: csharp"&gt;Formlet&amp;lt;R&amp;gt; Select&amp;lt;T, R&amp;gt;(this Formlet&amp;lt;T&amp;gt; f, Func&amp;lt;T, R&amp;gt; selector);&lt;br /&gt;Formlet&amp;lt;R&amp;gt; SelectMany&amp;lt;T, R&amp;gt;(this Formlet&amp;lt;T&amp;gt; f,&lt;br /&gt;                                 Func&amp;lt;Formlet&amp;lt;T&amp;gt;, Formlet&amp;lt;R&amp;gt;&amp;gt; collector);&lt;br /&gt;Formlet&amp;lt;R&amp;gt; SelectMany&amp;lt;T, U, R&amp;gt;(this Formlet&amp;lt;T&amp;gt; f,&lt;br /&gt;                                    Func&amp;lt;Formlet&amp;lt;T&amp;gt;, Formlet&amp;lt;U&amp;gt;&amp;gt; collector,&lt;br /&gt;                                    Func&amp;lt;T, U, R&amp;gt; selector);&lt;/pre&gt;Our query above:&lt;pre class="brush:csharp"&gt;var foo = from x in SomeFoo&lt;br /&gt;          from y in x.Values&lt;br /&gt;          select y;&lt;/pre&gt;would then no longer compile, because 'x' is now not a Foo, but is in fact a Formlet&amp;lt;Foo&amp;gt;, and the formlet type does not have a "Values" property. Of course, you shouldn't provide a property to extract the encapsulated value, or this is all for naught.&lt;/p&gt;&lt;h1&gt;The Downsides&lt;/h1&gt;&lt;p&gt;Simple queries work great, but longer queries may run into some problems if you alter the LINQ signatures. In this case, if you try to access previous values by mistake, as in our example query above, you will get a complicated error message:&lt;pre&gt;Error 1       Could not find an implementation of the query pattern for source type 'Formlet.Formlet&amp;lt;AnonymousType#1&amp;gt;'. 
'&amp;lt;&amp;gt;h__TransparentIdentifier0' not found.&lt;/pre&gt;Basically, your incorrect program was naively translated to use the LINQ methods, but because it does not properly match the type signatures you've hijacked, type inference fails. So you can't break your idioms by hijacking the query pattern this way, but depending on your target audience, perhaps you will render them unusable.&lt;/p&gt;&lt;p&gt;Still, it's a neat trick that should be in every type wizard's toolbox.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-1441982466138693174?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/1441982466138693174/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=1441982466138693174' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1441982466138693174'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1441982466138693174'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/09/idioms-in-c-with-linq.html' title='Idioms in C# with LINQ'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2814892420471569255</id><published>2011-09-17T13:34:00.000-04:00</published><updated>2011-09-26T01:34:53.154-04:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='reactive programming'/><title type='text'>IObservable&lt;T&gt; and Delegate Equality</title><content type='html'>Equality and identity are often intermingled in the .NET framework. .NET delegates correctly implement a straightforward &lt;a href="http://msdn.microsoft.com/en-us/library/aa664729.aspx"&gt;structural equality&lt;/a&gt;, where two delegates are equal if they designate the same method and the same object, regardless of whether each delegate is a unique instance:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;class Bar&lt;br /&gt;{&lt;br /&gt;    public void Foo(int i)&lt;br /&gt;    {&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;var bar = new Bar();&lt;br /&gt;Action&amp;lt;int&amp;gt; a1 = bar.Foo;&lt;br /&gt;Action&amp;lt;int&amp;gt; a2 = bar.Foo;&lt;br /&gt;Console.WriteLine("Delegate Equality: {0}", a1 == a2);&lt;br /&gt;// prints:&lt;br /&gt;// Delegate Equality: True&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;However, an IObserver&amp;lt;T&amp;gt; created from delegates does not respect structural equality, and reverts to an identity criterion:&lt;br /&gt;&lt;pre class="brush:csharp"&gt;// use a1 and a2 from previous code sample&lt;br /&gt;var o1 = Observer.Create&amp;lt;int&amp;gt;(a1);&lt;br /&gt;var o2 = Observer.Create&amp;lt;int&amp;gt;(a2);&lt;br /&gt;Console.WriteLine("Observable Equality: {0}", o1 == o2 || o1.Equals(o2)&lt;br /&gt;                                           || EqualityComparer&amp;lt;IObserver&amp;lt;int&amp;gt;&amp;gt;.Default.Equals(o1, o2));&lt;br /&gt;//prints:&lt;br /&gt;// Observable Equality: False&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This is unfortunate, because it means that by default we cannot implement previous event handling patterns using IObservable without significantly restructuring the code to propagate the IDisposable returned by &lt;a 
href="http://msdn.microsoft.com/en-us/library/dd782981.aspx"&gt;IObservable.Subscribe&lt;/a&gt;. By this, I mean that code which properly managed event registration and unregistration has no easy transition to using IObservable; it must be completely rewritten.&lt;br /&gt;&lt;br /&gt;Like event equality, the equality of IObservers created from delegates should be structural, not based on identity. Thus, manually managing subscriptions would be possible via an explicit "Unsubscribe" operation.&lt;br /&gt;&lt;br /&gt;This decision has a real consequence that I just hit: I can implement IObservable given an object implementing &lt;a href="http://msdn.microsoft.com/en-us/library/system.componentmodel.inotifypropertychanged.aspx"&gt;INotifyPropertyChanged&lt;/a&gt;, but could not do the reverse using the built-in abstractions. You'd either have to define your own IObservers that implement structural equality, or you'd have to store the actual event locally and trigger it manually when a new value is available, as I have done with &lt;a href="http://sasa.hg.sourceforge.net/hgweb/sasa/sasa/file/f09e625d8b0d/Sasa.Reactive/NamedProperty.cs"&gt;NamedProperty&amp;lt;T&amp;gt;&lt;/a&gt; in Sasa.&lt;br /&gt;&lt;br /&gt;On a slightly related note, I've switched Sasa over to use Mercurial on Sourceforge, and have closed the old subversion repo.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2814892420471569255?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2814892420471569255/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2814892420471569255' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2814892420471569255'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2814892420471569255'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/09/iobservable-and-delegate-equality.html' title='IObservable&amp;lt;T&amp;gt; and Delegate Equality'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8617093652128713522</id><published>2011-08-21T14:59:00.000-04:00</published><updated>2011-09-26T01:35:20.076-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Open Instance Delegate for Generic Interface Methods</title><content type='html'>Ran into a snag when developing the latest abstraction for Sasa. The problem is distilled to its simplest form in this &lt;a href="http://stackoverflow.com/questions/7136615/open-delegate-for-generic-interface-method"&gt;Stack Overflow&lt;/a&gt; question.&lt;br /&gt;&lt;br /&gt;In summary, while the CLR supports &lt;a href="http://peisker.net/dotnet/languages2005.htm#delegatetargets"&gt;open instance delegates to interface methods&lt;/a&gt;, and supports delegates to generic interface methods, it does not seem to support the composition of the two, ie. 
an open instance delegate to a generic interface method.&lt;br /&gt;&lt;br /&gt;The following trivial example fails when creating the delegate with a NotSupportedException:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;interface IFoo&lt;br /&gt;{&lt;br /&gt;  void Bar&amp;lt;T&amp;gt;(T j);&lt;br /&gt;}&lt;br /&gt;class Foo : IFoo&lt;br /&gt;{&lt;br /&gt;  public void Bar&amp;lt;T&amp;gt;(T j)&lt;br /&gt;  {&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;static void Main(string[] args)&lt;br /&gt;{&lt;br /&gt;  var bar = typeof(IFoo).GetMethod("Bar").MakeGenericMethod(typeof(int));&lt;br /&gt;  var x = Delegate.CreateDelegate(typeof(Action&amp;lt;IFoo, int&amp;gt;), null, bar);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;However, the CLR does support open instance delegates to generic &lt;em&gt;class&lt;/em&gt; methods. For some reason interfaces are singled out here, and it's not clear why.&lt;br /&gt;&lt;br /&gt;If I'm doing something wrong, please let me know!&lt;br /&gt;&lt;br /&gt;Edit: I &lt;a href="https://connect.microsoft.com/VisualStudio/feedback/details/685053/open-instance-delegate-to-generic-interface-method-does-not-work"&gt;submitted a bug to Microsoft Connect&lt;/a&gt;, since this now looks like a legit bug.&lt;br /&gt;&lt;br /&gt;Edit 2: &lt;a href="https://connect.microsoft.com/VisualStudio/feedback/details/685053/open-instance-delegate-to-generic-interface-method-does-not-work#tabs"&gt;MS acknowledged the limitation&lt;/a&gt;, but said it wouldn't be solved in the current version of .NET. 
As I explain in that bug, this limitation just doesn't seem to make sense, but it must be due to the internals that don't reuse the existing CLR dispatching logic.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8617093652128713522?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8617093652128713522/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8617093652128713522' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8617093652128713522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8617093652128713522'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2011/08/open-instance-delegate-for-generic.html' title='Open Instance Delegate for Generic Interface Methods'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7032554156942798185</id><published>2010-12-05T21:59:00.001-05:00</published><updated>2011-03-16T10:20:36.583-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Sasa v0.9.3 Released!</title><content type='html'>I recently realized that it's been over a year since I last put out a stable &lt;a 
href='http://sourceforge.net/projects/sasa/'&gt;Sasa release&lt;/a&gt;. Sasa is in production use in a number of applications, but the &lt;a href="http://sourceforge.net/projects/sasa/files/"&gt;stable releases on Sourceforge&lt;/a&gt; have lagged somewhat, and a number of fixes and enhancements have been added since v0.9.2.&lt;br /&gt;&lt;br /&gt;So I decided to simply exclude the experimental and broken abstractions and push out &lt;a href="http://sourceforge.net/projects/sasa/files/sasa/v0.9.3/Sasa-v0.9.3-docs.zip/download"&gt;a new release&lt;/a&gt; so others could benefit from everything &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/"&gt;Sasa v0.9.3&lt;/a&gt; has to offer.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://sourceforge.net/projects/sasa/files/sasa/v0.9.3/CHANGELOG.txt/download"&gt;changelog&lt;/a&gt; contains a full list of changes, too numerous to count. I'll list here a few of the highlights.&lt;br /&gt;&lt;h3&gt;IL Rewriter&lt;/h3&gt;&lt;br /&gt;C# unfortunately &lt;a href="http://stackoverflow.com/questions/191940/c-generics-wont-allow-delegate-type-constraints"&gt;forbids certain types&lt;/a&gt; from being used as generic type constraints, even though these constraints are available to CIL. For instance, the following is legal CIL but illegal in C#:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public void Foo&amp;lt;T&amp;gt;(T value)&lt;br /&gt; where T : Delegate&lt;br /&gt;{&lt;br /&gt; ...&lt;br /&gt;}&lt;/pre&gt;Sasa now provides a solution for this using its ilrewrite tool. 
To express the above, simply use Sasa.TypeConstraint&amp;lt;T&amp;gt; in the type constraint; the rewriter erases all references to TypeConstraint, leaving the desired constraint in place:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public void Foo&amp;lt;T&amp;gt;(T value)&lt;br /&gt; where T : TypeConstraint&amp;lt;Delegate&amp;gt;&lt;br /&gt;{&lt;br /&gt; ...&lt;br /&gt;}&lt;/pre&gt;The Sasa library itself &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/"&gt;makes pervasive use of these type constraints&lt;/a&gt; to provide generic versions of static functions on System.Enum, thread-safe delegate add/remove, null-safe delegate calls, and more.&lt;br /&gt;Simply call ilrewrite like so:&lt;br /&gt;&lt;pre&gt;ilrewrite /verify /dll:[your dll] /[Debug | Release]&lt;/pre&gt;&lt;br /&gt;The /verify option runs peverify to ensure the rewrite produced verifiable IL. Pass /Debug if you're rewriting a debug build, and /Release if you're rewriting a release build. I have it set up as a Visual Studio post-build event, so I call it with /$(ConfigurationName) for this parameter.&lt;br /&gt;&lt;h3&gt;Thread-safe and Null-Safe Event Handling&lt;/h3&gt;&lt;br /&gt;The CLR provides first-class functions in the form of delegates, but there are various problems that commonly crop up which &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/293.html"&gt;Sasa.Events&lt;/a&gt; is designed to fix:&lt;br /&gt;&lt;h4&gt;Invoking delegates is not null-safe&lt;/h4&gt;&lt;br /&gt;Before invoking a delegate, you must explicitly check whether the delegate is null. If the delegate is in a field instead of a local, you must first copy the field value to a local or you leave yourself open to a concurrency bug, where another thread may make the field null between the time you checked and the time you call it (this is commonly known as a TOCTTOU bug, ie. 
&lt;a href="http://en.wikipedia.org/wiki/Time-of-check-to-time-of-use"&gt;Time-Of-Check-To-Time-Of-Use&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;This involves laborious and tedious code duplication that the C# compiler could have easily generated for us. The Sasa.Events.Raise overloads solve both of the above problems. Instead of:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;var dlg = someDelegate;&lt;br /&gt;if (dlg != null) dlg(x, y, z);&lt;/pre&gt;&lt;br /&gt;you can simply call:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;someDelegate.Raise(x, y, z);&lt;/pre&gt;This is null-safe, and thread-safe.&lt;br /&gt;&lt;h4&gt;Event add/remove is not thread-safe&lt;/h4&gt;&lt;br /&gt;Events are a pretty useful idiom common to .NET programs, but concurrency adds a number of hazards for which the C# designers provided less than satisfactory solutions.&lt;br /&gt;&lt;br /&gt;For instance, declaring a publicly accessible event creates a hidden "lock object" that the add/remove handlers first lock before modifying the event property. This is not only wasteful in memory, it's also expensive in highly concurrent scenarios. Furthermore, &lt;a href="http://blogs.msdn.com/b/cburrows/archive/2010/03/08/events-get-a-little-overhaul-in-c-4-part-ii-semantic-changes-and.aspx"&gt;this auto-locking behaviour&lt;/a&gt; is completely different for code residing inside the class as compared to code outside the class. Needless to say, this unnecessarily subtle semantics was constantly surprising C# developers.&lt;br /&gt;&lt;br /&gt;Enter Sasa.Events.Add/Remove. These overloads accept a ref to a delegate field, and perform an atomic compare and exchange on the field directly, eliminating the need for lock objects, and providing more scalable event registration/unregistration. 
Code that looked like this:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;event Action fooEvent;&lt;br /&gt;...&lt;br /&gt;fooEvent += newHandler;&lt;/pre&gt;or like this:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;event Action fooEvent;&lt;br /&gt;...&lt;br /&gt;lock (this) fooEvent += newHandler;&lt;/pre&gt;can now both be replaced by this:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;Action fooEvent;&lt;br /&gt;...&lt;br /&gt;Events.Add(ref fooEvent, newHandler);&lt;/pre&gt;This code has less overhead than the standard event registration code currently generated by any C# compiler, in both concurrent and non-concurrent settings.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Safe, Statically-Typed and Blazingly-Fast Reflection&lt;/h3&gt;&lt;br /&gt;Reflection is incredibly useful, and incredibly dangerous. You are forced to work with your objects as untyped data which makes it difficult to write correct programs, and the compiler can't help you.&lt;br /&gt;&lt;br /&gt;Most operations using reflection are functions operating over the structure of types. To make reflection safe, we only need a single reflective function that breaks apart an object into a stream of field values. The client then provides a stream processing function (the reflection function) that handles all the type cases that it might encounter.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/5.html"&gt;Sasa.Dynamics&lt;/a&gt; is here to help. 
&lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/2/365.html"&gt;Type&amp;lt;T&amp;gt;.Reflect&lt;/a&gt; is a static reflection function for type T which breaks up instances of type T into its fields (use &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/2/363.html"&gt;DynamicType.Reflect&lt;/a&gt; if you're not sure of the concrete type).&lt;br /&gt;&lt;br /&gt;The client need only provide an implementation of &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/2/368.html"&gt;IReflector&lt;/a&gt;, which defines a callback-style interface completely describing the CLR's primitive types and providing you with an efficient ref pointer to the field's value for get/set purposes, and a FieldInfo instance providing access to the field's metadata:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public interface IReflector&lt;br /&gt;{&lt;br /&gt;    void Bool(ref bool field, FieldInfo info);&lt;br /&gt;    void Int16(ref short field, FieldInfo info);&lt;br /&gt;    ...&lt;br /&gt;    void Object&amp;lt;T&amp;gt;(ref T field, FieldInfo info);&lt;br /&gt;}&lt;/pre&gt;The compiler ensures that you handle every case in IReflector. You handle non-primitive objects in IReflector.Object&amp;lt;T&amp;gt;, by recursively calling DynamicType.Reflect(field, this, fieldInfo).&lt;br /&gt;&lt;br /&gt;Type&amp;lt;T&amp;gt; and DynamicType use lightweight code generation to implement a super-fast dispatch stub that invokes IReflector on each field of the object. These stubs are cached, so over time the overhead of reflection is near-zero. 
Contrast this with typical reflection overheads: not only is this technique safer, it's significantly faster as well.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Extensible, Statically-Typed Turing-Complete Parsing&lt;/h3&gt;&lt;br /&gt;I covered the implementation of the &lt;a href="http://higherlogics.blogspot.com/2009/11/extensible-statically-typed-pratt.html"&gt;Pratt parser&lt;/a&gt; before, and the interface has changed only a little since then. Pratt parsing is intrinsically Turing complete, so you can parse literally any grammar. The predefined combinators are for context-free grammars, but you can easily inject custom parsing functions.&lt;br /&gt;&lt;br /&gt;What's more, each grammar you define is extensible in that you can inherit from and extend it in the way you would any other class. Here is a grammar from the unit tests for a simple calculator:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;abstract class MathSemantics&amp;lt;T&amp;gt; : Grammar&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;   public MathSemantics()&lt;br /&gt;   {&lt;br /&gt;       Infix("+", 10, Add);   Infix("-", 10, Sub);&lt;br /&gt;       Infix("*", 20, Mul);   Infix("/", 20, Div);&lt;br /&gt;       InfixR("^", 30, Pow);  Postfix("!", 30, Fact);&lt;br /&gt;       Prefix("-", 100, Neg); Prefix("+", 100, Pos);&lt;br /&gt;&lt;br /&gt;       Group("(", ")", int.MaxValue);&lt;br /&gt;       Match("(digit)", char.IsDigit, 1, Int);&lt;br /&gt;       SkipWhile(char.IsWhiteSpace);&lt;br /&gt;   }&lt;br /&gt;&lt;br /&gt;   protected abstract T Int(string lit);&lt;br /&gt;   protected abstract T Add(T lhs, T rhs);&lt;br /&gt;   protected abstract T Sub(T lhs, T rhs);&lt;br /&gt;   protected abstract T Mul(T lhs, T rhs);&lt;br /&gt;   protected abstract T Div(T lhs, T rhs);&lt;br /&gt;   protected abstract T Pow(T lhs, T rhs);&lt;br /&gt;   protected abstract T Neg(T arg);&lt;br /&gt;   protected abstract T Pos(T arg);&lt;br /&gt;   protected abstract T Fact(T arg);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The 
unit tests then contain an implementation of the grammar which is an interpreter:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;sealed class MathInterpreter : MathSemantics&amp;lt;int&amp;gt;&lt;br /&gt;{&lt;br /&gt;   protected override int Int(string lit) { return int.Parse(lit); }&lt;br /&gt;   protected override int Add(int lhs, int rhs) { return lhs + rhs; }&lt;br /&gt;   protected override int Sub(int lhs, int rhs) { return lhs - rhs; }&lt;br /&gt;   protected override int Mul(int lhs, int rhs) { return lhs * rhs; }&lt;br /&gt;   protected override int Div(int lhs, int rhs) { return lhs / rhs; }&lt;br /&gt;   protected override int Pow(int lhs, int rhs) { return (int)Math.Pow(lhs, rhs); }&lt;br /&gt;   protected override int Neg(int arg) { return -arg; }&lt;br /&gt;   protected override int Pos(int arg) { return arg; }&lt;br /&gt;   protected override int Fact(int arg)&lt;br /&gt;   {&lt;br /&gt;       return arg == 0 || arg == 1 ? 1 : arg * Fact(arg - 1);&lt;br /&gt;   }&lt;br /&gt;}&lt;/pre&gt;Instead of interpreting directly, you could just as easily have created a parse tree.&lt;br /&gt;&lt;br /&gt;The tests also contain an extended grammar that inherits from MathSemantics and &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/tags/v0.9.3/Build/Tests/ParsingTests.cs?revision=570&amp;amp;view=markup"&gt;adds lexically-scoped variables&lt;/a&gt; (see EquationParser at the link):&lt;br /&gt;&lt;pre class="brush: csharp"&gt;&lt;br /&gt;sealed class EquationParser : MathSemantics&amp;lt;Exp&amp;gt;&lt;br /&gt;{&lt;br /&gt;   public EquationParser()&lt;br /&gt;   {&lt;br /&gt;       Match("(ident)", char.IsLetter, 0, name =&gt; new Var { Name = name });&lt;br /&gt;       TernaryPrefix("let", "=", "in", 90, Let);&lt;br /&gt;   }&lt;br /&gt;...&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;h3&gt;MIME Parsing&lt;/h3&gt;&lt;br /&gt;Sasa contains a simple stand-alone assembly &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/7.html"&gt;devoted to MIME 
types, file extensions&lt;/a&gt;, and functions mapping between the two (Sasa.Mime).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/8.html"&gt;The Sasa.Net.Mail namespace&lt;/a&gt; in the Sasa.Net assembly contains functions for parsing instances of System.Net.Mail.MailMessage from strings, including attachments in every encoding I've come across in the past few years. This code has been in production use in an autonomous e-mail processing program which has processed tens of thousands of e-mails over many years, with very few bugs encountered.&lt;br /&gt;&lt;br /&gt;It can also format MailMessage instances into string form suitable for transmission over text-based Internet protocols.&lt;br /&gt;&lt;h3&gt;Miscellaneous&lt;/h3&gt;&lt;br /&gt;The library is also &lt;a href="http://higherlogics.blogspot.com/2010/11/factoring-out-common-patterns-in.html"&gt;better factored&lt;/a&gt; than before, and has numerous handy &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/212.html"&gt;extensions to IEnumerable/LINQ&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/2/260.html"&gt;strings&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/208.html"&gt;numbers&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/239.html"&gt;tuples&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/223.html"&gt;endian encoding&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/214.html"&gt;url-safe binary encoding&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/209.html"&gt;some purely functional collections&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/93.html"&gt;easy file system manipulation&lt;/a&gt; (first described &lt;a href="http://higherlogics.blogspot.com/2009/11/easy-file-system-path-manipulation-in-c.html"&gt;here&lt;/a&gt;), &lt;a 
href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/96.html"&gt;Stream extensions&lt;/a&gt; (including stream-to-stream copying), &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/92.html"&gt;endian-aware binary encoding&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/91.html"&gt;non-blocking futures&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/215.html"&gt;atomic exchange extensions&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/230.html"&gt;non-nullable types&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/228.html"&gt;lazy types&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/247.html"&gt;statically typed weak refs&lt;/a&gt;, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/1/216.html"&gt;code generation utilities&lt;/a&gt; (first described &lt;a href="http://higherlogics.blogspot.com/2010/05/clr-verification-for-runtime-code.html"&gt;here&lt;/a&gt;), &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/Contents/3/396.html"&gt;statistical functions&lt;/a&gt;, and much more.&lt;br /&gt;&lt;br /&gt;The full docs for this release are &lt;a href="http://higherlogics.net/sasa/docs-v0.9.3/"&gt;available online&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Deprecated&lt;/h3&gt;&lt;br /&gt;Unfortunately, the efficient, compact binary serializers from the last release have been deprecated, and the replacements based on Sasa.Dynamics are not yet ready. The ASP.NET page class that is immune to CSRF and clickjacking attacks &lt;a href="http://higherlogics.blogspot.com/2009/05/sasa-v09-released.html"&gt;first released in v0.9&lt;/a&gt; has been removed for now as well, since it depended on the compact binary serializer.&lt;br /&gt;&lt;br /&gt;I have plenty of new developments in the pipeline too. 
2010 saw many interesting safety enhancements added to Sasa, as outlined above, and 2011 will be an even more exciting year, I assure you!&lt;br /&gt;&lt;br /&gt;Edit: the original download on SourceForge was missing the ilrewrite tool. That oversight has now been addressed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7032554156942798185?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7032554156942798185/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7032554156942798185' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7032554156942798185'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7032554156942798185'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/12/sasa-v093-released.html' title='Sasa v0.9.3 Released!'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7296356028498602925</id><published>2010-11-26T21:43:00.000-05:00</published><updated>2011-09-26T02:05:45.764-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><category 
scheme='http://www.blogger.com/atom/ns#' term='reactive programming'/><title type='text'>Abstracting Computation: Arrows in C#</title><content type='html'>I &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Arrow/Arrow.cs?revision=558&amp;view=markup"&gt;just committed&lt;/a&gt; an implementation of &lt;a href="http://www.haskell.org/arrows/"&gt;Arrows&lt;/a&gt; to my open source &lt;a href="http://sf.net/projects/sasa"&gt;Sasa library&lt;/a&gt;. It's in its own dll and namespace, Sasa.Arrow, so it doesn't pollute the other production quality code.&lt;br /&gt;&lt;br /&gt;The implementation is pretty straightforward, and it also supports C#'s monadic query pattern, also known as LINQ. It basically boils down to implementing combinators on delegates, like Func&amp;lt;T, R&amp;gt;. It's not possible to implement the query pattern as extension methods for Func&amp;lt;T, R&amp;gt; because type inference fails for even the simplest of cases. So instead I wrapped Func&amp;lt;T, R&amp;gt; in a struct Arrow&amp;lt;T, R&amp;gt;, and implemented the query pattern as instance methods instead of extension methods. This removes a number of explicit type parameters that the inference engine struggles with, and type inference now succeeds.&lt;br /&gt;&lt;br /&gt;Of course, type inference still fails when calling Arrow.Return() on a static method, but this is a common and annoying failure of C#'s type inference [1].&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;What is this madness?&lt;/h2&gt;&lt;br /&gt;Some might question my sanity at this point, since arrows in C# are bound to be rather cumbersome. I have a specific application in mind however, and experimenting with that led naturally to arrows. Basically, while trying to use Rx.NET in a highly configurable and dynamic user interface library, I became dissatisfied with the state management required.&lt;br /&gt;&lt;br /&gt;In short, Rx.NET supports first-class signals, which does not play well with garbage collection. 
They solve this by reifying subscriptions in IDisposable objects that ensure proper cleanup if a signal is no longer required. So every time-varying value in my UI controls now requires me to keep track of two objects, the signal itself, and the IDisposable subscription to prevent it from being garbage collected.&lt;br /&gt;&lt;br /&gt;Now consider all the elements of a text box or data grid that may be changing over time, including the text font, the margins, the position, the background, and so on, and you quickly see the state management problem grow.&lt;br /&gt;&lt;br /&gt;Arrows can simplify this situation considerably, because instead of programming directly with signals, the user instead programs with &lt;em&gt;signal functions&lt;/em&gt;. Since signals are no longer first-class values, there is no garbage collection problem and no need to juggle subscriptions.&lt;br /&gt;&lt;br /&gt;There are further advantages in sharing, particularly for this UI library, so there's a great deal of incentive to use arrows. 
I'm hoping I can hide the use of arrows behind the user interface abstractions so the user has minimal interaction with it.&lt;br /&gt;&lt;br /&gt;[1] Consider:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;static int Id(int t)&lt;br /&gt;{&lt;br /&gt;  return t;&lt;br /&gt;}&lt;br /&gt;static Func&amp;lt;T, R&amp;gt; Return&amp;lt;T, R&amp;gt;(Func&amp;lt;T, R&amp;gt; f)&lt;br /&gt;{&lt;br /&gt;  return f;&lt;br /&gt;}&lt;/pre&gt;If you call Return(Id), type inference will fail.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7296356028498602925?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7296356028498602925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7296356028498602925' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7296356028498602925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7296356028498602925'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/11/arrows-in-c.html' title='Abstracting Computation: Arrows in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-6884979879967308325</id><published>2010-11-06T10:36:00.000-04:00</published><updated>2010-11-07T09:47:55.363-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Factoring Out Common Patterns in Libraries</title><content type='html'>A well designed core library is essential for building concise, maintainable programs in any programming language. There are many common, recurrent patterns when writing code, and ideally, these recurring uses should be factored into their own abstractions that are distributed as widely as possible throughout the core library.&lt;br /&gt;&lt;br /&gt;Consider for instance, an "encapsulated value" pattern:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// A read-only reference to a value.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of the encapsulated value.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public interface IValue&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt; /// &amp;lt;summary&amp;gt;&lt;br /&gt; /// A read-only reference to a value.&lt;br /&gt; /// &amp;lt;/summary&amp;gt;&lt;br /&gt; T Value { get; }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;This shows up everywhere, like &lt;a href="http://msdn.microsoft.com/en-us/library/b3h38hb0.aspx"&gt;Nullable&amp;lt;T&amp;gt;&lt;/a&gt;, &lt;a href="http://msdn.microsoft.com/en-us/library/dd642331.aspx"&gt;Lazy&amp;lt;T&amp;gt;&lt;/a&gt;, and &lt;a href="http://msdn.microsoft.com/en-us/library/dd321424.aspx"&gt;Task&amp;lt;T&amp;gt;&lt;/a&gt; (property called "Result"), &lt;a href="http://msdn.microsoft.com/en-us/library/dd321424.aspx"&gt;IEnumerator&amp;lt;T&amp;gt;&lt;/a&gt; (property called "Current"), and many many more.&lt;br /&gt;&lt;br /&gt;However, the common interface of a value encapsulated in an object has not been factored out into a common interface in the .NET Base Class Library (BCL). 
This means one cannot write programs that are agnostic over the type of a value's container, resulting in unnecessary code duplication.&lt;br /&gt;&lt;br /&gt;A legitimate argument against this approach is that the containers each have different semantics. For instance, accessing Lazy.Value will block until the value becomes available, but Nullable.Value always returns immediately.&lt;br /&gt;&lt;br /&gt;Fortunately, this is not an argument against factoring out the "encapsulated value" pattern, but an argument for another interface that exposes these semantics. In this case, the new pattern is an "optional value":&lt;br /&gt;&lt;pre class="brush:csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// A volatile value.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of value held in the reference.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public interface IVolatile&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt; /// &amp;lt;summary&amp;gt;&lt;br /&gt; /// Attempt to extract the value.&lt;br /&gt; /// &amp;lt;/summary&amp;gt;&lt;br /&gt; /// &amp;lt;param name="value"&amp;gt;The value contained in the reference.&amp;lt;/param&amp;gt;&lt;br /&gt; /// &amp;lt;returns&amp;gt;True if the value was successfully retrieved, false otherwise.&amp;lt;/returns&amp;gt;&lt;br /&gt; bool TryGetValue(out T value);&lt;br /&gt;}&lt;br /&gt;public interface IOptional&amp;lt;T&amp;gt; : IValue&amp;lt;T&amp;gt;, IVolatile&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt; /// &amp;lt;summary&amp;gt;&lt;br /&gt; /// Returns true if a value is available.&lt;br /&gt; /// &amp;lt;/summary&amp;gt;&lt;br /&gt; bool HasValue { get; }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Lazy, Nullable and Task all exhibit these exact semantics. 
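&lt;br /&gt;&lt;br /&gt;As a rough sketch of what these interfaces allow, the following assumes the interface declarations above and adds two hypothetical names of its own for illustration, a Boxed&amp;lt;T&amp;gt; container and a ValueOr extension method (neither is taken from Sasa). The helper is written once against IVolatile&amp;lt;T&amp;gt; and works with any container that implements it:&lt;br /&gt;&lt;pre class="brush:csharp"&gt;using System;&lt;br /&gt;&lt;br /&gt;// Interfaces repeated from above so the sketch compiles on its own.&lt;br /&gt;public interface IValue&amp;lt;T&amp;gt; { T Value { get; } }&lt;br /&gt;public interface IVolatile&amp;lt;T&amp;gt; { bool TryGetValue(out T value); }&lt;br /&gt;public interface IOptional&amp;lt;T&amp;gt; : IValue&amp;lt;T&amp;gt;, IVolatile&amp;lt;T&amp;gt; { bool HasValue { get; } }&lt;br /&gt;&lt;br /&gt;// Hypothetical container, for illustration only (not a Sasa type).&lt;br /&gt;public sealed class Boxed&amp;lt;T&amp;gt; : IOptional&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  readonly T value;&lt;br /&gt;  public Boxed() { HasValue = false; }&lt;br /&gt;  public Boxed(T value) { this.value = value; HasValue = true; }&lt;br /&gt;  public bool HasValue { get; private set; }&lt;br /&gt;  public T Value&lt;br /&gt;  {&lt;br /&gt;    get&lt;br /&gt;    {&lt;br /&gt;      if (!HasValue) throw new InvalidOperationException("No value.");&lt;br /&gt;      return value;&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;  public bool TryGetValue(out T result)&lt;br /&gt;  {&lt;br /&gt;    result = HasValue ? value : default(T);&lt;br /&gt;    return HasValue;&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static class Optional&lt;br /&gt;{&lt;br /&gt;  // Hypothetical helper: returns the contained value, or the fallback&lt;br /&gt;  // when the container reports that no value is available.&lt;br /&gt;  public static T ValueOr&amp;lt;T&amp;gt;(this IVolatile&amp;lt;T&amp;gt; source, T fallback)&lt;br /&gt;  {&lt;br /&gt;    T value;&lt;br /&gt;    return source.TryGetValue(out value) ? value : fallback;&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static class Program&lt;br /&gt;{&lt;br /&gt;  public static void Main()&lt;br /&gt;  {&lt;br /&gt;    Console.WriteLine(new Boxed&amp;lt;int&amp;gt;(42).ValueOr(0)); // container holding a value&lt;br /&gt;    Console.WriteLine(new Boxed&amp;lt;int&amp;gt;().ValueOr(-1));  // empty container&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;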
Programs can then be written that are agnostic over how optional values are encapsulated and processed, and the common interfaces ensure the different behaviours are overloaded in a consistent, familiar way.&lt;br /&gt;&lt;br /&gt;We can extend this even further to "mutable encapsulated values, aka, references":&lt;br /&gt;&lt;pre class="brush:csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// A mutable reference.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of value the reference contains.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public interface IRef&amp;lt;T&amp;gt; : IValue&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt; /// &amp;lt;summary&amp;gt;&lt;br /&gt; /// The value in the reference.&lt;br /&gt; /// &amp;lt;/summary&amp;gt;&lt;br /&gt; new T Value { get; set; }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;This pattern is less common, but still quite prevalent. For instance, see &lt;a href="http://msdn.microsoft.com/en-us/library/dd642243.aspx"&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/a&gt; (which could also implement IOptional and IVolatile incidentally).&lt;br /&gt;&lt;br /&gt;These interfaces have been in the &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa/IRef.cs?revision=553&amp;amp;view=markup"&gt;Sasa&lt;/a&gt; library for quite some time, and are used consistently throughout the entire library. The consistency has helped considerably in guiding the design of new abstractions, and clarifying their use, since developers can simply understand any new abstraction in terms of the familiar interfaces it implements.&lt;br /&gt;&lt;br /&gt;I suppose the lesson to take from all this is to hunt down common patterns, and aggressively factor them out into reusable abstractions. 
This helps the library's consistency, thus helping clients learn your API by reducing the number of unnecessary new properties and methods.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-6884979879967308325?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/6884979879967308325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=6884979879967308325' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6884979879967308325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6884979879967308325'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/11/factoring-out-common-patterns-in.html' title='Factoring Out Common Patterns in Libraries'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-1493894630418434287</id><published>2010-09-23T21:06:00.000-04:00</published><updated>2011-09-26T01:40:13.296-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='C'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><title type='text'>High-level constructs for low-level C: exception handling, RAII, sum types and pattern matching</title><content type='html'>There are numerous high-level abstractions available in other languages 
that simply make programming easier and less error-prone. For instance, automatic memory management, pattern matching, exceptions, higher order functions, and so on. Each of these features enables the developer to reason about program behaviour at a higher level, and factor out common behaviour into separate but composable units.&lt;br /&gt;&lt;br /&gt;For fun, I've created a few small macro headers that enable some of these patterns in pure C. If anyone sees any portability issues, please let me know!&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;libex: Exception Handling and RAII&lt;/h2&gt;&lt;br /&gt;RAII in C is definitely possible via a &lt;a href="http://vilimpoc.org/research/raii-in-c/"&gt;well-known pattern used everywhere in the Linux kernel&lt;/a&gt;. It's a great way to organize code, but the program logic, finalization logic and error logic are not syntactically apparent. You have to interpret the surrounding context to identify the error conditions, and when and how finalization is triggered.&lt;br /&gt;&lt;br /&gt;To address this, I encapsulated this RAII pattern in a macro library called &lt;a href="http://code.google.com/p/libex/"&gt;libex&lt;/a&gt;, with extensions to support arbitrary local exceptions, and a small set of pre-defined exception types. Currently, the pre-defined set just consists of more readable versions of the error codes in errno.h.&lt;br /&gt;&lt;br /&gt;No setjmp/longjmp is used, and libex provides little beyond case checking and finalization, because I wanted to provide zero-overhead exception handling and RAII that can supplant all uses of the undecorated pattern. 
Replacing all instances of the RAII pattern in Linux with these macro calls would incur little to no additional overhead, as it compiles down to a small number of direct branches.&lt;br /&gt;&lt;br /&gt;There are also some convenience macros for performing common checks, like MAYBE which checks for NULL, ERROR which checks for a non-zero value, etc.&lt;br /&gt;&lt;br /&gt;Example:&lt;br /&gt;&lt;pre class="brush: c"&gt;exc_type test(int i) {&lt;br /&gt;  THROWS(EUnrecoverable)&lt;br /&gt;  TRY(char *foo) {&lt;br /&gt;    MAYBE(foo = (char*)malloc(123), errno);&lt;br /&gt;  } IN {&lt;br /&gt;    // ... no exception was raised, so compute something with foo&lt;br /&gt;    // if EUnrecoverable thrown, it will propagate to caller&lt;br /&gt;    if (some_condition()) THROW(EUnrecoverable)&lt;br /&gt;  } HANDLE CATCH (EOutOfMemory) {&lt;br /&gt;    // ... handle error for foo&lt;br /&gt;  } CATCHANY {&lt;br /&gt;    // ... other errors?&lt;br /&gt;  } FINALLY {&lt;br /&gt;   // ... finalize any state that has already been allocated&lt;br /&gt;  }&lt;br /&gt;  DONE&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;There are a few small restrictions required for the full exception semantics to work, so please see the main page of &lt;a href="http://code.google.com/p/libex/"&gt;libex&lt;/a&gt; for further details.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;libsum: Pattern matching and sum types, aka disjoint/tagged/discriminated unions, aka variants&lt;/h2&gt;&lt;br /&gt;Functional languages have long enjoyed the succinct and natural construction and deconstruction of data structures via sum types and pattern matching. 
Now you can have some of that power in C via a few simple macros:&lt;br /&gt;&lt;pre class="brush: c"&gt;/* declare a sum type and its constructor tags */&lt;br /&gt;SUM(foo) {&lt;br /&gt;  foo_one,&lt;br /&gt;  foo_two,&lt;br /&gt;};&lt;br /&gt;/* declare each sum case */&lt;br /&gt;CASE(foo, foo_one) { int i; char c; };&lt;br /&gt;CASE(foo, foo_two) { double d; };&lt;br /&gt;&lt;br /&gt;void do_bar(foo f) {&lt;br /&gt;  MATCH(f) {&lt;br /&gt;    AS(foo_one, y) printf("foo_one: %d, %c\n", y-&gt;i, y-&gt;c);&lt;br /&gt;    AS(foo_two, y) printf("foo_two: %f\n", y-&gt;d);&lt;br /&gt;    MATCHANY &lt;br /&gt;      fprintf(stderr, "No such case!");&lt;br /&gt;      exit(1);&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;int main(int argc, char** argv) {&lt;br /&gt;  foo f;&lt;br /&gt;  LET(f, foo_one, (3, 'g')); /* (3,'g') is an initializer */&lt;br /&gt;  do_bar(f);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;There are a few small requirements and caveats, e.g. LET performs dynamic memory allocation. Please see the main &lt;a href="http://code.google.com/p/libsum/"&gt;libsum page&lt;/a&gt; for further details.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;License&lt;/h2&gt;&lt;br /&gt;My default license is LGPL, but since these are macro libraries that's probably not an appropriate choice, given there is no binary that can be replaced at runtime (one of the requirements of the LGPL). I like the freedoms afforded by the LGPL though, so I'm open to alternate suggestions with similar terms. 
I will also consider the MIT license if there are no viable alternatives.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-1493894630418434287?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/1493894630418434287/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=1493894630418434287' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1493894630418434287'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1493894630418434287'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/09/high-level-constructs-for-low-level-c.html' title='High-level constructs for low-level C: exception handling, RAII, sum types and pattern matching'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-4914203843042917932</id><published>2010-05-31T12:04:00.000-04:00</published><updated>2010-11-01T20:39:59.157-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='code generation'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>CLR: Verification for Runtime Code Generation</title><content type='html'>The CLR's lightweight code generation via &lt;a 
href="http://msdn.microsoft.com/en-us/library/system.reflection.emit.dynamicmethod.aspx"&gt;DynamicMethod&lt;/a&gt; is pretty useful, but it's sometimes difficult to debug the generated code and ensure that it verifies. In order to verify generated code, you must save the dynamic assembly to disk and run the peverify.exe tool on it, but DynamicMethod does not have any means to do so. In order to save the assembly, there's a more laborious process of creating dynamic assemblies, modules and types, and then finally adding a method to said type.&lt;br /&gt;&lt;br /&gt;This is further complicated by the fact that a &lt;a href="http://msdn.microsoft.com/en-us/library/system.reflection.emit.methodbuilder.aspx"&gt;MethodBuilder&lt;/a&gt; and DynamicMethod don't share any common interfaces or base types for generating IL, despite both of them &lt;a href="http://msdn.microsoft.com/en-us/library/7f4e970s.aspx"&gt;supporting a GetILGenerator() method&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;This difficulty in switching between saved codegen and pure runtime codegen led me to add a &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa/CodeGen.cs?view=markup"&gt;CodeGen&lt;/a&gt; class to Sasa, which can generate code for either case based on a bool parameter. 
Since no common interface is available for code generation, it also accepts a delegate to which it dispatches for generating the code:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Create a dynamic method.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of the dynamic method to create.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;/// &amp;lt;param name="type"&amp;gt;The type to which this delegate should be a member.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="methodName"&amp;gt;The name of the delegate's method.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="attributes"&amp;gt;The method attributes.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="saveAssembly"&amp;gt;Flag indicating whether the generated code should be saved to a dll.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;param name="generate"&amp;gt;A callback that performs the code generation.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;returns&amp;gt;A dynamically created instance of the given delegate type.&amp;lt;/returns&amp;gt;&lt;br /&gt;public static T Function&amp;lt;T&amp;gt;(&lt;br /&gt;  Type type,&lt;br /&gt;  string methodName,&lt;br /&gt;  MethodAttributes attributes,&lt;br /&gt;  bool saveAssembly,&lt;br /&gt;  Action&amp;lt;ILGenerator&amp;gt; generate)&lt;br /&gt;         where T : TypeConstraint&amp;lt;Delegate&amp;gt;;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;You can also see Sasa's ilrewrite tool at work here with the T : TypeConstraint&amp;lt;Delegate&amp;gt;. This function will generate either a DynamicMethod or a dynamic assembly and save that assembly to disk, based on the 'saveAssembly' parameter. The assembly name is generated based on the type and methodName parameters.&lt;br /&gt;&lt;br /&gt;In debugging the Sasa.Dynamics reflection code, I also came across a strange error which was not adequately explained anywhere that I could find. 
peverify.exe spit out an error to the effect of:&lt;br /&gt;&lt;pre&gt;[X.dll : Y/Z][offset 0x0000001D] Unable to resolve token&lt;/pre&gt;&lt;br /&gt;Where X is the name of the generated dll, Y the namespace path, and Z is the class name. In my case, this occurred when the dynamically generated code was referencing a private class, which should not be possible from a separate dll.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-4914203843042917932?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/4914203843042917932/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=4914203843042917932' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4914203843042917932'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4914203843042917932'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/05/clr-verification-for-runtime-code.html' title='CLR: Verification for Runtime Code Generation'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-3640666543570104179</id><published>2010-05-20T23:48:00.000-04:00</published><updated>2011-09-26T02:15:00.427-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Utilities'/><category 
scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Peirce's Criterion: command-line tool</title><content type='html'>I've written a simple command-line tool to filter out statistical outliers using the rigorous &lt;a href="http://mtp.jpl.nasa.gov/missions/start-08/science/piercescriterion.pdf"&gt;Peirce's Criterion&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The algorithm has been &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Statistics/Statistics.cs?view=markup"&gt;available in Sasa&lt;/a&gt; for a while, and will be in the forthcoming v0.9.3 release.&lt;br /&gt;&lt;br /&gt;I've also packaged &lt;a href="http://higherlogics.net/software/peircefilter.zip"&gt;the command-line tool binary&lt;/a&gt; for running Peirce's Criterion over multi-column CSV files (LGPL source &lt;a href="http://higherlogics.net/src/peircefilter-src.zip"&gt;available here&lt;/a&gt;).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-3640666543570104179?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/3640666543570104179/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=3640666543570104179' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/3640666543570104179'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/3640666543570104179'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/05/peirces-criterion-command-line-tool.html' title='Peirce&apos;s Criterion: command-line tool'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8802702066501411810</id><published>2010-05-20T23:06:00.000-04:00</published><updated>2010-12-04T13:37:06.347-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='benchmarks'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>The Cost of Type.GetType()</title><content type='html'>Most framework-style software spends an appreciable amount of time dynamically loading code. Some of this code is executed quite frequently. I've recently been working on a web framework where URLs map to type names and methods, so I've been digging into these sort of patterns a great deal lately.&lt;br /&gt;&lt;br /&gt;The canonical means to map a type name to a System.Type instance is via System.Type.GetType(string). In a framework which performs a significant number of these lookups, it's not clear what sort of performance characteristics one can expect from this static framework function.&lt;br /&gt;&lt;br /&gt;Here's &lt;a href="http://higherlogics.net/src/TypeLoadTests-src.zip"&gt;the source for a simple test&lt;/a&gt; pitting Type.GetType() against a cache backed by a Dictionary&amp;lt;string, Type&amp;gt;. 
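&lt;br /&gt;&lt;br /&gt;For readers who want a picture of the cached variant, here is a minimal sketch of a dictionary-backed lookup. It is not the code from the linked archive: the TypeCache class and its Resolve method are names assumed purely for illustration, and the lock is just one simple way to keep the dictionary safe for concurrent callers.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;&lt;br /&gt;using System.Collections.Generic;&lt;br /&gt;&lt;br /&gt;// Hypothetical helper for illustration; not the code from the linked test.&lt;br /&gt;public static class TypeCache&lt;br /&gt;{&lt;br /&gt;    static readonly Dictionary&amp;lt;string, Type&amp;gt; cache =&lt;br /&gt;        new Dictionary&amp;lt;string, Type&amp;gt;();&lt;br /&gt;&lt;br /&gt;    // Map a type name to a Type, consulting the dictionary first.&lt;br /&gt;    public static Type Resolve(string typeName)&lt;br /&gt;    {&lt;br /&gt;        lock (cache) // guard the dictionary against concurrent callers&lt;br /&gt;        {&lt;br /&gt;            Type type;&lt;br /&gt;            if (!cache.TryGetValue(typeName, out type))&lt;br /&gt;            {&lt;br /&gt;                // Slow path, taken once per name; null is cached if the name is unknown.&lt;br /&gt;                type = Type.GetType(typeName);&lt;br /&gt;                cache[typeName] = type;&lt;br /&gt;            }&lt;br /&gt;            return type;&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;With a cache along these lines, only the first lookup of a given name pays for Type.GetType(); every later lookup is a single dictionary probe.&lt;br /&gt;&lt;br /&gt;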
All tests were run on a Core 2 Duo 2.2 GHz, .NET CLR 3.5, and all numbers indicate the elapsed CPU ticks.&lt;br /&gt;&lt;table style="width: 100%;"&gt;&lt;tr&gt;&lt;th&gt;Type.GetType()&lt;/th&gt;&lt;th&gt;Dictionary&amp;lt;string, Type&amp;gt;&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6236070640&lt;/td&gt;&lt;td&gt;51351056&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6236193856&lt;/td&gt;&lt;td&gt;51440360&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6237466224&lt;/td&gt;&lt;td&gt;51463192&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6238210488&lt;/td&gt;&lt;td&gt;51583336&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6240645816&lt;/td&gt;&lt;td&gt;51599480&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6242089400&lt;/td&gt;&lt;td&gt;51687448&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6244450392&lt;/td&gt;&lt;td&gt;51719808&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6245201664&lt;/td&gt;&lt;td&gt;51757472&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6248327048&lt;/td&gt;&lt;td&gt;51793696&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6249253736&lt;/td&gt;&lt;td&gt;51800056&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6250640672&lt;/td&gt;&lt;td&gt;51859704&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6251133912&lt;/td&gt;&lt;td&gt;51885992&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6253544768&lt;/td&gt;&lt;td&gt;51897264&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6254336632&lt;/td&gt;&lt;td&gt;51946408&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6255117872&lt;/td&gt;&lt;td&gt;52046512&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6256060648&lt;/td&gt;&lt;td&gt;52106936&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6256159176&lt;/td&gt;&lt;td&gt;52140984&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6259453568&lt;/td&gt;&lt;td&gt;52391000&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th colspan="2"&gt;Average&lt;/th&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;6247464250.67&lt;/td&gt;&lt;td&gt;51803928&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;Each program was run 20 times, and the resulting timing statistics were run through &lt;a 
href="http://higherlogics.blogspot.com/2010/05/peirces-criterion-command-line-tool.html"&gt;Peirce's Criterion&lt;/a&gt; to filter out statistical outliers.&lt;br /&gt;&lt;br /&gt;You can plainly see that using a static dictionary cache is over two orders of magnitude faster than going through GetType(). This is a huge savings when the number of lookups being performed is very high.&lt;br /&gt;&lt;br /&gt;Edit: Type.GetType is thread-safe, so I updated the test to verify that these performance numbers hold even when locking the dictionary. The dictionary is still two orders of magnitude faster. There would have to be significant lock contention in a concurrent program to justify using Type.GetType instead of a dictionary cache.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8802702066501411810?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8802702066501411810/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8802702066501411810' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8802702066501411810'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8802702066501411810'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/05/cost-of-typegettype.html' title='The Cost of Type.GetType()'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-272447912814995058</id><published>2010-05-20T20:05:00.000-04:00</published><updated>2010-12-04T13:38:54.719-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>LINQ Transpose Extension Method</title><content type='html'>I recently needed a transpose operation that swaps the columns and rows of a nested IEnumerable sequence. It's simple enough to express in LINQ, but after a quick search, all the solutions posted online were rather ugly. Here's a concise and elegant version expressed using LINQ query syntax:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Swaps the rows and columns of a nested sequence.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of elements in the sequence.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;/// &amp;lt;param name="source"&amp;gt;The source sequence.&amp;lt;/param&amp;gt;&lt;br /&gt;/// &amp;lt;returns&amp;gt;A sequence whose rows and columns are swapped.&amp;lt;/returns&amp;gt;&lt;br /&gt;public static IEnumerable&amp;lt;IEnumerable&amp;lt;T&amp;gt;&amp;gt; Transpose&amp;lt;T&amp;gt;(&lt;br /&gt;         this IEnumerable&amp;lt;IEnumerable&amp;lt;T&amp;gt;&amp;gt; source)&lt;br /&gt;{&lt;br /&gt;    return from row in source&lt;br /&gt;           from col in row.Select(&lt;br /&gt;               (x, i) =&amp;gt; new KeyValuePair&amp;lt;int, T&amp;gt;(i, x))&lt;br /&gt;           group col.Value by col.Key into c&lt;br /&gt;           select c as IEnumerable&amp;lt;T&amp;gt;;&lt;br /&gt;}&lt;/pre&gt;It simply numbers the columns in each row, flattens the 
sequence of cells, and groups the entries by number. If the table has entries that are missing, this algorithm has the side-effect of compacting all entries so that only the last row or column will be missing the elements. This may or may not be suitable for your application.&lt;br /&gt;&lt;br /&gt;This extension method &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa/Linq/Enumerable.cs?view=markup"&gt;will be available&lt;/a&gt; in the forthcoming Sasa 0.9.3 release.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-272447912814995058?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/272447912814995058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=272447912814995058' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/272447912814995058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/272447912814995058'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/05/linq-transpose-extension-method.html' title='LINQ Transpose Extension Method'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2445836365766186526</id><published>2010-05-17T22:17:00.001-04:00</published><updated>2011-09-26T01:41:51.740-04:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Asymmetries in CIL</title><content type='html'>Most abstractions of interest have a natural dual: for instance, IEnumerable and IObservable, induction and co-induction, algebra and co-algebra, objects and algebraic data types, message-passing and pattern matching, etc.&lt;br /&gt;&lt;br /&gt;Programs are more concise and simpler when using the proper abstraction, be that some abstraction X or its dual. For instance, reactive programs written using the pull-based processing semantics of IEnumerable are far more unwieldy than those using the natural push-based semantics of IObservable. As a further example, symbolic problems are more naturally expressed via pattern matching than via message-passing.&lt;br /&gt;&lt;br /&gt;This implies that any system is most useful when we ensure that every abstraction provided is accompanied by its dual. This also applies to virtual machine instruction sets like CIL, as I recently discovered while refining my &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Dynamics/Reflector.cs?view=markup"&gt;safe reflection abstraction&lt;/a&gt; for the CLR.&lt;br /&gt;&lt;br /&gt;A CIL instruction stream embeds metadata required for the CLR's correct operation. Usually this metadata is type information of some sort, such as the type of a local or the name and/or signature of a target method. For instance, here's a Hello World! 
example:&lt;br /&gt;&lt;pre&gt;ldstr "Hello World!"&lt;br /&gt;call  void [mscorlib]System.Console::WriteLine(string)&lt;/pre&gt;&lt;br /&gt;Unfortunately, the CIL instruction set suffers from some asymmetries which make some types of programming difficult.&lt;br /&gt;&lt;br /&gt;For example, the &lt;a href="http://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.ldtoken.aspx"&gt;ldtoken&lt;/a&gt; instruction takes an embedded metadata token and pushes the corresponding runtime type handle onto the evaluation stack; this is the instruction executed when using the typeof() operator in C#.&lt;br /&gt;&lt;br /&gt;While this operation is useful, we sometimes want its dual, which is to say, we want the metadata used in a subsequent instruction to depend on the object or type handle at the top of the evaluation stack. A related operation is supported on the CLR, namely virtual dispatch, which depends on the concrete type, but dispatch is not general enough to support all of these scenarios because the vtable is immutable.&lt;br /&gt;&lt;br /&gt;Consider a scenario where you have an untyped object, like a reference to System.Object "obj", and you want to call into some generic code, like a method Foo&amp;lt;T&amp;gt;(T value), but pass the concrete type of obj for T, instead of System.Object. 
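&lt;br /&gt;&lt;br /&gt;To make the scenario concrete, here is a minimal sketch of this kind of dispatch using only the standard System.Reflection API. The Program class, the body of Foo and the CallFoo helper are assumptions made for illustration; none of this is Sasa code.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;&lt;br /&gt;using System.Reflection;&lt;br /&gt;&lt;br /&gt;public static class Program&lt;br /&gt;{&lt;br /&gt;    // Stand-in for the generic method from the text; prints the type bound to T.&lt;br /&gt;    public static void Foo&amp;lt;T&amp;gt;(T value)&lt;br /&gt;    {&lt;br /&gt;        Console.WriteLine(typeof(T).FullName);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // Hypothetical helper: call Foo&amp;lt;T&amp;gt; with T bound to the runtime type of obj.&lt;br /&gt;    public static void CallFoo(object obj)&lt;br /&gt;    {&lt;br /&gt;        Type runtimeType = obj.GetType();                        // concrete type of obj&lt;br /&gt;        MethodInfo open = typeof(Program).GetMethod("Foo");      // open generic definition&lt;br /&gt;        MethodInfo closed = open.MakeGenericMethod(runtimeType); // instantiate T&lt;br /&gt;        closed.Invoke(null, new object[] { obj });               // late-bound static call&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public static void Main()&lt;br /&gt;    {&lt;br /&gt;        CallFoo("hello"); // prints System.String&lt;br /&gt;        CallFoo(42);      // prints System.Int32&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;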
Currently, you must go through a laborious process where you call GetType() on the object to obtain its type handle, then obtain the method handle via reflection or some &lt;a href="http://evain.net/blog/articles/2010/05/05/parameterof-propertyof-methodof"&gt;clever CIL hackery&lt;/a&gt;, then call MethodInfo.MakeGenericMethod in order to instantiate the type argument on Foo&amp;lt;T&amp;gt;, and finally, you must perform a dynamic invocation via reflection or allocate a custom delegate of type Action&amp;lt;T&amp;gt; and perform a virtual call, even though the call is statically known.&lt;br /&gt;&lt;br /&gt;Each step of this process is expensive, and it makes typeful programming on the CLR painful when working on the boundary between typed and untyped code. Many reflection problems, like serialization, become simpler once we're dealing with fully typeful code.&lt;br /&gt;&lt;br /&gt;Consider a dual instruction to ldtoken called "bind" which takes a type handle obtained via GetType() and then binds the resulting type handle into the executing instruction stream for the subsequent generic call to Foo&amp;lt;T&amp;gt;. This instruction could be easily and efficiently supported by any VM. Some restrictions are clearly needed for this instruction to remain verifiable, namely that the type constraints required by the target method are a subset of the type constraints of the statically known type, but the verifier already performs this analysis, and all of the information needed is available at the call site.&lt;br /&gt;&lt;br /&gt;Fully polymorphic functions like Foo&amp;lt;T&amp;gt; trivially satisfy this condition since they have no constraints whatsoever. 
Serialize&amp;lt;T&amp;gt;/Deserialize&amp;lt;T&amp;gt; operations are duals, and in fact exhibit exactly the same sort of fully polymorphic type signature as Foo&amp;lt;T&amp;gt;.&lt;br /&gt;&lt;br /&gt;There are many more programs that exhibit this structure, but they are unnecessarily difficult to write due to these asymmetries in CIL. This unfortunately requires developers to write a lot of ad-hoc code to get the results they want, and more code results in more bugs, more time, and more headaches.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2445836365766186526?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2445836365766186526/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2445836365766186526' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2445836365766186526'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2445836365766186526'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/05/asymmetries-in-cil_17.html' title='Asymmetries in CIL'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-6622636443961196058</id><published>2010-05-17T17:10:00.000-04:00</published><updated>2011-09-26T01:42:05.189-04:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>CIL Verification and Safety</title><content type='html'>I've lamented &lt;a href="http://higherlogics.blogspot.com/2007/05/whats-wrong-with-net.html"&gt;here&lt;/a&gt; and &lt;a href="http://stackoverflow.com/questions/411906/c-net-design-flaws/1181366#1181366"&gt;elsewhere&lt;/a&gt; some unfortunate inconveniences and asymmetries in the CLR -- for example, we have nullable structs but lack non-nullable reference types, an issue &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa/NonNull.cs?view=markup"&gt;I address in my Sasa class library&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I've recently completed some Sasa abstractions for safe reflection, and an IL rewriter based on Mono.Cecil which allows C# source code to specify &lt;a href="http://msmvps.com/blogs/jon_skeet/archive/2009/09/10/generic-constraints-for-enums-and-delegates.aspx"&gt;type constraints&lt;/a&gt; that are supported by the CLR but unnecessarily restricted in C#. In the process, I came across another unjustified decision regarding verification: the &lt;a href="http://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.jmp.aspx"&gt;jmp instruction&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The jmp instruction strikes me as potentially incredibly useful for alternative dispatch techniques, and yet I recently discovered that it's classified as unverifiable. 
This seems very odd, since the instruction is fully statically typed, and I can't think of a way its use could corrupt the VM.&lt;br /&gt;&lt;br /&gt;In short, the instruction performs a control transfer to a named method with a signature matching exactly the current method's signature, as long as the evaluation stack is empty and you are not currently in a try-catch block (see &lt;a href="http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-335.pdf"&gt;section 3.37 of the ECMA specification&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;This seems eminently verifiable given a simple control-flow analysis, an analysis which the verifier already performs to verify control-flow safety of some other verifiable instructions. If anyone can shed some light on this I would appreciate it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-6622636443961196058?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/6622636443961196058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=6622636443961196058' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6622636443961196058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6622636443961196058'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2010/05/cil-verification-and-safety.html' title='CIL Verification and Safety'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-6995165181151290523</id><published>2009-12-10T13:39:00.000-05:00</published><updated>2010-05-20T20:47:40.925-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='benchmarks'/><title type='text'>Purely Functional Collections vs. Mutable Collections</title><content type='html'>I've been experimenting with some purely functional collections, also known as persistent collections, and began to wonder precisely how much overhead they incur as compared to the standard framework collections on a VM platform such as .NET.&lt;br /&gt;&lt;br /&gt;My open source &lt;a href="http://sf.net/projects/sasa"&gt;Sasa&lt;/a&gt; library has long had a &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa/Collections/Set.cs?revision=295&amp;view=markup"&gt;persistent stack type&lt;/a&gt;, considered a list in functional languages, and I was considering adding some additional persistent collections. I therefore wanted a better understanding of the various costs involved.&lt;br /&gt;&lt;br /&gt;I had also wanted to test an interesting design alternative to standard class-based collections. One of the consistent nuisances on widely deployed VM platforms is the widespread presence of null values. C#'s extension methods somewhat mitigate this problem since you can call extension methods on null values and properly handle it, but calling instance methods on a null value throws a NullReferenceException. 
Therefore, you cannot invoke ToString or Equals without first checking for null.&lt;br /&gt;&lt;br /&gt;This is particularly annoying for data types where null is in fact a valid value, as it is in linked lists, where null denotes the empty list. The idea I had was to make the data type a struct, wrapping an inner reference that would hold the actual collection contents, basically moving the null into a non-nullable type that I control:&lt;pre class="brush: csharp"&gt;public struct Stack&amp;lt;T&amp;gt; : IEnumerable&amp;lt;T&amp;gt; {&lt;br /&gt;  Node top;&lt;br /&gt;&lt;br /&gt;  sealed class Node {&lt;br /&gt;    internal T value;&lt;br /&gt;    internal Node next;&lt;br /&gt;  }&lt;br /&gt;  public bool IsEmpty { get { return top == null; } }&lt;br /&gt;  public override string ToString() {&lt;br /&gt;    return IsEmpty ? "&amp;lt;empty&amp;gt;" : this.Format(" &amp; ");&lt;br /&gt;  }&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;A struct type can never be null, so I can override Equals and ToString behaviour, but still handle null as a valid value -- a null inner reference indicates an empty list, for example.&lt;br /&gt;&lt;br /&gt;I copied over the Sasa Seq type, implemented a struct-based version of it, and similarly implemented two persistent versions of a queue type. I then ran some quick benchmarks against the system collection classes, List&amp;lt;T&amp;gt;, Stack&amp;lt;T&amp;gt; and Queue&amp;lt;T&amp;gt;, using a random sequence of 10,000,000 enqueues and dequeues. This test was repeated 10 times in each run, and the average runtime and memory use were taken.&lt;br /&gt;&lt;br /&gt;This process was then repeated 13 times for each collection type, and I threw away the top and bottom two values. The results follow, where all values are in CPU ticks measured by System.Diagnostics.Stopwatch. 
The results are displayed in sorted order, with the fastest at the top, and slowest at the bottom.&lt;br /&gt;&lt;h2&gt;Queue Operations&lt;/h2&gt;&lt;br /&gt;Legend:&lt;ul&gt;&lt;li&gt;PQueue&amp;lt;T&amp;gt;: persistent class-based queue.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;LinkedList&amp;lt;T&amp;gt;: System.Collections.Generic.LinkedList.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;PQueue2&amp;lt;T&amp;gt;: persistent struct-based queue.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Queue&amp;lt;T&amp;gt;: System.Collections.Generic.Queue.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;table style="width: 100%"&gt;&lt;tr&gt;&lt;th&gt;PQueue&amp;lt;T&amp;gt;&lt;/th&gt;&lt;th&gt;LinkedList&amp;lt;T&amp;gt;&lt;/th&gt;&lt;th&gt;PQueue2&amp;lt;T&amp;gt;&lt;/th&gt;&lt;th&gt;Queue&amp;lt;T&amp;gt;&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7699324462.55&lt;/td&gt;&lt;td&gt;7890322380.36&lt;/td&gt;&lt;td&gt;7679249692.36&lt;/td&gt;&lt;td&gt;7761780264&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7718837154.91&lt;/td&gt;&lt;td&gt;7983445563.64&lt;/td&gt;&lt;td&gt;7687912510.55&lt;/td&gt;&lt;td&gt;7803131562.18&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7740207043.64&lt;/td&gt;&lt;td&gt;8023998067.64&lt;/td&gt;&lt;td&gt;7703769050.18&lt;/td&gt;&lt;td&gt;7810452602.91&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7828445642.91&lt;/td&gt;&lt;td&gt;8032055720.73&lt;/td&gt;&lt;td&gt;7800245839.27&lt;/td&gt;&lt;td&gt;7811040914.91&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8451346377.45&lt;/td&gt;&lt;td&gt;8201663652.36&lt;/td&gt;&lt;td&gt;7834188839.27&lt;/td&gt;&lt;td&gt;7850104080&lt;br 
/&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8589153123.64&lt;/td&gt;&lt;td&gt;8282536765.09&lt;/td&gt;&lt;td&gt;7955433135.27&lt;/td&gt;&lt;td&gt;7935923176&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8707905445.82&lt;/td&gt;&lt;td&gt;8484668378.91&lt;/td&gt;&lt;td&gt;8178008811.64&lt;/td&gt;&lt;td&gt;8118527255.27&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8764140750.55&lt;/td&gt;&lt;td&gt;8742958103.27&lt;/td&gt;&lt;td&gt;8323106361.45&lt;/td&gt;&lt;td&gt;8348756291.64&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8979406774.55&lt;/td&gt;&lt;td&gt;8752582049.45&lt;/td&gt;&lt;td&gt;8327369725.09&lt;/td&gt;&lt;td&gt;8350023904&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th style="text-align:left" colspan="4"&gt;Averages&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8275418530.67&lt;/td&gt;&lt;td&gt;8266025631.27&lt;/td&gt;&lt;td&gt;7943253773.9&lt;/td&gt;&lt;td&gt;7976637783.43&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;h2&gt;Stack Operations&lt;/h2&gt;&lt;br /&gt;Legend:&lt;ul&gt;&lt;li&gt;Seq&amp;lt;T&amp;gt;: persistent class-based stack.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;List&amp;lt;T&amp;gt;: System.Collections.Generic.List.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Seq2&amp;lt;T&amp;gt;: persistent struct-based stack.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Stack&amp;lt;T&amp;gt;: System.Collections.Generic.Stack.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;table style="width: 
100%"&gt;&lt;tr&gt;&lt;th&gt;List&amp;lt;T&amp;gt;&lt;/th&gt;&lt;th&gt;Seq&amp;lt;T&amp;gt;&lt;/th&gt;&lt;th&gt;Stack&amp;lt;T&amp;gt;&lt;/th&gt;&lt;th&gt;Seq2&amp;lt;T&amp;gt;&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7710263064&lt;/td&gt;&lt;td&gt;7826307262.55&lt;/td&gt;&lt;td&gt;7802069434.91&lt;/td&gt;&lt;td&gt;7690359998.55&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7762791045.82&lt;/td&gt;&lt;td&gt;7870793184&lt;/td&gt;&lt;td&gt;7808408813.82&lt;/td&gt;&lt;td&gt;7770721440.73&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7849427981.09&lt;/td&gt;&lt;td&gt;8035572504.73&lt;/td&gt;&lt;td&gt;7840059360&lt;/td&gt;&lt;td&gt;7989185347.64&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7905690538.18&lt;/td&gt;&lt;td&gt;8099787029.09&lt;/td&gt;&lt;td&gt;7883823349.09&lt;/td&gt;&lt;td&gt;8201203589.82&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7912993893.82&lt;/td&gt;&lt;td&gt;8315915441.45&lt;/td&gt;&lt;td&gt;7907107597.82&lt;/td&gt;&lt;td&gt;8208994094.55&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7947893535.27&lt;/td&gt;&lt;td&gt;8349995466.18&lt;/td&gt;&lt;td&gt;7914102997.82&lt;/td&gt;&lt;td&gt;8239609125.82&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7952697770.18&lt;/td&gt;&lt;td&gt;8396224841.45&lt;/td&gt;&lt;td&gt;7927640684.36&lt;/td&gt;&lt;td&gt;8352575028.36&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8131449759.27&lt;/td&gt;&lt;td&gt;8694619845.82&lt;/td&gt;&lt;td&gt;8155345951.27&lt;/td&gt;&lt;td&gt;8376635294.55&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8388583725.09&lt;/td&gt;&lt;td&gt;8738260760&lt;/td&gt;&lt;td&gt;8294985193.45&lt;/td&gt;&lt;td&gt;8540294446.55&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th style="text-align:left" colspan="4"&gt;Averages&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7951310145.86&lt;/td&gt;&lt;td&gt;8258608481.7&lt;/td&gt;&lt;td&gt;7948171486.95&lt;/td&gt;&lt;td&gt;8152175374.06&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;h2&gt;Analysis&lt;/h2&gt;&lt;br /&gt;The persistent collections perform rather well, generally within 5% of their mutable 
counterparts given this set of test data. While I didn't calculate it, the spread of the runs suggests a wider standard deviation from the mean for persistent collections, so their performance is ever so slightly less predictable, though not drastically so.&lt;br /&gt;&lt;br /&gt;Interestingly enough, the struct-based persistent collections outperformed the class-based versions. I expected the reverse, considering struct operations are not always properly optimized by the JIT. Even though the entire struct would fit into a register, the JIT may still allocate a stack slot for it, which would be more expensive than the guaranteed register-sized operations of a class type. Upon further thought, I suspected that perhaps the struct versions are faster simply because the VM doesn't need to perform a null check on dispatch, but the class-based versions use all-static method calls, which don't perform null checks as far as I know.&lt;br /&gt;&lt;br /&gt;I don't yet have a good explanation for this, but given the results, I believe I will move all Sasa collections to struct implementations, as this simply provides more flexibility, immunity from null errors, and no appreciable runtime overhead.&lt;br /&gt;&lt;br /&gt;If I had more time I would make these tests a little more rigorous by varying the test vectors in a more controlled fashion instead of just random, i.e., use stepped, random, and other types of enqueue/dequeue sequences, to determine exactly how persistent and mutable collections behave based on inputs. 
I suspect the performance profiles will differ more drastically in such different scenarios.&lt;br /&gt;&lt;br /&gt;This test was enough to demonstrate to me that persistent collections are sufficiently performant to be used in daily code, particularly given their numerous other advantages.&lt;br /&gt;&lt;br /&gt;All source code and raw timing/memory tables can be downloaded &lt;a href="http://higherlogics.net/tests/CollectionBenchmarks.zip"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-6995165181151290523?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/6995165181151290523/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=6995165181151290523' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6995165181151290523'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6995165181151290523'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/12/purely-functional-collections-vs.html' title='Purely Functional Collections vs. 
Mutable Collections'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-193889219313572269</id><published>2009-11-18T22:39:00.000-05:00</published><updated>2011-09-26T01:52:24.419-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><title type='text'>Easy File System Path Manipulation in C#</title><content type='html'>I came across &lt;a href="http://hackage.haskell.org/package/pathtype"&gt;this type-safe module for handling file paths&lt;/a&gt; in the &lt;a href="http://www.reddit.com/r/haskell/comments/a4cav/pathtype_typesafe_replacement_for_systemfilepath/"&gt;Haskell subreddit&lt;/a&gt; this week, and thought it looked kind of neat. 
Handling paths as strings, even with System.IO.Path, always bugged me.&lt;br /&gt;&lt;br /&gt;So I &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa/IO/Paths.cs?view=markup"&gt;created a close C# equivalent of the Haskell type&lt;/a&gt; and added it to the &lt;a href="http://sourceforge.net/projects/sasa/"&gt;Sasa library&lt;/a&gt;, to be available in the upcoming v0.9.3 release:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public struct FsPath : IEnumerable&amp;lt;string&amp;gt;, IEnumerable&lt;br /&gt;{&lt;br /&gt; public FsPath(IEnumerable&amp;lt;string&amp;gt; parts);&lt;br /&gt; public FsPath(string path);&lt;br /&gt;&lt;br /&gt; public static FsPath operator /(FsPath path1, FsPath path2);&lt;br /&gt; public static FsPath operator /(FsPath path, IEnumerable&amp;lt;string&amp;gt; parts);&lt;br /&gt; public static FsPath operator /(FsPath path, string part);&lt;br /&gt; public static FsPath operator /(FsPath path, string[] parts);&lt;br /&gt; public static FsPath operator /(IEnumerable&amp;lt;string&amp;gt; parts, FsPath path);&lt;br /&gt; public static FsPath operator /(string part, FsPath path);&lt;br /&gt; public static FsPath operator /(string[] parts, FsPath path);&lt;br /&gt; public static implicit operator FsPath(string path);&lt;br /&gt;&lt;br /&gt; public FsPath Combine(FsPath file);&lt;br /&gt; public IEnumerator&amp;lt;string&amp;gt; GetEnumerator();&lt;br /&gt; public static FsPath Root(string root);&lt;br /&gt; public override string ToString();&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The above is just the interface, minus the comments, which you can see at the svn link above. 
This struct tracks path components for you and C#'s operator overloading lets you specify paths declaratively without worrying about combining paths with proper separator characters, etc.:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;FsPath root = "/foo/bar";&lt;br /&gt;var baz = root / "blam" / "baz";&lt;br /&gt;var etc = FsPath.Root("/etc/");&lt;br /&gt;var passwd = etc / "passwd";&lt;/pre&gt;&lt;br /&gt;The library will generate path strings using the platform's directory separator.&lt;br /&gt;&lt;br /&gt;One significant departure from the Haskell library is the lack of phantom types used to distinguish the various combinations of relative/absolute and file/directory paths. C# can express these same constraints, but I intentionally left them out for two reasons:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The type safety provided by the file/directory phantom type is rather limited, since it's only an alleged file/directory path; you have to consult the file system to determine whether the alleged type is actually true. The only minor advantage to specifying this in a path type, is as a form of documentation to users of your library that you expect a file path in a particular method, and not a directory path. I could be convinced to add it for this reason.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The relative/absolute phantom type seems a bit useless to me, though the reason may be less obvious to others. IMO, the set of file and directory info classes should not have GetParent() or support ".." directory navigation operations, nor should they support static privilege escalation operations like File.GetAbsolutePath() which ambiently converts a string into an authority on a file/directory, since this inhibits subsystem isolation; without GetParent() or privilege escalation functions, any subsystem you hand a directory object is automatically chroot jailed to operate only in that directory. 
This has long been known to the capability-security community, and is how they structure all of their IO libraries (see &lt;a href="http://code.google.com/p/joe-e/"&gt;Joe-E&lt;/a&gt; and the E programming language). Thus, in a capability-secure file system library, all paths are relative paths interpreted relative to a given directory object. As a result, FsPath also strips all "." and resolves all ".." to within the provided path to supply this isolation.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;It goes without saying that .NET does not currently handle files and directories this way, but I plan to handle this eventually as well. FsPath will play a significant role in ensuring that paths cannot escape the chroot jail.&lt;br /&gt;&lt;br /&gt;In the interim, the FsPath interface is a good first step along that path.&lt;br /&gt;&lt;br /&gt;[Edit: the reddit thread for this post has &lt;a href="http://www.reddit.com/r/programming/comments/a5wl0/easy_file_system_path_manipulation_in_c/"&gt;some good discussion&lt;/a&gt;.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-193889219313572269?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/193889219313572269/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=193889219313572269' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/193889219313572269'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/193889219313572269'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/11/easy-file-system-path-manipulation-in-c.html' title='Easy File System Path 
Manipulation in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-298516607488080647</id><published>2009-11-16T09:41:00.000-05:00</published><updated>2011-09-26T01:59:49.666-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><title type='text'>Extensible, Statically Typed Pratt Parser in C#</title><content type='html'>I just completed a statically typed Pratt-style single-state extensible lexer+parser, otherwise known as a top-down operator precedence parser, for the &lt;a href="http://sf.net/projects/sasa/"&gt;Sasa library&lt;/a&gt;. The implementation is available in the Sasa.Parsing dll, &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Parsing/Pratt/PrattParser.cs?view=markup"&gt;under Sasa.Parsing.Pratt&lt;/a&gt;. 
Two simple arithmetic calculators are available in &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Build/Tests/ParsingTests.cs?view=markup"&gt;the unit tests&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;This implementation is novel in two ways: &lt;br /&gt;&lt;ol&gt;&lt;li&gt;Aside from an alleged &lt;a href="http://www.dmitry-kazakov.de/ada/components.htm#Parsers_etc"&gt;implementation in Ada&lt;/a&gt;, this is the only statically typed Pratt parser I'm aware of.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Pratt parsers typically require a pre-tokenized input before parsing semantic tokens, but I've eliminated this step by using the symbol definitions to drive a longest-match priority-based lexer.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Symbols by default match themselves, but you can optionally provide a scanning function used to match arbitrary string patterns. The symbol selected for the current position of the input is the symbol that matches the longest substring. If two symbols match equally, then the symbol with higher precedence is selected.&lt;br /&gt;&lt;br /&gt;The design was heavily influenced by &lt;a href="http://effbot.org/zone/tdop-index.htm"&gt;this Python implementation&lt;/a&gt;, though numerous departures were made to restore static typing and integrate lexing. In particular, symbols and semantic tokens are separate, where symbols are used primarily during lexing, and construct the appropriate semantic tokens on match.&lt;br /&gt;&lt;br /&gt;Here is a simple arithmetic calculator:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;class Calculator : PrattParser&amp;lt;int&amp;gt;&lt;br /&gt;{&lt;br /&gt; public Calculator()&lt;br /&gt; {&lt;br /&gt;     Infix("+", 10, Add);   Infix("-", 10, Sub);&lt;br /&gt;     Infix("*", 20, Mul);   Infix("/", 20, Div);&lt;br /&gt;     InfixR("^", 30, Pow);  Postfix("!", 30, Fact);&lt;br /&gt;     Prefix("-", 100, Neg); Prefix("+", 100, Pos);&lt;br /&gt;&lt;br /&gt;     // grouping symbols, ie. 
"(_)"&lt;br /&gt;     Group("(", ")", int.MaxValue);&lt;br /&gt;&lt;br /&gt;     // provide a predicate to identify valid literals&lt;br /&gt;     Match("(digit)", char.IsDigit, 1, Int);&lt;br /&gt;&lt;br /&gt;     // ignore whitespace&lt;br /&gt;     SkipWhile(char.IsWhiteSpace);&lt;br /&gt; }&lt;br /&gt;&lt;br /&gt; int Int(string lit) { return int.Parse(lit); }&lt;br /&gt; int Add(int lhs, int rhs) { return lhs + rhs; }&lt;br /&gt; int Sub(int lhs, int rhs) { return lhs - rhs; }&lt;br /&gt; int Mul(int lhs, int rhs) { return lhs * rhs; }&lt;br /&gt; int Div(int lhs, int rhs) { return lhs / rhs; }&lt;br /&gt; int Pow(int lhs, int rhs) { return (int)Math.Pow(lhs, rhs); }&lt;br /&gt; int Neg(int arg) { return -arg; }&lt;br /&gt; int Pos(int arg) { return arg; }&lt;br /&gt; int Fact(int arg)&lt;br /&gt; {&lt;br /&gt;     return arg == 0 || arg == 1 ? 1 : arg * Fact(arg - 1);&lt;br /&gt; }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The above is a simplified version of the calculators used in the unit tests. You can clearly see the definition of left and right-associative infix operators, and postfix operators. The numbers are precedence levels, and the delegates are the semantic actions associated with each token type.&lt;br /&gt;&lt;br /&gt;There are pre-defined functions for creating left and right-associative infix, prefix, postfix, and both prefix and infix ternary operators -- prefix ternary is a simple "if _ then _ else _", infix ternary is like C's ternary operation, "_ ? _ : _". You are not constrained to these symbol types however, as you can define arbitrary character sequences as symbols and parse the results at your whim.&lt;br /&gt;&lt;br /&gt;Adding variable binding to the above calculator requires a few extensions to support lexically scoped environments, but the parser requires only the following two extensions:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// symbol for identifiers, ie. 
variable names&lt;br /&gt;Match("(ident)", char.IsLetter, 0,&lt;br /&gt;        name =&gt; new Var { Name = name });&lt;br /&gt;// let binding, ie. "let x = 1 in x", is a&lt;br /&gt;// prefix-ternary operator&lt;br /&gt;TernaryPrefix("let", "=", "in", 90, Let);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;"let" is not matched as an identifier because the "let" symbol has a higher precedence. Parser definitions with ambiguous parses thus require some care to resolve the ambiguity either via match length or precedence relationships.&lt;br /&gt;&lt;br /&gt;There are two popular choices when it comes to parsing: parser generator tools and parser combinator libraries. I've never liked generators personally, as they require a separate language and entirely separate tools to define and process the grammar thus increasing the learning and maintenance curve.&lt;br /&gt;&lt;br /&gt;I actually like parser combinators, and there are now good techniques for supporting arbitrary left recursion in grammars (see Packrat parsing in OMeta). However, handling operator precedence often involves a series of complex extensions to the otherwise simple recursive descent, or it requires the parsed results to maintain all operator and grouping information so a post-processing phase can readjust the parse tree to reflect the proper priorities. The post-processing phase typically uses a shunting yard or Pratt-style operator precedence algorithm to resolve precedence, so why not just build the parser itself directly on Pratt parsing?&lt;br /&gt;&lt;br /&gt;Unlike parser-combinator libraries, of which I've built a few, Pratt parsers natively support arbitrary operator precedence and are efficient as they require no backtracking. Pratt parsers are also Turing-complete, because the semantic actions can perform arbitrary computations on the parser input. Turing completeness carries both benefits and costs of course, but the single-state nature of Pratt parsers keeps it manageable. 
The "single state" of the parser is simply the current token matched from the input.&lt;br /&gt;&lt;br /&gt;For example, if prefix-ternary operators weren't already available, you could simply define one like so:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// "let" is a prefix symbol on a name&lt;br /&gt;var let = Prefix("let", 90, x =&gt;&lt;br /&gt;{&lt;br /&gt;    // lexer matched "let" prefix operator on x&lt;br /&gt;    // argument, a variable name&lt;br /&gt;    Advance("=");          // advance past the "="&lt;br /&gt;    var xValue = Parse(0); // parse the value to bind to x&lt;br /&gt;    Advance("in");         // advance past the "in"&lt;br /&gt;    var body = Parse(0);   // parse the let expression body&lt;br /&gt;    // return a Let expression node&lt;br /&gt;    return new Let{ Variable = x, Value = xValue, Body = body};&lt;br /&gt;});&lt;br /&gt;// declare the other symbols involved for the lexer&lt;br /&gt;// but they have no associated semantic action&lt;br /&gt;Symbol("=");&lt;br /&gt;Symbol("in");&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The above builds on "Prefix", but you could break this down even further as in Pratt's original paper and define arbitrary symbol behaviour by providing a scanning function, "left denotation", and "null denotation", and so on.&lt;br /&gt;&lt;br /&gt;To define a parser, you need only inherit from Sasa.Parsing.Pratt.PrattParser, provide the semantic type T, and define your symbol table in your parser's constructor. The only downside is that each parser instance is monolithic and not thread-safe, so it can only parse one input at a time. 
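&lt;br /&gt;&lt;br /&gt;One way to live with that restriction is to keep one parser per thread. The following is only a minimal sketch, assuming .NET 4's ThreadLocal&amp;lt;T&amp;gt;; the Calculator class here is a stand-in for a PrattParser&amp;lt;int&amp;gt; subclass, and its Parse(string) entry point is an assumption rather than the confirmed Sasa API:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;&lt;br /&gt;using System.Threading;&lt;br /&gt;&lt;br /&gt;// stand-in for a PrattParser&amp;lt;int&amp;gt; subclass such as the Calculator above;&lt;br /&gt;// Parse(string) is a hypothetical entry point, not the confirmed Sasa API&lt;br /&gt;sealed class Calculator&lt;br /&gt;{&lt;br /&gt;  public int Parse(string input) { return int.Parse(input); }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static class Parsers&lt;br /&gt;{&lt;br /&gt;  // one parser instance per thread, created lazily on first use&lt;br /&gt;  static readonly ThreadLocal&amp;lt;Calculator&amp;gt; calc =&lt;br /&gt;    new ThreadLocal&amp;lt;Calculator&amp;gt;(() =&amp;gt; new Calculator());&lt;br /&gt;&lt;br /&gt;  public static int Eval(string input)&lt;br /&gt;  {&lt;br /&gt;    // each thread only ever touches its own parser&lt;br /&gt;    return calc.Value.Parse(input);&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;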
There is no limit to how many parsers you can create however.&lt;br /&gt;&lt;br /&gt;Pratt parsing is simple, compact, and efficient, so I hope it finds more widespread use, particularly now that there is a statically typed implementation for a popular language.&lt;br /&gt;&lt;br /&gt;[Edit: the PrattParser will be available in the upcoming v0.9.3 release of the Sasa class library.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-298516607488080647?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/298516607488080647/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=298516607488080647' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/298516607488080647'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/298516607488080647'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/11/extensible-statically-typed-pratt.html' title='Extensible, Statically Typed Pratt Parser in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8224383235054475123</id><published>2009-10-01T11:50:00.000-04:00</published><updated>2011-09-26T01:57:29.847-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category 
scheme='http://www.blogger.com/atom/ns#' term='tagless interpreters'/><category scheme='http://www.blogger.com/atom/ns#' term='EDSL'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Abstracting over Type Constructors using Dynamics in C#</title><content type='html'>I've written quite a few times about my experiments with the CLR type system [&lt;a href="http://higherlogics.blogspot.com/search/label/modules"&gt;1&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/search/label/tagless%20interpreters"&gt;2&lt;/a&gt;, &lt;a href="http://higherlogics.blogspot.com/search/label/generics"&gt;3&lt;/a&gt;]. After much exploration and reflection, I had devised a pretty good approach to encoding ML-style modules and abstracting over type constructors in C#.&lt;br /&gt;&lt;br /&gt;A &lt;a href="http://stackoverflow.com/questions/1426239/can-ml-functors-be-fully-encoded-in-net-c-f/1466226"&gt;recent question on Stack Overflow&lt;/a&gt; made me realize that I never actually explained this technique in plain English.&lt;br /&gt;&lt;br /&gt;The best encoding of ML modules and type constructor polymorphism requires the use of partly safe casting.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;An ML signature maps to a C# interface with a generic type parameter called a "brand". The brand names the class that implements the interface, ie. the module implementation.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;An ML module maps to a C# class. If the module implements a signature, then it implements the corresponding interface and specifies itself as the signature's brand.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Since classes and interfaces are first-class values, an ML functor also maps to a class.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;An ML type component maps to an abstract class that shares the same brand as the module. 
This effectively ties the module data representation and the module implementation together at the interface boundary, and makes the necessary casting partly safe.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;I'll use the tagless interpreter from Fig. 2 of &lt;a href="http://lambda-the-ultimate.org/node/2438"&gt;tagless staged interpreters&lt;/a&gt; as a concrete example:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;(* Fig. 2 *)&lt;br /&gt;module type Symantics = sig&lt;br /&gt;  type ('c, 'dv) repr&lt;br /&gt;  val int : int -&amp;gt; ('c, int) repr&lt;br /&gt;  val bool: bool -&amp;gt; ('c, bool) repr&lt;br /&gt;  val lam : (('c, 'da) repr -&amp;gt; ('c, 'db) repr) -&amp;gt;&lt;br /&gt;            ('c, 'da -&amp;gt; 'db) repr&lt;br /&gt;  val app : ('c, 'da -&amp;gt; 'db) repr -&amp;gt; ('c, 'da) repr -&amp;gt;&lt;br /&gt;            ('c, 'db) repr&lt;br /&gt;  val fix : ('x -&amp;gt; 'x) -&amp;gt; (('c, 'da -&amp;gt; 'db) repr as 'x)&lt;br /&gt;  val add : ('c, int) repr -&amp;gt; ('c, int) repr -&amp;gt;&lt;br /&gt;            ('c, int) repr&lt;br /&gt;  val mul : ('c, int) repr -&amp;gt; ('c, int) repr -&amp;gt;&lt;br /&gt;            ('c, int) repr&lt;br /&gt;  val leq : ('c, int) repr -&amp;gt; ('c, int) repr -&amp;gt;&lt;br /&gt;            ('c, bool) repr&lt;br /&gt;  val if_ : ('c, bool) repr -&amp;gt;&lt;br /&gt;            (unit -&amp;gt; 'x) -&amp;gt; (unit -&amp;gt; 'x) -&amp;gt;&lt;br /&gt;            (('c, 'da) repr as 'x)&lt;br /&gt;end&lt;/pre&gt;&lt;br /&gt;In the translation, I omit the 'c type parameter used in OCaml. 
The type of the expression representation, 'dv, becomes T in C#:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;pre class="brush: csharp"&gt;module type Symantics = sig&lt;/pre&gt;maps to&lt;pre  class="brush: csharp"&gt;interface ISymantics&amp;lt;B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;&lt;/pre&gt; (B is the module's Brand)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;pre class="brush: csharp"&gt;type ('c, 'dv) repr&lt;/pre&gt;maps to&lt;pre class="brush: csharp"&gt;abstract class Repr&amp;lt;T, B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;&lt;/pre&gt; (B is the module's Brand)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Each signature function maps to a method on ISymantics, ie.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;val int : int -&amp;gt; ('c, int) repr&lt;/pre&gt;maps to&lt;pre class="brush: csharp"&gt;Repr&amp;lt;int, B&amp;gt; Int(int value)&lt;/pre&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;The final translation will look something like:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// type component&lt;br /&gt;abstract class Repr&amp;lt;T, B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt; { }&lt;br /&gt;// module signature&lt;br /&gt;interface ISymantics&amp;lt;B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;&lt;br /&gt;{&lt;br /&gt;  Repr&amp;lt;int, B&amp;gt; Int(int i);&lt;br /&gt;  Repr&amp;lt;int, B&amp;gt; Add(Repr&amp;lt;int, B&amp;gt; left, Repr&amp;lt;int, B&amp;gt; right);&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The implementation undergoes a similar translation:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;pre class="brush: csharp"&gt;module R = struct&lt;/pre&gt;maps to&lt;pre class="brush: csharp"&gt;sealed class R : ISymantics&amp;lt;R&amp;gt;&lt;/pre&gt;(R implements ISymantics and provides itself as the type brand)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;pre class="brush: csharp"&gt;type ('c,'dv) repr = 'dv&lt;/pre&gt;maps to&lt;pre class="brush: csharp"&gt;sealed class ReprR&amp;lt;T&amp;gt; : Repr&amp;lt;T, R&amp;gt;&lt;/pre&gt;(the concrete representation is a sealed 
class that inherits from Repr, and supplies R as the brand, effectively tying it to the R implementation)&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;The final mapping looks like:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;(* Section 2.2 *)&lt;br /&gt;module R = struct&lt;br /&gt;  type ('c,'dv) repr = 'dv (* no wrappers *)&lt;br /&gt;  let int (x:int) = x&lt;br /&gt;  let add e1 e2 = e1 + e2&lt;br /&gt;  ...&lt;br /&gt;end&lt;/pre&gt;maps to:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// concrete type component for the interpreter&lt;br /&gt;// representation&lt;br /&gt;sealed class ReprR&amp;lt;T&amp;gt; : Repr&amp;lt;T, R&amp;gt;&lt;br /&gt;{&lt;br /&gt;  internal T value;&lt;br /&gt;}&lt;br /&gt;sealed class R : ISymantics&amp;lt;R&amp;gt;&lt;br /&gt;{&lt;br /&gt;  public Repr&amp;lt;int, R&amp;gt; Int(int i)&lt;br /&gt;  {&lt;br /&gt;    return new ReprR&amp;lt;int&amp;gt; { value = i };&lt;br /&gt;  }&lt;br /&gt;  public Repr&amp;lt;int, R&amp;gt; Add(Repr&amp;lt;int, R&amp;gt; left,&lt;br /&gt;                         Repr&amp;lt;int, R&amp;gt; right)&lt;br /&gt;  {&lt;br /&gt;    var l = left as ReprR&amp;lt;int&amp;gt;; // semi-safe cast&lt;br /&gt;    var r = right as ReprR&amp;lt;int&amp;gt;; // semi-safe cast&lt;br /&gt;    return new ReprR&amp;lt;int&amp;gt; { value = l.value + r.value };&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Programs written against tagless interpreters are wrapped in functors in order to properly abstract over the interpreter implementation. As mentioned before, modules and signatures are effectively first-class values in this encoding, so a functor simply becomes a function:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;(* Fig. 
3 *)&lt;br /&gt;module EX(S: Symantics) = struct&lt;br /&gt;  open S&lt;br /&gt;  let test1 () = app (lam (fun x -&amp;gt; x)) (bool true)&lt;br /&gt;  ...&lt;br /&gt;end&lt;/pre&gt;&lt;br /&gt;maps to:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public static class EX&lt;br /&gt;{&lt;br /&gt;  public static Repr&amp;lt;bool, B&amp;gt; Test1&amp;lt;B&amp;gt;(ISymantics&amp;lt;B&amp;gt; s)&lt;br /&gt;  {&lt;br /&gt;    return s.App(s.Lam(x =&amp;gt; x), s.Bool(true));&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The brand/ISymantics type could also be lifted to be a generic class parameter to make it syntactically closer to how it looks in the paper, but the difference is not important.&lt;br /&gt;&lt;br /&gt;You can now run EX.Test1 with any conforming implementation of ISymantics, and the type system will prevent you from mixing representations of different implementations just as it would in ML, because the brands will not match. The only way to trigger a type error at runtime due to the downcast is for a client to implement their own Repr&amp;lt;T, B&amp;gt; supplying R for the brand, and then pass that custom Repr type to a method on ISymantics&amp;lt;R&amp;gt;. In such cases the client deserves an error.&lt;br /&gt;&lt;br /&gt;I think this is a fairly reasonable trade off all things considered. Of course, it would be preferable if the CLR could just support type constructor polymorphism natively. 
And while all my wishes are coming true, can I have all of &lt;a href="http://stackoverflow.com/questions/411906/c-net-design-flaws/1181366#1181366"&gt;these changes&lt;/a&gt; too?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8224383235054475123?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8224383235054475123/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8224383235054475123' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8224383235054475123'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8224383235054475123'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/10/abstracting-over-type-constructors.html' title='Abstracting over Type Constructors using Dynamics in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2621154111130537739</id><published>2009-09-16T15:17:00.000-04:00</published><updated>2009-09-16T15:19:49.355-04:00</updated><title type='text'>Disabling EDGE/3G on iPhone OS 3.0</title><content type='html'>After much struggle and dead ends, &lt;a href="http://iphonenodata.com/site/"&gt;this fix&lt;/a&gt; worked like a charm. 
I don't think it's been publicized very much, and there are many people in need of such fixes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2621154111130537739?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2621154111130537739/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2621154111130537739' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2621154111130537739'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2621154111130537739'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/09/disabling-edge3g-on-iphone-os-30.html' title='Disabling EDGE/3G on iPhone OS 3.0'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-1400198024378135720</id><published>2009-09-15T15:36:00.000-04:00</published><updated>2010-12-05T21:55:08.062-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Sasa v0.9.2 Released</title><content type='html'>The &lt;a href="https://sourceforge.net/projects/sasa"&gt;newest Sasa release&lt;/a&gt; contains a number of bugfixes and increased 
compliance with MIME standards, including proper MIME MailMessage subject decoding. Perhaps of wider interest to the .NET community, there are a number of &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/tags/v0.9.2/Sasa/EventExtensions.cs?view=markup"&gt;extension methods for thread safe and null-safe event raising&lt;/a&gt;; safe event raising is a major bane of the .NET developer's existence.&lt;br /&gt;&lt;br /&gt;There are specific overloads for all System.Action delegate types. A general RaiseAny extension method is also provided that isn't very efficient, but which should match the signature of any event in the class libraries. More overloads can be provided for more efficient execution if needed.&lt;br /&gt;&lt;br /&gt;Here is the complete list of changes from the changelog:&lt;br /&gt;&lt;pre&gt;= v0.9.2 =&lt;br /&gt;&lt;br /&gt;* fixed bug where quoted printable encoding failed when = was the last&lt;br /&gt;  character.&lt;br /&gt;* added Pop3Session.Reset method.&lt;br /&gt;* compact serializer no longer uses stream positions to track cached&lt;br /&gt;  objects, so non-indexable streams, like DeflateStream, are now&lt;br /&gt;  usable.&lt;br /&gt;* ISerializing interface generalized to support serializing non-primitive&lt;br /&gt;  field values.&lt;br /&gt;* added e-mail subject decoding.&lt;br /&gt;* fixed boundary condition on QuotedPrintableEncoding.&lt;br /&gt;* added extension methods to support safely raising events.&lt;br /&gt;* added an extension method to safely generate field and property names&lt;br /&gt;  as strings.&lt;br /&gt;* added parameter to Compact serializer to indicate whether&lt;br /&gt;  SerializableAttribute should be respected.&lt;br /&gt;* added a boundary check for SliceEquals.&lt;br /&gt;* added a event Raise overload for any event handlers whose arg&lt;br /&gt;  inherits from EventArgs, which enables safe event raising within&lt;br /&gt;  an object.&lt;br /&gt;* MailAddressCollection already handles parsing 
comma-delimited e-mail&lt;br /&gt;  address strings, so don't attempt to split them manually.&lt;br /&gt;* added a test for an e-mail address that contains a comma.&lt;br /&gt;* default to us-ascii charset if none provided.&lt;br /&gt;* NonNull now throws a ArgumentNullException with a proper&lt;br /&gt;  description.&lt;br /&gt;* many FxCop-related fixes and CLS compliance improvements.&lt;br /&gt;* added usage restrictions on Sasa.CodeContracts attributes.&lt;/pre&gt;&lt;br /&gt;[Edit: generating HTML docs from the XML files used to be such a chore, though Sandcastle at least makes it possible, if convoluted. Mono's mdoc util is barely any help here, as it can only process one Visual Studio generated XML file at a time. I just found &lt;a href="http://immdocnet.codeplex.com/"&gt;ImmDoc.Net&lt;/a&gt; which made this process ridiculously easy and stupendously fast, easily an order of magnitude faster than Sandcastle, but it looks like there's a bug handling method ref and out parameters.&lt;br /&gt;&lt;br /&gt;In any case, &lt;a href="http://higherlogics.net/sasa/docs-v0.9.2/"&gt;here is some preliminary HTML API documentation&lt;/a&gt; for your perusal (&lt;a href="http://higherlogics.net/sasa/docs-v0.9.2/Sasa.chm"&gt;CHM file here&lt;/a&gt;). Unfortunately, Sandcastle doesn't produce a handy index.html, but hopefully the ImmDoc.Net dev will fix the above bug soon, and I can use that to generate documentation.]&lt;br /&gt;&lt;br /&gt;[Edit 2: The ImmDoc.Net dev fixed the bug, so &lt;a href="http://higherlogics.net/sasa/doc/"&gt;here is a clearer set of docs&lt;/a&gt; with a proper index. 
The previous link will still work for now.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-1400198024378135720?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/1400198024378135720/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=1400198024378135720' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1400198024378135720'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1400198024378135720'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/09/sasa-v092-released.html' title='Sasa v0.9.2 Released'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-5366526537320903342</id><published>2009-08-16T15:32:00.001-04:00</published><updated>2010-05-21T12:18:15.502-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Utilities'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>SlimTimer Timesheet Processing Utility</title><content type='html'>I use the very respectable &lt;a href="http://www.slimtimer.com/"&gt;SlimTimer&lt;/a&gt; to help me track my hours. 
Unfortunately, while they display a consistent total across all reports, the entries in each report do not necessarily add up to that total due to the fractional time units and the rounding involved.&lt;br /&gt;&lt;br /&gt;The accounting department tends to frown upon inconsistencies like this, no matter the reason. My process thus far has been to simply export a full timesheet with the report settings specifying time units as precisely as possible, and then perform the sum myself on the resulting chart.&lt;br /&gt;&lt;br /&gt;This got a bit tedious, so I wrote a program to compile the necessary tallies over however many timesheet files I wanted to process. &lt;a href="http://higherlogics.net/software/timeparse-release.zip"&gt;The source and binaries are available for download&lt;/a&gt;. Simply drag and drop any number of timesheets generated from SlimTimer, and the utility will generate a new csv file for each timesheet, with the suffix "-out.csv".&lt;br /&gt;&lt;br /&gt;The output csv file is formatted for my invoice structure, but adjusting it for other formats is fairly trivial for anyone familiar with C#.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-5366526537320903342?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/5366526537320903342/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=5366526537320903342' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5366526537320903342'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5366526537320903342'/><link rel='alternate' 
type='text/html' href='http://higherlogics.blogspot.com/2009/08/slimtimer-timesheet-processing-utility.html' title='SlimTimer Timesheet Processing Utility'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-9201890960595109663</id><published>2009-06-23T15:36:00.000-04:00</published><updated>2011-09-26T01:58:06.953-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='web programming'/><category scheme='http://www.blogger.com/atom/ns#' term='tagless interpreters'/><category scheme='http://www.blogger.com/atom/ns#' term='EDSL'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='mobile code'/><title type='text'>Mobile Code in C# via Finally Tagless Interpreters</title><content type='html'>Awhile back &lt;a href="http://higherlogics.blogspot.com/2008/09/mostly-tagless-interpreters-in-c.html"&gt;I described an idea&lt;/a&gt; to transparently execute code server-side or client-side given the same program. I've finally gotten around to implementing this using my encoding for type constructor polymorphism in C#.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.higherlogics.net/dev/MobileTest.aspx"&gt;Here is an example you can run&lt;/a&gt; from &lt;a href="http://lambda-the-ultimate.org/node/2438"&gt;the original paper&lt;/a&gt; showcasing exponentiation which can execute transparently either server-side or client-side. 
The server-side code is intentionally limited to bases less than 100 and exponents less than 20.&lt;br /&gt;&lt;br /&gt;Here is some simplified code that defines an exponentiation function:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;void Build&amp;lt;B, R&amp;gt;(B _)&lt;br /&gt;    where B : ISymantics&amp;lt;B&amp;gt;, new()&lt;br /&gt;{&lt;br /&gt;    var e = _.Lambda&amp;lt;int, Func&amp;lt;int, int&amp;gt;&amp;gt;(&lt;br /&gt;            x =&amp;gt; _.Fix&amp;lt;int, int&amp;gt;(self =&amp;gt; _.Lambda&amp;lt;int, int&amp;gt;(n =&amp;gt;&lt;br /&gt;                 _.If(_.Lte(n, _.Int(0)), () =&amp;gt; _.Int(1),&lt;br /&gt;                                          () =&amp;gt; _.Mul(x, _.Apply(self, _.Add(n, _.Int(-1))))))));&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The parameter "_" is the tagless interpreter, and its type is ISymantics&amp;lt;B&amp;gt;, which defines the semantics of the language being interpreted:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// The abstract base class of a typed expression.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The type of the encapsulated expression.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="B"&amp;gt;The brand of the interpreter that can interpret this expression.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public abstract class E&amp;lt;T, B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;, new() { }&lt;br /&gt;&lt;br /&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// The interface for interpreters.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="B"&amp;gt;The brand of the interpreter. 
This should be unique for each interpreter.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public interface ISymantics&amp;lt;B&amp;gt;&lt;br /&gt;    where B : ISymantics&amp;lt;B&amp;gt;, new()&lt;br /&gt;{&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Construct an integer.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    /// &amp;lt;param name="i"&amp;gt;The integer value.&amp;lt;/param&amp;gt;&lt;br /&gt;    /// &amp;lt;returns&amp;gt;An expression encapsulating the value.&amp;lt;/returns&amp;gt;&lt;br /&gt;    E&amp;lt;int, B&amp;gt; Int(int i);&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Add two expressions.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    /// &amp;lt;param name="left"&amp;gt;The left expression.&amp;lt;/param&amp;gt;&lt;br /&gt;    /// &amp;lt;param name="right"&amp;gt;The right expression.&amp;lt;/param&amp;gt;&lt;br /&gt;    /// &amp;lt;returns&amp;gt;The sum of the two expressions&amp;lt;/returns&amp;gt;&lt;br /&gt;    E&amp;lt;int, B&amp;gt; Add(E&amp;lt;int, B&amp;gt; left, E&amp;lt;int, B&amp;gt; right);&lt;br /&gt;&lt;br /&gt;    ...&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The code-behind for the .aspx page interprets this code twice, once with a LINQ-based compiler for server-side execution, and once with a JavaScript backend to generate the client-side program.&lt;br /&gt;&lt;br /&gt;The client-side program depends on only one utility function to compute the fixed point of a given function:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;function fix(f) {&lt;br /&gt;    return function self(x) { return f(self)(x); };&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Clearly, programs written this way are fairly verbose, but it's straightforward to define wrapper structs around E&amp;lt;T, B&amp;gt;, such as Eint, Ebool, etc., to provide overloaded operators such that embedded programs look almost like ordinary C#. 
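&lt;br /&gt;&lt;br /&gt;As a rough sketch of what such a wrapper might look like (this is not the code from the source archive; Eint&amp;lt;B&amp;gt;, Exp, Interp, Eval and Val are names made up for illustration, and the interface is trimmed to Int and Add so the sample stands alone), the new() constraint on the brand lets a static operator obtain an interpreter instance:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;

// Trimmed-down copies of the types above, so the sketch compiles on its own.
public abstract class E&amp;lt;T, B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;, new() { }

public interface ISymantics&amp;lt;B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;, new()
{
    E&amp;lt;int, B&amp;gt; Int(int i);
    E&amp;lt;int, B&amp;gt; Add(E&amp;lt;int, B&amp;gt; left, E&amp;lt;int, B&amp;gt; right);
}

// Hypothetical wrapper: forwards ordinary C# syntax to the interpreter B.
public struct Eint&amp;lt;B&amp;gt; where B : ISymantics&amp;lt;B&amp;gt;, new()
{
    // One interpreter instance per brand, created via the new() constraint.
    static readonly B Interp = new B();

    public readonly E&amp;lt;int, B&amp;gt; Exp;
    public Eint(E&amp;lt;int, B&amp;gt; exp) { Exp = exp; }

    // Integer literals are lifted into the embedded language.
    public static implicit operator Eint&amp;lt;B&amp;gt;(int i)
    {
        return new Eint&amp;lt;B&amp;gt;(Interp.Int(i));
    }

    // "a + b" builds an Add node instead of adding machine integers.
    public static Eint&amp;lt;B&amp;gt; operator +(Eint&amp;lt;B&amp;gt; left, Eint&amp;lt;B&amp;gt; right)
    {
        return new Eint&amp;lt;B&amp;gt;(Interp.Add(left.Exp, right.Exp));
    }
}

// Hypothetical direct evaluator, used only to make the sketch runnable.
public sealed class Eval : ISymantics&amp;lt;Eval&amp;gt;
{
    public sealed class Val : E&amp;lt;int, Eval&amp;gt;
    {
        public readonly int Value;
        public Val(int value) { Value = value; }
    }

    public E&amp;lt;int, Eval&amp;gt; Int(int i) { return new Val(i); }

    public E&amp;lt;int, Eval&amp;gt; Add(E&amp;lt;int, Eval&amp;gt; left, E&amp;lt;int, Eval&amp;gt; right)
    {
        return new Val(((Val)left).Value + ((Val)right).Value);
    }
}

public static class Demo
{
    public static void Main()
    {
        Eint&amp;lt;Eval&amp;gt; x = 2;          // implicit conversion calls Eval.Int
        Eint&amp;lt;Eval&amp;gt; sum = x + 40;   // operator + calls Eval.Add
        Console.WriteLine(((Eval.Val)sum.Exp).Value); // prints 42
    }
}&lt;/pre&gt;&lt;br /&gt;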
The current ISymantics interface is also limited to single-arg functions, but it's possible to add more overloads to support multi-arg functions.&lt;br /&gt;&lt;br /&gt;Of course, it's already possible to translate C# expressions to JavaScript if we write a LINQ expression visitor. However, LINQ expressions are in a sense more limited than operating on this lifted data type. For instance, it's not possible to translate a static function to JavaScript using LINQ expressions, but it is possible using the final tagless representation.&lt;br /&gt;&lt;br /&gt;It's an open question whether such final tagless interpreters can be made usable enough for real programming. However, the techniques employed here, in particular the structure which enables semi-safe type constructor polymorphism, are definitely useful for other projects that require retargetable abstractions.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://higherlogics.net/src/MobileSharp-src.zip"&gt;full source code is available&lt;/a&gt;, and includes a simple C# evaluator, LINQ-based compiler, a JavaScript backend, and an interpreter that computes the depth of the expression tree, which was also an example in the tagless paper. Also included is a partially working partial evaluator. This proved the trickiest to implement, and the only remaining problem appears to be recursive partial evaluation.&lt;br /&gt;&lt;br /&gt;As future work, the partial evaluator can be parameterized by the type of compiler, so it can partially evaluate managed code and JavaScript programs; it is currently needlessly tied to the managed code interpreter. I also plan to implement the aforementioned wrapper structs and other enhancements to improve the clarity of embedded programs, hopefully to the point where they are usable for real programming.&lt;br /&gt;&lt;br /&gt;[Edit 2009-08-16: it just came to my attention that the JS failed on the first run, but succeeded after a server-side execution was performed. 
This was a DOM manipulation bug, and the source and server code have been updated to fix this.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-9201890960595109663?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/9201890960595109663/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=9201890960595109663' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/9201890960595109663'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/9201890960595109663'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/06/mobile-code-in-c-via-finally-tagless.html' title='Mobile Code in C# via Finally Tagless Interpreters'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2229915149127578552</id><published>2009-05-22T11:13:00.000-04:00</published><updated>2010-05-20T20:47:40.931-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Sasa v0.9 Released</title><content type='html'>&lt;a href="https://sourceforge.net/projects/sasa/"&gt;Sasa v0.9&lt;/a&gt; has been released. 
See &lt;a href="https://sourceforge.net/project/shownotes.php?group_id=167597&amp;release_id=684422"&gt;the changelog&lt;/a&gt; for a detailed description of the changes. &lt;a href="http://higherlogics.blogspot.com/2009/02/sasa-reborn.html"&gt;Here is the original release description&lt;/a&gt; for Sasa v0.8. This post will describe only changes from v0.8.&lt;br /&gt;&lt;br /&gt;Backwards-incompatible changes:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Renamed Sasa.Collections.List to Sasa.Collections.Seq to avoid clashes with System.Collections.Generic.List&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Restructured the list operators to better support chaining&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Useful additions include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Sasa.Weak&amp;lt;T&amp;gt; which wraps WeakReference&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Additional string processing functions, like StringExt.SliceEquals&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Array-processing combinators under Sasa.Collections.Arrays (Slice, Dup, etc.)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Stream functions under Sasa.IO (CopyTo, ToArray)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Support for MIME decoding and encoding for MailMessage parsing&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Bugfixes:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Better conformance to RFCs for Pop3Client and MailMessage parsing&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Concurrency bugfix in Sasa.Lazy.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;MIME MailMessage parsing and the Pop3Client are already in use in production code, and conformance appears adequate after hundreds of processed messages.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Experimental Additions&lt;/h2&gt;&lt;br /&gt;There are a few components of this release that I would deem "experimental".&lt;br /&gt;&lt;br /&gt;Sasa.Dynamics is intended as a "safe reflection" facility. Basically, this is a "type-case" construct as found in the "intensional type analysis" literature. 
Any reflective algorithm should be implementable via ITypeFunc&amp;lt;R&amp;gt;, and you cannot forget to handle a particular case. This interface basically factors out object traversal from the actual reflection algorithm.&lt;br /&gt;&lt;br /&gt;Under Sasa.Web and Sasa.Web.Asp, I've included a URL-safe Base64 encoder/decoder, Sasa.Web.Url64, and a generic ASP.NET Page base class that is immune to clickjacking and CSRF.&lt;br /&gt;&lt;br /&gt;I first proposed this idea on the &lt;a href="http://www.eros-os.org/pipermail/cap-talk/2009-April/012607.html"&gt;capability security mailing list&lt;/a&gt;. I'm not completely satisfied with the current incarnation, but it's an interesting idea I'm still toying with.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2229915149127578552?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2229915149127578552/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2229915149127578552' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2229915149127578552'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2229915149127578552'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/05/sasa-v09-released.html' title='Sasa v0.9 Released'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-744249186527501500</id><published>2009-03-22T23:56:00.000-04:00</published><updated>2011-09-26T02:09:00.859-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='low-level programming'/><title type='text'>Garbage Collection Representations Continued II</title><content type='html'>The tagged pointer representation described in &lt;a href="http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-ZINC.html"&gt;Xavier Leroy's ZINC paper&lt;/a&gt; is compelling in its simplicity. Most data manipulated in typical programs is integer or pointer-based in some way, so using a 1-bit tagged representation allows unboxed integers, resulting in an efficient representation of the most common data types.&lt;br /&gt;&lt;br /&gt;I've never been satisfied with the size header in heap-allocated types though, and the two bits reserved for GC pretty much forces you to use some mark-sweep variant. Integrating something like age-oriented collection with reference counting for the old generation would require an entirely new word for every block of memory. This is a prohibitive cost in functional programs.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Removing the Size Header&lt;/h3&gt;&lt;br /&gt;Reading about BIBOP techniques used in the &lt;a href="http://people.cs.vt.edu/~scschnei/streamflow/"&gt;Streamflow allocator&lt;/a&gt; gave me the idea of using page masking to determine the block size.&lt;br /&gt;&lt;br /&gt;As in Streamflow, consider an allocator where each page obtained from the OS was partitioned into a list of equally-sized blocks. 
Each of these blocks does not need its size stored in an object header, since the size of the block is a property of the page it was allocated from, and we can easily obtain the page address by simply masking the pointer to the block. So the first block of every page is dedicated to storing the size, and all subsequent blocks are placed in the free list.&lt;br /&gt;&lt;br /&gt;This works for structured data allocated in fixed-sized blocks, but unstructured data is dynamic in extent and can span pages. Unstructured data must also carry the GC header in the first word, even when spanning pages. However, we know that arrays and strings are the only primitive unstructured data described in ZINC, and they must now both carry their size information as part of the structure. We can thus easily identify "small" strings and arrays that could fit into a fixed-sized block.&lt;br /&gt;&lt;br /&gt;As a possible space optimization, such "small" arrays and strings don't need to be accompanied by a size field. We can perform a simple test to distinguish "small" arrays: if the array pointer is more than one word off from a page-aligned address, it is structured data, since unstructured data that spans pages always starts on a page address + GC header size.&lt;br /&gt;&lt;br /&gt;We now have a full 24 free bits in the block header, which we can reserve for use by the GC. 24 bits is enough to employ reference counting, or a hybrid mark-sweep for the nursery, reference counting for the old generation as in age-oriented collection. The GC to employ is now completely at the discretion of the runtime, and can even be changed while the program is running.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;The Problem&lt;/h3&gt;&lt;br /&gt;There is a downside of course. Streamflow allocates fixed-sized blocks for sizes ranging from 4 bytes up to 2kB. Considering the above approach uses the first block to store the block size, we would waste up to 2kB on the larger block sizes. 
We could amortize this cost by allocating more pages and spreading the cost across them, but then we lose the simple page masking. We could avoid large fixed-sized blocks, maybe cutting them off at 512 bytes, but then we would still need some way to store the size for these "medium-sized" blocks.&lt;br /&gt;&lt;br /&gt;We can't simply use a single word of the page to store the size, as the remaining bytes would then no longer divide evenly into fixed-size blocks. Again, this is only a problem for larger block sizes: you can't split the remaining 4092 bytes of a 4kB page into two 2kB blocks.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Another Approach&lt;/h3&gt;&lt;br /&gt;As a general solution to these problems, we can eliminate the size tag on the page entirely by maintaining a sorted array of address ranges and the block size of each range. When we need to discover the size of a block, we perform a binary search to find the address range the address falls in, and extract the size. This operation is O(log n), where n is the number of distinct address ranges, i.e. sequences of consecutive pages allocated to the same block size. I would expect the number of such page regions to be under 50, so the logarithmic cost should be very low in practice, though the additional indirections accessing the array can be costly due to cache misses.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Summary&lt;/h3&gt;&lt;br /&gt;A block header representation that provides for GC independence would be compelling. The page-level size header is promising, but clearly has problems. Hopefully, some clever person will devise a general way to handle the larger fixed-size blocks without wasting too much space.&lt;br /&gt;&lt;br /&gt;The solution to the problems with the page-level size header using a binary search is a mixed blessing. The additional runtime cost in discovering block size may be prohibitive, since it must be performed during GC traversal, and on every free operation. 
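&lt;br /&gt;&lt;br /&gt;For concreteness, the binary-search lookup described under "Another Approach" could be sketched as follows. This is only an illustration in C#: PageRange and BlockSizeTable are hypothetical names, a managed array stands in for the runtime's table, and a real allocator would do this in native code:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// A run of consecutive pages that are all carved into blocks of one size.
public struct PageRange
{
    public readonly ulong Start;     // first byte of the region (inclusive)
    public readonly ulong End;       // one past the last byte (exclusive)
    public readonly int BlockSize;   // size in bytes of every block in the region

    public PageRange(ulong start, ulong end, int blockSize)
    {
        Start = start;
        End = end;
        BlockSize = blockSize;
    }
}

public static class BlockSizeTable
{
    // 'ranges' must be sorted by Start and must not overlap.
    // Returns the block size for 'address', or -1 if no range contains it.
    public static int Lookup(PageRange[] ranges, ulong address)
    {
        int lo = 0, hi = ranges.Length - 1;
        while (lo &amp;lt;= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (address &amp;lt; ranges[mid].Start) hi = mid - 1;
            else if (address &amp;gt;= ranges[mid].End) lo = mid + 1;
            else return ranges[mid].BlockSize;
        }
        return -1;
    }

    public static void Main()
    {
        // Made-up addresses, assuming 4kB pages.
        PageRange[] table =
        {
            new PageRange(0x10000, 0x12000, 16),   // two pages of 16-byte blocks
            new PageRange(0x12000, 0x13000, 64),   // one page of 64-byte blocks
        };
        System.Console.WriteLine(Lookup(table, 0x12040)); // prints 64
    }
}&lt;/pre&gt;&lt;br /&gt;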
The costs of free might possibly be amortized somehow when done in bulk, but the cost incurred by GC traversal seems much more challenging to eliminate.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-744249186527501500?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/744249186527501500/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=744249186527501500' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/744249186527501500'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/744249186527501500'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/03/garbage-collection-representations.html' title='Garbage Collection Representations Continued II'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-4553563269608097476</id><published>2009-02-24T10:08:00.001-05:00</published><updated>2011-09-26T02:17:10.976-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' 
term='libraries'/><category scheme='http://www.blogger.com/atom/ns#' term='LINQ'/><category scheme='http://www.blogger.com/atom/ns#' term='object oriented programming'/><title type='text'>Sasa Reborn!</title><content type='html'>My .NET &lt;a href="http://sourceforge.net/projects/Sasa"&gt;Sasa&lt;/a&gt; library has fallen by the wayside as I experimented with translating various functional idioms in my &lt;a href="http://sourceforge.net/projects/FPSharp"&gt;FP#&lt;/a&gt; library. Reading up on what a few other generic class libraries have been experimenting with, like Mono Rocks, spurred me to putting those experiments to use and updating Sasa. I significantly simplified a lot of the code, documented every class and method, and generalized as much as possible. The license is LGPL v2, and you can download the source via svn:&lt;br /&gt;&lt;pre&gt;svn co https://sasa.svn.sourceforge.net/svnroot/sasa/tags/v0.8 sasa&lt;/pre&gt;&lt;br /&gt;&lt;h3&gt;Sasa Core v0.8&lt;/h3&gt;&lt;br /&gt;A set of useful extensions to core System classes and some useful classes for high assurance development.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Named tuple types: Pair, Triple, Quad.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Either types, representing one of many possible values. There are Either types for 2, 3, and 4 parameters, mimicking the Pair, Triple, and Quad structure. Tuples are "product" types, while Either is a "sum" type, and products and sums are duals. Since products are useful, I figured variously sized sum types might also find some uses. 
Time will tell.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Lazy type, for lazily computed values.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;An immutable list.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Various Ruby-like extensions to core types, like generators for int.UpTo, int.DownTo, string.IsNullOrEmpty, string.Slice, etc.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Useful extensions to IEnumerable.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;"Zip" functions from Haskell for anonymous types and tuple types.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;A NonNull type which decorates method parameters and ensures those parameters are not null; if NonNull is used pervasively, you can ensure that your program is free of NullReferenceExceptions.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;An Option type indicating values which may be null. Unlike System.Nullable, this works for class types.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Function currying extensions, and extensions to lift multi-parameter functions to single-parameter tupled functions&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Some convenience extensions to IDictionary.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;h3&gt;Sasa.Linq&lt;/h3&gt;&lt;br /&gt;A stand-alone assembly for Linq development.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Default IQueryProvider and IQueryable implementations&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Generic ExpressionVisitor base class.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;IdentityVisitor which provides default implementations for all NodeTypes and performs no modifications to the expression, just returning it as-is.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;ErrorVisitor which throws NotSupportedException for all NodeTypes.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;h3&gt;Sasa.Serialization&lt;/h3&gt;&lt;br /&gt;A stand-alone assembly with serialization classes.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Provides a compact serializer which requires only ReflectionPermission, and not SecurityPermission like the System classes do; this serializer can therefore be used in medium trust environments. 
The serializer currently requires a little more discipline from the developer to use correctly, but space savings of 100-200% are typical.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;An experimental unsafe, highly compact binary serializer.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;h3&gt;Sasa.Net&lt;/h3&gt;&lt;br /&gt;A library providing missing functionality under System.Net.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A Pop3Client class.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;MailMessage parsing.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;h3&gt;Sasa.CodeContracts&lt;/h3&gt;&lt;br /&gt;Microsoft Research is &lt;a href="http://research.microsoft.com/en-us/projects/contracts/"&gt;developing a design by contract library&lt;/a&gt; which they hope to release with .NET 4.0. It's a fairly sophisticated piece of software, that integrates with a static verification tool called Pex. The analysis tools can detect contract violations at compile-time, and even generate test cases for each violation.&lt;br /&gt;&lt;br /&gt;Unfortunately, their license forbids commercial application of the pre-release library, even if you just want to utilize runtime contract checking.&lt;br /&gt;&lt;br /&gt;Sasa.CodeContracts is a Microsoft API-compatible implementation of the CodeContracts library. &lt;em&gt;This is only a runtime library, and does &lt;strong&gt;not&lt;/strong&gt; provide the Pex integration with static analysis and automated test generation&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;Precondition checking is enabled, but postconditions and object invariants require CIL re-writing, so they are not currently supported. I will be looking into using Mono.Cecil to rewrite the IL to support post-conditions and invariants in the future.&lt;br /&gt;&lt;h2&gt;TODO for v1.0&lt;/h2&gt;&lt;br /&gt;There are a few items remaining before v1.0 is released, but the library is usable as-is. Notably missing is MIME parsing for MailMessage, which will be added for v1.0. 
Also serialization will get improved safety almost on-par with standard framework serialization, and the compaction will be user-customizable for even more space savings in any given program.&lt;br /&gt;&lt;h2&gt;Future Work&lt;/h2&gt;&lt;br /&gt;The Sasa API is fully documented with accompanying XML for code completion. Comments on the clarity of the API and documentation are welcome! Some tutorials on using these features safely are coming as well.&lt;br /&gt;&lt;br /&gt;I'm dissatisfied with a few other approaches being pursued on the CLR, including:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Current approaches to parallel and concurrent programming, even Microsoft's Parallel Extensions and the Concurrency and Coordination Runtime.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;CLR security is far too coarse-grained and pretty much unusable.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Efficient async I/O is too difficult to reason about (though the CCR does make it easier).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;In lieu of a Pex static analysis, there is the possibility of QuickCheck-like test suites derived from CodeContract annotations.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Keep an eye on this space for what I come up with.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-4553563269608097476?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/4553563269608097476/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=4553563269608097476' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4553563269608097476'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4553563269608097476'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/02/sasa-reborn.html' title='Sasa Reborn!'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2507376174419726390</id><published>2009-02-03T19:07:00.000-05:00</published><updated>2011-09-26T01:55:57.982-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>The cost of type tests and casts in C#</title><content type='html'>Awhile back I &lt;a href="http://higherlogics.blogspot.com/2008/10/vtable-dispatching-vs-runtime-tests-and.html"&gt;ran some tests comparing the dispatching performance of runtime tests+casts against double dispatch&lt;/a&gt;. 
Turned out runtime type tests and casting were noticeably faster than dispatching, probably because they avoid more pipeline stalls.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://stefan.rusek.org/Posts/The-3-cast-operators-in-C/7/"&gt;Unfortunately, there is a "common wisdom"&lt;/a&gt; in the .NET world that an "is" test followed by an "as" cast is performing two casts, and in fact one should simply perform the "as" cast then check the result against null:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// prefer this form&lt;br /&gt;string a = o as string;&lt;br /&gt;if (a != null)&lt;br /&gt;{&lt;br /&gt;  Console.WriteLine(a);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// to this form:&lt;br /&gt;if (o is string)&lt;br /&gt;{&lt;br /&gt;  string a = o as string;&lt;br /&gt;  Console.WriteLine(a);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;In fact, that's not the case, as any compiler worth its salt will coalesce the two tests into a single cast and branch operation. I took the tests from the above dispatching and altered them to perform the cast-and-null check, then I ran the tests again with the original is-then-as form. The latter form was about 6% faster on every timing run.&lt;br /&gt;&lt;br /&gt;There's obviously some optimization being done here, but the lesson is: don't try to outsmart the compiler. In general, just write code the safe way and let the compiler optimize it for you. 
It's safer to perform a test then cast within a delimited scope like an if-statement, than to let the possibly null variable float around in the outer scope where you might use it inadvertently later in the method or during refactoring.&lt;br /&gt;&lt;br /&gt;If performance turns out to be an issue, profile before trying these sorts of low-level "optimizations", because you might be surprised at the results.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2507376174419726390?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2507376174419726390/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2507376174419726390' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2507376174419726390'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2507376174419726390'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/02/cost-of-type-tests-and-casts-in-c.html' title='The cost of type tests and casts in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-5859203662423114661</id><published>2009-01-29T21:41:00.000-05:00</published><updated>2009-11-16T09:21:07.891-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EDSL'/><category 
scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='code generation'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Embedded Stack Language for .NET - Redux</title><content type='html'>Awhile ago, &lt;a href="http://higherlogics.blogspot.com/2008/11/embedded-stack-language-for-net.html"&gt;I had posted&lt;/a&gt; about an embedding of a stack language in C#. The type signatures of the functions and the stack object encoded the consumption and production of stack values, so if your program compiled, it ran correctly.&lt;br /&gt;&lt;br /&gt;Unfortunately, the prior structure had a safety problem when generating code which I noted, but didn't have time to address.&lt;br /&gt;&lt;br /&gt;The new structure provided below does not have the safety problem, and any functions that compile are guaranteed to execute correctly. I have also altered the style to emphasize the row variable representing the "rest of the record" which the operation knows nothing about. The row variable is denoted by "_".&lt;br /&gt;&lt;br /&gt;This is still a fairly limited embedding, but I have added a few convenience functions, and may yet add more. 
Here is a sample program:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;var d = new DynamicMethod("test", typeof(void), null);&lt;br /&gt;var s =&lt;br /&gt;  1.Load()                   // load constant: { int }&lt;br /&gt;   .Int(2)                   // load constant: { int, int }&lt;br /&gt;   .Add()                    // add: { int, int } -&gt; { int }&lt;br /&gt;   .Do(Console.WriteLine)    // output top: { int } -&gt; { }&lt;br /&gt;   .String("Test out")       // load string: { } -&gt; { string }&lt;br /&gt;   .Do(Console.WriteLine);   // output top: { string } -&gt; { }&lt;br /&gt;s.Compile(d.GetILGenerator());// only compile empty stacks: { }&lt;br /&gt;d.Invoke(null, null);&lt;/pre&gt;&lt;br /&gt;Thankfully, C# can infer the types used so we can avoid any superfluous type annotations. The code generation functions are a little more involved however:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Abstracts the stack structure used to hold locals, etc.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="_"&amp;gt;The rest of the stack.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;/// &amp;lt;typeparam name="T"&amp;gt;The top of the stack.&amp;lt;/typeparam&amp;gt;&lt;br /&gt;public sealed class Stack&amp;lt;_, T&amp;gt;&lt;br /&gt;{&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Defer the code generation by enclosing the opcodes&lt;br /&gt;    /// in a closure.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    internal Action&amp;lt;ILGenerator&amp;gt; gen;&lt;br /&gt;    public Stack(Action&amp;lt;ILGenerator&amp;gt; gen)&lt;br /&gt;    {&lt;br /&gt;        this.gen = gen;&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// An empty value aka void.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;public struct Unit&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;/// &amp;lt;summary&amp;gt;&lt;br /&gt;/// Statically typed stack 
operations.&lt;br /&gt;/// &amp;lt;/summary&amp;gt;&lt;br /&gt;public static class Stack&lt;br /&gt;{&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Load an int on a new stack.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;Unit, int&amp;gt; Load(this int value)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;Unit, int&amp;gt;(il =&amp;gt;&lt;br /&gt;                   il.Emit(OpCodes.Ldc_I4, value));&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Load a string on a new stack.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;Unit, string&amp;gt; Load(this string value)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;Unit, string&amp;gt;(il =&amp;gt;&lt;br /&gt;                   il.Emit(OpCodes.Ldstr, value));&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Load a string on an existing stack.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, string&amp;gt; String&amp;lt;_, T&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;_, T&amp;gt; stack, string value)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, string&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.Emit(OpCodes.Ldstr, value);&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Duplicate the top of the stack.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, T&amp;gt; Dup&amp;lt;_, T&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;_, T&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, T&amp;gt;(il =&amp;gt;&lt;br /&gt;                   
{ stack.gen(il); il.Emit(OpCodes.Dup); });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Apply a function to the top of the stack, replacing&lt;br /&gt;    /// the top element with the return value of the function.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;_, R&amp;gt; Apply&amp;lt;_, T, R&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;_, T&amp;gt; stack, Func&amp;lt;T, R&amp;gt; target)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;_, R&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.EmitCall(OpCodes.Call, target.Method, null);&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Apply an action to the top of the stack consuming&lt;br /&gt;    /// the top value.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;_, Unit&amp;gt; Do&amp;lt;_, T&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;_, T&amp;gt; stack,&lt;br /&gt;                  Action&amp;lt;T&amp;gt; target)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;_, Unit&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.EmitCall(OpCodes.Call, target.Method, null);&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Load the value at the given array's index on to the stack.&lt;br /&gt;    /// The array sits below the int index, which is the top element.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;_, T&amp;gt; LoadArrayIndex&amp;lt;_, T&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;Stack&amp;lt;_, T[]&amp;gt;, int&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;_, T&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.Emit(OpCodes.Ldelem, typeof(T));&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    
/// Check that the top element is of type U.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;_, U&amp;gt; IsInstance&amp;lt;_, T, U&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;_, T&amp;gt; stack)&lt;br /&gt;        where T : class&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;_, U&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.Emit(OpCodes.Isinst, typeof(U));&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Load an int onto an existing stack.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, int&amp;gt; Int&amp;lt;_, T&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;_, T&amp;gt; stack, int i)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, int&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.Emit(OpCodes.Ldc_I4, i);&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Add the two elements at the top of the stack.&lt;br /&gt;    /// WARNING: T must be a primitive numeric type; the add opcode does not call overloaded operators.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static Stack&amp;lt;_, T&amp;gt; Add&amp;lt;_, T&amp;gt;(&lt;br /&gt;                  this Stack&amp;lt;Stack&amp;lt;_, T&amp;gt;, T&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        return new Stack&amp;lt;_, T&amp;gt;(il =&amp;gt;&lt;br /&gt;        {&lt;br /&gt;            stack.gen(il);&lt;br /&gt;            il.Emit(OpCodes.Add);&lt;br /&gt;        });&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Return from a function.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static void Return&amp;lt;_, T&amp;gt;(this Stack&amp;lt;_, T&amp;gt; 
stack,&lt;br /&gt;                                    ILGenerator il)&lt;br /&gt;    {&lt;br /&gt;        stack.gen(il);&lt;br /&gt;        il.Emit(OpCodes.Ret);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    /// &amp;lt;summary&amp;gt;&lt;br /&gt;    /// Compile the code so long as the top of the stack&lt;br /&gt;    /// has type Unit.&lt;br /&gt;    /// &amp;lt;/summary&amp;gt;&lt;br /&gt;    public static void Compile&amp;lt;_&amp;gt;(this Stack&amp;lt;_, Unit&amp;gt; stack,&lt;br /&gt;                                  ILGenerator il)&lt;br /&gt;    {&lt;br /&gt;        stack.Return(il);&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-5859203662423114661?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/5859203662423114661/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=5859203662423114661' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5859203662423114661'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5859203662423114661'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2009/01/embedded-stack-language-for-net-redux.html' title='Embedded Stack Language for .NET - Redux'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-4469501451476431461</id><published>2008-11-27T22:41:00.000-05:00</published><updated>2011-09-26T02:11:31.806-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='reflection'/><category scheme='http://www.blogger.com/atom/ns#' term='Sasa'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><title type='text'>Compact, Declarative Serialization</title><content type='html'>&lt;a href="http://higherlogics.blogspot.com/2008/11/reflection-attributes-and.html"&gt;A few posts back&lt;/a&gt;, I hinted at using parameterization as an alternative to metadata. I've just written a serialization interface using this technique, and a binary serializer/deserializer to demonstrate the inherent tradeoffs.&lt;br /&gt;&lt;br /&gt;You can inspect &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Serialization/Compact/"&gt;the pair of interfaces&lt;/a&gt; required, and I implemented a &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Serialization/Compact/Binary/"&gt;binary serializer/deserializer pair&lt;/a&gt; as an example. &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Serialization/Compact/ICompactSerializable.cs?view=markup"&gt;ICompactSerializable&lt;/a&gt; is implemented for each serializable object. It's essentially a declarative method, which describes the sequential, internal structure of the object. 
It's simple and fast, since it provides native speed access to an object's fields without reflection, and with no need for metadata.&lt;br /&gt;&lt;br /&gt;Of course, the obvious downside is that clients must describe the internal structure themselves via &lt;a href="http://sasa.svn.sourceforge.net/viewvc/sasa/trunk/Sasa.Serialization/Compact/ICompactSerializer.cs?view=markup"&gt;ICompactSerializer&lt;/a&gt;, and any refactoring must take care not to reorder the sequence of calls. The upshot is that serialization and deserialization are insanely fast compared to ordinary reflection-driven serialization, the binary is far more compact, and clients have full control over versioning and schema upgrade without additional, complicated infrastructure (surrogates, surrogate selectors, binders, deserialization callbacks, etc.).&lt;br /&gt;&lt;br /&gt;These serializers are very basic, but the principle is sound. My intention here is only to demonstrate that parameterization can often substitute for the typical approach of relying heavily on metadata and reflection. This is only one possible design of course, and other tradeoffs are possible.&lt;br /&gt;&lt;br /&gt;For instance, each method in ICompactSerializer could also accept a string naming the field, which could make the Serialize call invariant to the ordering of fields, and thus more robust against refactoring; this brings the technique much closer to the rich structural information available via reflection, but without the heavy metadata infrastructure of the .NET framework.&lt;br /&gt;&lt;br /&gt;The Serialize method of ICompactSerializable can also easily be derived by a compiler, just as the Haskell compiler can automatically derive some type class instances.&lt;br /&gt;&lt;br /&gt;This application is so basic that the user wouldn't even need to specify the field names manually, as the compiler could insert them all automatically. 
Consider how much metadata, and how much slow, reflection-based code can be replaced by fast compiler-derived methods using such techniques. Projects like NHibernate wouldn't need to generate code to efficiently get and set object properties, since full native speed methods are provided by the compiler.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-4469501451476431461?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/4469501451476431461/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=4469501451476431461' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4469501451476431461'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4469501451476431461'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/11/compact-declarative-serialization.html' title='Compact, Declarative Serialization'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7418097810694122359</id><published>2008-11-15T12:44:00.000-05:00</published><updated>2011-09-26T02:09:29.711-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='logic puzzles'/><title type='text'>The Chicken and the Egg - Redux</title><content type='html'>There is still considerable 
skepticism regarding &lt;a href="http://higherlogics.blogspot.com/2008/05/chicken-and-egg-inductive-analysis.html"&gt;my conclusion that the chicken comes first&lt;/a&gt;. Many of the objections are simply due to different interpretations of the question, interpretations which I consider &lt;a href="http://en.wikipedia.org/wiki/Chicken-and-egg_problem"&gt;unfaithful to the original purpose of the chicken-egg dilemma&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Fundamentally, this question is supposed to represent a causality dilemma. When there is a causal dependency between any two objects, A and B, we can ask ourselves, "which came first, A or B?" An object C could have caused B, and thus triggered the recursive relation C→B→A→B→A... Of course, it could just as easily have been A that was caused first.&lt;br /&gt;&lt;br /&gt;To properly answer this question, we must reduce the abstract objects A and B to concrete objects and apply our scientific knowledge to ground the recursion. Using chickens and eggs as our objects, we have to precisely define what chickens and what eggs to consider.&lt;br /&gt;&lt;br /&gt;The question "which came first, chickens or any type of egg?" is not a causal dilemma at all, and further is not faithful to the original purpose of the question. The ancient Greeks that first posed this question had no concept of evolution and no inkling that chickens could have any relationship to other egg types, so they would not have asked, "which came first, the chicken or the fish egg?" To the ancient Greeks, such a question isn't a dilemma, it's complete nonsense. 
Thus, the paper linked in my last post is invalid.&lt;br /&gt;&lt;br /&gt;The more precise and faithful phrasing of the dilemma is, "which came first, the chicken or the chicken egg?", or more generally, "which came first, species S&lt;sub&gt;n&lt;/sub&gt; or its reproductive mechanism R&lt;sub&gt;n&lt;/sub&gt;?"&lt;br /&gt;&lt;br /&gt;S&lt;sub&gt;n&lt;/sub&gt; and R&lt;sub&gt;n&lt;/sub&gt; are in the appropriate recursive relation to form a causal dilemma, and we can ground it in biology and chemistry. The very first single-celled organism did not form by mitosis, thus the first single-celled organism, S&lt;sub&gt;∅&lt;/sub&gt;, preceded its own reproduction mechanism, R&lt;sub&gt;∅&lt;/sub&gt;. Thus, the dominoes fall, i.e. the first egg-laying species was not hatched from an egg, thus it too preceded its own reproductive mechanism, and so on, ad infinitum.&lt;br /&gt;&lt;br /&gt;Eventually, we reach chickens and chicken eggs, and the conclusion is simply that the chicken necessarily came before its egg.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7418097810694122359?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7418097810694122359/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7418097810694122359' title='29 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7418097810694122359'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7418097810694122359'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/11/chicken-and-egg-redux.html' title='The Chicken and the Egg - 
Redux'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>29</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7520562012295578975</id><published>2008-11-07T12:28:00.000-05:00</published><updated>2011-09-26T02:07:14.441-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EDSL'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><category scheme='http://www.blogger.com/atom/ns#' term='code generation'/><title type='text'>Embedded Stack Language for .NET</title><content type='html'>Parametric polymorphism is a powerful tool for constructing and composing safe programs. To demonstrate this, I've constructed a tiny embedded stack language in C#, and I exploited C# generics, aka parametric polymorphism, to ensure that any embedded programs are type-safe, and thus, "can't go wrong".&lt;br /&gt;Moreover, this language is jitted, since I use ILGenerator of System.Reflection.Emit, and the complexity of doing this was no higher than creating an interpreter. 
This is mainly because the CLR is itself a stack-based VM, and so jitting a stack-based program to a stack-based IL is fairly natural.&lt;br /&gt;&lt;br /&gt;Here is the type-safe representation of the stack:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public struct Stack&amp;lt;R, T&amp;gt; : IDisposable&lt;br /&gt;{&lt;br /&gt;    internal ILGenerator gen;&lt;br /&gt;    public Stack(ILGenerator gen)&lt;br /&gt;    {&lt;br /&gt;        this.gen = gen;&lt;br /&gt;    }&lt;br /&gt;    public void Dispose()&lt;br /&gt;    {&lt;br /&gt;        this.Return();&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;This type encapsulates the IL output stream and uses two phantom type variables to encode the values at the top of the execution stack. The T parameter represents the top of the stack, and the R parameter represents the rest of the stack.&lt;br /&gt;&lt;br /&gt;A few more types are needed to safely represent other reflected objects used during code generation:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// Represents the 'void' type used in some functions&lt;br /&gt;public struct Unit&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;// Represents a named function with a known type structure T&amp;rarr;R.&lt;br /&gt;// This could be a void&amp;rarr;void, an int&amp;rarr;string, etc.&lt;br /&gt;public struct Function&amp;lt;T, R&amp;gt;&lt;br /&gt;{&lt;br /&gt;    internal MethodInfo method;&lt;br /&gt;    public Function(Action&amp;lt;T&amp;gt; f)&lt;br /&gt;    {&lt;br /&gt;        this.method = f.Method;&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Here's a test program demonstrating the use:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;&lt;br /&gt;class Program&lt;br /&gt;{&lt;br /&gt;    static void Main(string[] args)&lt;br /&gt;    {&lt;br /&gt;        var d = new DynamicMethod("test", typeof(void), null);&lt;br /&gt;        using (var s = new Stack&amp;lt;Unit, Unit&amp;gt;(d.GetILGenerator()))&lt;br /&gt;        {   // equivalent to: () =&gt; 
Console.WriteLine(1 + 2);&lt;br /&gt;            s.Int(1)&lt;br /&gt;            .Int(2)&lt;br /&gt;            .Add()&lt;br /&gt;            .Apply(new Function&amp;lt;int, Unit&amp;gt;(Console.WriteLine));&lt;br /&gt;        }&lt;br /&gt;        d.Invoke(null, null);&lt;br /&gt;        Console.ReadLine();&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Next I define a set of extension methods operating on typed stack objects, all parameterized by the underlying types. You can see how the top elements of the stack are specified, and how each operation consumes these elements or otherwise modifies the stack:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public static class CodeGen&lt;br /&gt;{&lt;br /&gt;    // Apply a function to its arguments; note that function args and arity&lt;br /&gt;    // are checked at compile-time&lt;br /&gt;    public static Stack&amp;lt;R, R0&amp;gt;&lt;br /&gt;      Apply&amp;lt;R, T, R0&amp;gt;(this Stack&amp;lt;R, T&amp;gt; stack, Function&amp;lt;T, R0&amp;gt; target)&lt;br /&gt;    {&lt;br /&gt;        stack.gen.EmitCall(OpCodes.Call, target.method, null);&lt;br /&gt;        return new Stack&amp;lt;R, R0&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;br /&gt;    // Embed a literal integer in the IL stream&lt;br /&gt;    public static Stack&amp;lt;Stack&amp;lt;R, T&amp;gt;, int&amp;gt;&lt;br /&gt;      Int&amp;lt;R, T&amp;gt;(this Stack&amp;lt;R, T&amp;gt; stack, int i)&lt;br /&gt;    {&lt;br /&gt;        stack.gen.Emit(OpCodes.Ldc_I4, i);&lt;br /&gt;        return new Stack&amp;lt;Stack&amp;lt;R, T&amp;gt;, int&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;br /&gt;    // Add the two integers at the top of the execution stack&lt;br /&gt;    public static Stack&amp;lt;R, T&amp;gt;&lt;br /&gt;      Add&amp;lt;R, T&amp;gt;(this Stack&amp;lt;Stack&amp;lt;R, T&amp;gt;, T&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        stack.gen.Emit(OpCodes.Add);&lt;br /&gt;        return new Stack&amp;lt;R, T&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;br /&gt;    
// Return from the current function&lt;br /&gt;    public static Stack&amp;lt;R, T&amp;gt;&lt;br /&gt;      Return&amp;lt;R, T&amp;gt;(this Stack&amp;lt;R, T&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        stack.gen.Emit(OpCodes.Ret);&lt;br /&gt;        return new Stack&amp;lt;R, T&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;This is obviously a fairly limited set of functionality, and it would require a great deal more work and thought to properly abstract the full execution of CIL (especially exceptions!), but it just might be possible. A large subset is certainly possible.&lt;br /&gt;&lt;br /&gt;It might serve as an interesting alternative to System.Linq.Expressions for some code generation applications, since Linq expressions are largely untyped. Here are a few more opcode examples for reference purposes:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;&lt;br /&gt;    // Duplicate the top element&lt;br /&gt;    public static Stack&amp;lt;Stack&amp;lt;R, T&amp;gt;, T&amp;gt;&lt;br /&gt;      Dup&amp;lt;R, T&amp;gt;(this Stack&amp;lt;R, T&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        stack.gen.Emit(OpCodes.Dup);&lt;br /&gt;        return new Stack&amp;lt;Stack&amp;lt;R, T&amp;gt;, T&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;br /&gt;    // Index into an array: the array sits below the int index on the stack&lt;br /&gt;    public static Stack&amp;lt;R, T&amp;gt;&lt;br /&gt;      LoadArrayIndex&amp;lt;R, T&amp;gt;(this Stack&amp;lt;Stack&amp;lt;R, T[]&amp;gt;, int&amp;gt; stack)&lt;br /&gt;    {&lt;br /&gt;        stack.gen.Emit(OpCodes.Ldelem, typeof(T));&lt;br /&gt;        return new Stack&amp;lt;R, T&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;br /&gt;    // Runtime type test: replaces the top element of type T with one of type U&lt;br /&gt;    public static Stack&amp;lt;R, U&amp;gt;&lt;br /&gt;      IsOfType&amp;lt;R, T, U&amp;gt;(this Stack&amp;lt;R, T&amp;gt; stack)&lt;br /&gt;        where T : class&lt;br /&gt;    {&lt;br /&gt;        stack.gen.Emit(OpCodes.Isinst, typeof(U));&lt;br /&gt;        return new Stack&amp;lt;R, 
U&amp;gt;(stack.gen);&lt;br /&gt;    }&lt;/pre&gt;&lt;br /&gt;Pasting all of the code in this post into a file and compiling it should just work.&lt;br /&gt;&lt;br /&gt;Wrapping all CIL opcodes this way could be quite useful, since you're guaranteed to have a working program if your code type checks. If you have a language, embedded or otherwise, using this sort of technique might be a good way to achieve a type-safe JIT for .NET, assuming you can assign a stack-based execution semantics. Since you're compiling to .NET, it will need exactly such a stack semantics anyway!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7520562012295578975?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7520562012295578975/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7520562012295578975' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7520562012295578975'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7520562012295578975'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/11/embedded-stack-language-for-net.html' title='Embedded Stack Language for .NET'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-3420064704279382558</id><published>2008-11-04T10:26:00.000-05:00</published><updated>2011-09-26T02:12:56.880-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='reflection'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>Reflection, Attributes and Parameterization</title><content type='html'>I used to be a big fan of reflection, and C#'s attributes also looked like a significant enhancement. Attributes provide a declarative way to attach metadata to fields, methods, and classes, and this metadata is often used during reflection.&lt;br /&gt;&lt;br /&gt;The more I learned about functional programming, type systems, and so on, the more I came to realize that reflection isn't all it's cracked up to be. Consider .NET serialization. You can annotate fields you don't want serialized with the attribute [field:NonSerialized].&lt;br /&gt;&lt;br /&gt;However, metadata is just data, and every usage of attributes can be replaced with a pair of interfaces. 
Using [field:NonSerialized] as an example, we can translate this class:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;class Foo {&lt;br /&gt;  [field:NonSerialized]&lt;br /&gt;  object bar;&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;into one like this:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// these two interfaces take the place of a NonSerializedAttribute declaration&lt;br /&gt;interface INonSerialized {&lt;br /&gt;  void Field&amp;lt;T&amp;gt;(ref T field);&lt;br /&gt;}&lt;br /&gt;interface IUnserializableMembers {&lt;br /&gt;  void Unserializable(INonSerialized s);&lt;br /&gt;}&lt;br /&gt;class Foo : IUnserializableMembers {&lt;br /&gt;  object bar;&lt;br /&gt;  public void Unserializable(INonSerialized s) {&lt;br /&gt;    s.Field(ref bar);&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Essentially, we are replacing reflection and metadata with parameterization, which is safer because the compiler checks it. This structure is also much more efficient than reflection and attribute checking.&lt;br /&gt;&lt;br /&gt;Consider another example, the pinnacle of reflection, O/R mapping. Mapping declarations are often specified in separate files and even in a whole other language (often XML or attributes), which means they don't benefit from static typing. 
However, using the translation from above, we can obtain the following strongly type-checked mapping specification:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// specifies the relationships of objects fields to table fields&lt;br /&gt;interface IRelation&lt;br /&gt;{&lt;br /&gt;    // 'name' specifies the field name in the table&lt;br /&gt;    void Key&amp;lt;T&amp;gt;(ref T id, string name);&lt;br /&gt;    void Field&amp;lt;T&amp;gt;(ref T value, string name);&lt;br /&gt;    void Foreign&amp;lt;T&amp;gt;(ref T fk, string name);&lt;br /&gt;    void ForeignInverse&amp;lt;T&amp;gt;(ref T fk, string foreignName);&lt;br /&gt;    void List&amp;lt;T&amp;gt;(ref IList&amp;lt;T&amp;gt; list);&lt;br /&gt;    void List&amp;lt;T&amp;gt;(ref IList&amp;lt;T&amp;gt; list, string orderBy);&lt;br /&gt;    void Map&amp;lt;K, T&amp;gt;(ref IDictionary&amp;lt;K, T&amp;gt; dict);&lt;br /&gt;}&lt;br /&gt;// declares an object as having a mapping to an underlying table&lt;br /&gt;interface IPersistent&lt;br /&gt;{&lt;br /&gt;    void Map(IRelation f);&lt;br /&gt;}&lt;br /&gt;// how to use the above two interfaces&lt;br /&gt;class Bar: IPersistent&lt;br /&gt;{&lt;br /&gt;    int id;&lt;br /&gt;    string foo;&lt;br /&gt;    public void Map(IRelation f)&lt;br /&gt;    {&lt;br /&gt;        f.Key(ref id, "Id");&lt;br /&gt;        f.Field(ref foo, "Foo");&lt;br /&gt;    }&lt;br /&gt;    public string Foo&lt;br /&gt;    {&lt;br /&gt;        get { return foo; }&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;There are in general two implementors of IRelation: hydration, when the object is loaded from the database and the object's fields are populated, and write-back, when the object is being written back to the database. The IRelation interface is general enough to support both use-cases because IRelation accepts references to the object's fields.&lt;br /&gt;&lt;br /&gt;This specification of the mappings is more concise than XML mappings, and is strongly type-checked at compile-time. 
The disadvantage is obviously that the domain objects are exposed to mapping, but this would be the case with attributes anyway. &lt;br /&gt;&lt;br /&gt;Using XML allows one to cleanly separate mappings from domain objects, but I'm coming to believe that this isn't necessarily a good thing. I think it's more important to ensure that the mapping specification is concise and declarative.&lt;br /&gt;&lt;br /&gt;Ultimately, any use of attributes in C# for reflection purposes can be replaced by the use of a pair of interfaces without losing the declarative benefits.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-3420064704279382558?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/3420064704279382558/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=3420064704279382558' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/3420064704279382558'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/3420064704279382558'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/11/reflection-attributes-and.html' title='Reflection, Attributes and Parameterization'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8265187577659748107</id><published>2008-11-03T16:36:00.000-05:00</published><updated>2011-09-26T02:08:27.648-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='low-level programming'/><title type='text'>Garbage Collection Representations Continued</title><content type='html'>In a &lt;a href="http://higherlogics.blogspot.com/2008/07/garbage-collection-representations.html"&gt;previous post&lt;/a&gt;, I discussed representations used for polymorphic parameters. These special data representations are a contract between the language and the runtime which enables the runtime to traverse data structures to collect garbage.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;64-Bit Polymorphic Representation&lt;/h4&gt;&lt;br /&gt;I had neglected to mention this interesting variant of tagged representations that I had come across at one point, but which I believe is not in widespread use. This representation &lt;a href="http://csg.csail.mit.edu/pubs/memos/Memo-396/memo-396.pdf"&gt;uses full 64-bit floating point representations for all polymorphic values&lt;/a&gt;, and encodes non-floating point data in the unused bits of NaN values. IEEE 754 floating point numbers reserve 12 of the 64 bits for flagging NaN values, so the remaining bits are free to use for encoding integers, pointers, and so on.&lt;br /&gt;&lt;br /&gt;I haven't found very much empirical data on this approach, but it looks promising. The polymorphic representation is doubled in size, but the structure is unboxed, so the polymorphic function call overhead may be slightly higher, but garbage collector pressure is reduced since less boxing is needed. 
Also, this representation can be used to ensure that all values are properly 8-byte aligned, which is often an efficiency gain in modern memory systems. This also has the advantage of permitting unboxed floating point values, which is a significant slowdown for the other tagged representations.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Natural Representations with Bitmasks&lt;/h4&gt;&lt;br /&gt;A new representation I recently came across is being used in the &lt;a href="http://www.pllab.riec.tohoku.ac.jp/smlsharp/"&gt;SML#&lt;/a&gt; language. Consider a typical tagged representation where 1-bit is co-opted to distinguish integers from pointers. Each polymorphic parameter must carry this 1 bit.&lt;br /&gt;&lt;br /&gt;But there is no reason for this 1 bit to be contained within the parameter itself. Consider &lt;a href="http://www.pllab.riec.tohoku.ac.jp/~ohori/research/NguyenOhoriPPDP06.pdf"&gt;lifting this 1 bit to an extra function parameter&lt;/a&gt;, so:&lt;br /&gt;&lt;pre&gt;val f: 'a &amp;rarr; 'b &amp;rarr; 'c&lt;/pre&gt;&lt;br /&gt;becomes:&lt;br /&gt;&lt;pre&gt;val f: 'a * Bool &amp;rarr; 'b * Bool &amp;rarr; 'c * Bool&lt;/pre&gt;&lt;br /&gt;The Bool is the flag indicating whether the parameter is a pointer or an integer. But using a full function parameter to hold a single bit is a waste, so let's coalesce all the Bool parameters into a single Bitmask:&lt;br /&gt;&lt;pre&gt;val f: Bitmask &amp;rarr; 'a &amp;rarr; 'b &amp;rarr; 'c&lt;/pre&gt;&lt;br /&gt;Bitmask can be an arbitrarily long sequence of bits, and each bit in the mask corresponds to the bit that would normally be in the pointer/integer representation of the corresponding parameter in a typical tagged representation. 
By lifting the tag bit to an external parameter, we can now use full natural representations for all parameters.&lt;br /&gt;&lt;br /&gt;Calling a polymorphic function now consists of masking and shifting to extract the bit for the given parameter, or assigning one if calling from a monomorphic function.&lt;br /&gt;&lt;br /&gt;Bitmasks must also accompany polymorphic data structures. The presence of any polymorphic type variable implies the requirement for a bitmask. It's also not clear how expensive this shifting is in relation to inline tag bits. SML# does not yet have a native code compiler, so any comparisons to other SML compilers aren't representative.&lt;br /&gt;&lt;br /&gt;However, the bitmask representation does not unbox floating point numbers, but a hybrid scheme with the above 64-bit representation is possible.&lt;br /&gt;&lt;br /&gt;Consider the lifted 1-bit to indicate whether the parameter is word-sized (1), or larger (0). Larger parameters require a full 64-bit representation, while word-sized parameters do not. The garbage collector thus skips any parameters of bitmask tag value 1 (since they are unboxed integers), and analyzes any values of bitmask tag value 0 as a 64-bit value. If the 64-bit is a NaN, the value is further analyzed to extract a single distinguished bit indicating whether it is a pointer. If it is not a pointer, it is skipped (since it is an unboxed floating point value). If it is a pointer, the structure pointed to is traversed.&lt;br /&gt;&lt;br /&gt;This last scheme provides natural representations for integers and floating point numbers, and expands the representation of pointers in polymorphic functions by one word. 
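&lt;br /&gt;&lt;br /&gt;To make the collector's decision procedure for this hybrid scheme a little more concrete, here is a minimal C# sketch. Everything in it is an assumption made up for illustration: the HybridScan class, the choice of bit 50 as the distinguished pointer bit, and the 48-bit address width are not taken from the memo or from SML#.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// Hypothetical sketch only: names and bit layout are illustrative assumptions.&lt;br /&gt;static class HybridScan&lt;br /&gt;{&lt;br /&gt;    // Top 14 bits of a boxed pointer: sign 0, exponent all ones,&lt;br /&gt;    // quiet-NaN bit set, and bit 50 used as the "pointer" bit.&lt;br /&gt;    const ulong PointerPrefix = 0x1FFF;&lt;br /&gt;    // Assume addresses fit in the low 48 bits.&lt;br /&gt;    const ulong AddressMask = 0x0000FFFFFFFFFFFFUL;&lt;br /&gt;&lt;br /&gt;    // Encode an address as a NaN-tagged 64-bit slot.&lt;br /&gt;    public static ulong BoxPointer(ulong address)&lt;br /&gt;    {&lt;br /&gt;        return (PointerPrefix &amp;lt;&amp;lt; 50) | (address &amp;amp; AddressMask);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // wordSized is the lifted bitmask bit for this parameter.&lt;br /&gt;    public static bool MustTraverse(bool wordSized, ulong slot)&lt;br /&gt;    {&lt;br /&gt;        if (wordSized) return false;             // unboxed integer: skip&lt;br /&gt;        return (slot &amp;gt;&amp;gt; 50) == PointerPrefix;  // tagged NaN: traverse; any other bit pattern is a float&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Under this assumed layout, an ordinary double, including a NaN produced by arithmetic (which normally carries an empty payload), never has the pointer prefix in its top 14 bits, so the collector skips it.&lt;br /&gt;&lt;br /&gt;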
Other tradeoffs are certainly possible.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8265187577659748107?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8265187577659748107/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8265187577659748107' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8265187577659748107'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8265187577659748107'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/11/garbage-collection-representations.html' title='Garbage Collection Representations Continued'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2160578317313313556</id><published>2008-10-21T13:12:00.000-04:00</published><updated>2008-10-21T13:16:40.691-04:00</updated><title type='text'>SysCache build for NHibernate 2.0.1GA</title><content type='html'>NHibernate 2.0.1GA is the latest binary download available, but it seems the NHibernate.Caches.SysCache binary release is lagging behind, as &lt;a href="https://sourceforge.net/project/showfiles.php?group_id=216446"&gt;the download at Sourceforge&lt;/a&gt; was built against NHibernate 2.0.0. 
Here's a version of &lt;a href="http://higherlogics.com/NHibernate.Caches.SysCache.zip"&gt;NHibernate.Caches.SysCache&lt;/a&gt; built against 2.0.1GA.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2160578317313313556?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2160578317313313556/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2160578317313313556' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2160578317313313556'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2160578317313313556'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/10/syscache-build-for-nhibernate-201ga.html' title='SysCache build for NHibernate 2.0.1GA'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7031836241155684995</id><published>2008-10-20T14:22:00.000-04:00</published><updated>2011-09-26T01:45:13.934-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern matching'/><title type='text'>Dispatching: VTables vs. 
Runtime Tests and Casts</title><content type='html'>Object oriented languages like C# and Java use method dispatching techniques to select a concrete method to invoke when given a polymorphic type. By far the most common technique is the &lt;a href="http://en.wikipedia.org/wiki/Virtual_method_table"&gt;virtual method table&lt;/a&gt;, or vtable for short.&lt;br /&gt;&lt;br /&gt;Long ago I had updated the above Wikipedia page with information regarding alternative dispatch techniques. Research published a few years back indicated that dispatching through a vtable incurred astonishing overheads, as high as 50% of total execution time. Alternative dispatch techniques based on runtime tests, such as a linear sequence of if statements checking for the various concrete class types, or a sequence of nested if statements forming a binary search, were often more efficient on a variety of hardware.&lt;br /&gt;&lt;br /&gt;Vtables are rather flexible however, and the composition of a number of vtables can encode a variety of advanced dispatching behaviour. The .NET CLR utilizes vtables, and I've sketched out rough encodings of first-class functions, algebraic data types and pattern matching, higher-ranked polymorphism, and even a form of first-class messages, all by encoding them as virtual method dispatches.&lt;br /&gt;&lt;br /&gt;Unfortunately, research indicated that vtable dispatch is quite expensive, and many of these idioms require at least one and often two virtual dispatches. Runtime tests and casts might be usable here instead, but a call site compiled for a fixed hierarchy of classes cannot be extended as easily as vtable-based dispatching. The only exception is a JIT using a technique called "polymorphic inline caches", which neither the CLR nor JVM utilize.&lt;br /&gt;&lt;br /&gt;So I set about to measure the dispatch overhead on the CLR. 
&lt;a href="http://higherlogics.net/tests/DispRtt-src.zip"&gt;Here's the source&lt;/a&gt; for my test, which consists of 3 classes in a hierarchy, each of which provides a visitor for double-dispatching, and a single method for the single virtual dispatch test. The driver code performs the runtime tests and casts at the dispatch site for the runtime test case.&lt;br /&gt;&lt;br /&gt;The numbers are total CPU ticks for the entire test, so lower numbers are better. It's a purely CPU-bound test, and I ran it on a Core 2 Duo 2.13GHz. I used .NET's high resolution timer, and I threw away the highest and lowest numbers:&lt;br /&gt;&lt;br /&gt;&lt;table style="width: 100%;"&gt;&lt;tr&gt;&lt;th&gt;Double Dispatch&lt;/th&gt;&lt;th&gt;Runtime Test&lt;/th&gt;&lt;th&gt;Single Dispatch&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1658222872&lt;/td&gt;&lt;td&gt;1092469328&lt;/td&gt;&lt;td&gt;1136358792&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1659474960&lt;/td&gt;&lt;td&gt;1092917368&lt;/td&gt;&lt;td&gt;1136520160&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1659485472&lt;/td&gt;&lt;td&gt;1093638752&lt;/td&gt;&lt;td&gt;1136745536&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1659866480&lt;/td&gt;&lt;td&gt;1094546584&lt;/td&gt;&lt;td&gt;1136985448&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1660287768&lt;/td&gt;&lt;td&gt;1094639088&lt;/td&gt;&lt;td&gt;1137321136&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1660488312&lt;/td&gt;&lt;td&gt;1094689040&lt;/td&gt;&lt;td&gt;1137526400&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1660662648&lt;/td&gt;&lt;td&gt;1094856216&lt;/td&gt;&lt;td&gt;1137994576&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1662110088&lt;/td&gt;&lt;td&gt;1095503256&lt;/td&gt;&lt;td&gt;1138300032&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1666248296&lt;/td&gt;&lt;td&gt;1095734848&lt;/td&gt;&lt;td&gt;1138461264&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1667377456&lt;/td&gt;&lt;td&gt;1095989528&lt;/td&gt;&lt;td&gt;1138609112&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1672127744&lt;/td&gt;&lt;td&gt;1097126880&lt;/td&gt;&lt;td&gt;1141472360&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="3"&gt;Avg:&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1662395645.09&lt;/td&gt;&lt;td&gt;1094737353.45&lt;/td&gt;&lt;td&gt;1137844983.27&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;As you can see, runtime tests and casts are surprisingly efficient, beating out even single dispatch by about 4% on average. Any functional language targeting the CLR will definitely want to use runtime tests to implement pattern matching. I might try a deeper hierarchy at some point, but research has shown that &lt;a href="http://www.haskell.org/~simonmar/papers/ptr-tagging.pdf"&gt;up to 95% of algebraic types have 3 cases or fewer&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Unfortunately, when the complete set of concrete types is not known, a vtable dispatch is unavoidable, unless one resorts to runtime code generation to recompile the dispatch site. This is essentially a polymorphic inline cache, and it requires more sophisticated runtime infrastructure than the CLR natively supports, though it is possible if the language abstracts program execution away from how the CLR natively works.&lt;br /&gt;&lt;br /&gt;So in summary, vtables are nice and simple, but rather inefficient on current hardware.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7031836241155684995?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7031836241155684995/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7031836241155684995' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7031836241155684995'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7031836241155684995'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/10/vtable-dispatching-vs-runtime-tests-and.html' title='Dispatching: VTables vs. Runtime Tests and Casts'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7023721495966416410</id><published>2008-09-12T10:50:00.000-04:00</published><updated>2011-09-26T02:13:49.037-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='tagless interpreters'/><category scheme='http://www.blogger.com/atom/ns#' term='EDSL'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Tagless Interpreters in C#</title><content type='html'>Creating DSLs is all the rage these days, and for good reason. Most abstractions are actually a little language struggling to get out. Design consists of creating abstractions with maximum power, and minimum restrictions, and reusing this abstraction as much as possible. Small domain-specific languages are the ticket.&lt;br /&gt;&lt;br /&gt;However, language implementations are often written in fairly ad-hoc ways, and most interpreters are difficult to extend to compilers and partial evaluators. Languages are usually described by an "initial algebra", which basically just means that language terms are described by data types. 
Here's a simple definition of a language with integers, addition and subtraction:&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush: csharp"&gt;(*&lt;br /&gt; * Expressions := [0-9]+ | e1 + e2 | e1 - e2&lt;br /&gt; *)&lt;br /&gt;type expression = Int of int | Add of expression * expression | Sub of expression * expression&lt;/pre&gt;&lt;br /&gt;So a program in this language can consist of integers, subtraction or addition expressions. The interpreter for this language unpacks the expression by checking the tags, then evaluates the result by dispatching to the appropriate handler.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;let rec eval exp = match exp with&lt;br /&gt;  | Int(i) -&gt; i&lt;br /&gt;  | Sub(e1,e2) -&gt; (eval e1) - (eval e2)&lt;br /&gt;  | Add(e1,e2) -&gt; (eval e1) + (eval e2);;&lt;/pre&gt;&lt;br /&gt;This is fine for simple languages, but it's not very efficient since tag decode and dispatch can be expensive. Eliminating tags via program analysis, like partial evaluation, yields a simple compiler of sorts, thus yielding efficient DSLs.&lt;br /&gt;&lt;br /&gt;Furthermore, we would like to exploit the type checking of the host language to ensure the type safety of the embedded language's interpreter, and type safety of any expressions of the embedded language. Unfortunately, more sophisticated languages written using this structure make it difficult to ensure that all the cases in the language are properly handled, which then requires the use of runtime errors.&lt;br /&gt;&lt;br /&gt;By far the simplest solution to all of these problems that I've come across is a paper by Jacques Carette, Oleg Kiselyov, and Chung-chieh Shan, &lt;a href="http://lambda-the-ultimate.org/node/2438"&gt;Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If I understand the terminology correctly, they use a final algebra instead of an initial algebra to describe a language. 
In other words, instead of using data types to describe language expressions, they use functions. The language above is described using three functions provided by a module:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;module type L = sig&lt;br /&gt;  type exp&lt;br /&gt;  val int : int -&gt; exp&lt;br /&gt;  val add : exp -&gt; exp -&gt; exp&lt;br /&gt;  val sub : exp -&gt; exp -&gt; exp&lt;br /&gt;end;;&lt;/pre&gt;&lt;br /&gt;An interpreter for this language is trivial:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;module I : L = struct&lt;br /&gt;  type exp = int&lt;br /&gt;  let int i = i&lt;br /&gt;  let add i1 i2 = i1 + i2&lt;br /&gt;  let sub i1 i2 = i1 - i2&lt;br /&gt;end;;&lt;/pre&gt;&lt;br /&gt;Constructing an expression of this little language from within the host language is simple:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;module Test (Interp : L) = struct&lt;br /&gt;  let i = Interp.int 1&lt;br /&gt;  let sum = Interp.add i (Interp.int 2) (* sum denotes 3 under the interpreter I *)&lt;br /&gt;end;;&lt;/pre&gt;&lt;br /&gt;This embedded program abstracts over the type of the "interpreter" used to evaluate it. In other words, if we build a compiler backend for the module type L, we can use it to evaluate this term. A compiler is overkill for this simple language, but the more sophisticated the embedded language, the more attractive a compiler looks.&lt;br /&gt;&lt;br /&gt;This toy arithmetic language just illustrates the main ideas, but the language described in the paper is actually a full-fledged programming language, with if-statements, first-class functions, etc. The authors then extend their tagless interpreter into a full-fledged compiler using MetaOCaml's staging facilities, and then further extend the compiler into a partial evaluator. There are also more sections on abstracting evaluation order of the embedded language, i.e. making the embedded language lazy even if the host language is strict, and self-interpretation (i.e. 
metacircular interpretation).&lt;br /&gt;&lt;br /&gt;It's a very exciting result, but as Oleg describes in the LtU thread linked to above, the abstraction over "interpreter" types L, requires the host language to abstract over type constructors. As &lt;a href="http://higherlogics.blogspot.com/2008/01/almost-type-safe-general-monad-in-c-aka.html"&gt;I explained before&lt;/a&gt;, C# cannot do this. In order to implement a program which abstracts over type constructors in some way, we have to resort to dynamics. In other words, we have to use casts.&lt;br /&gt;&lt;br /&gt;I have devised three techniques to structure these interpreters to be mostly safe. The tradeoffs are slightly different in each, but in all cases, the interpreter and all embedded language terms are type-safe as long as the client doesn't try any hijinks. If they do, the resulting runtime errors are a perfectly acceptable compromise IMO. The embedded language I describe is also a full-fledged programming language with if-statements, lambdas, and so on, and all examples are written in C#.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://lambda-the-ultimate.org/node/2569#comment-43377"&gt;first implementation&lt;/a&gt; uses a ML-module-to-C# translation I described in a previous post on this blog. It's actually rather complicated, and you should probably skip over it unless you're masochistic enough to want to delve into nitty gritty details. This encoding is the safest of the three however, as I believe there is no way to trigger a runtime exception except by resorting to reflection.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://lambda-the-ultimate.org/node/2569#comment-43805"&gt;second implementation&lt;/a&gt; uses a much more natural functional-style encoding, where the language is described by an interface, and an expression is an extensible data type Exp that must be implemented by an interpreter. 
All Exp types are "branded" by the type of the interpreter, so it's not possible to accidentally mix expressions from different interpreters.&lt;br /&gt;&lt;br /&gt;However, because Exp is extensible, this leaves the interpreter open to client injections, where a client can trigger a runtime exception if they subclass Exp themselves, and inject such an Exp value into a program term. This is only a problem if a trusted component evaluates program terms built by untrusted clients. The trusted component must still be aware that runtime errors are possible if clients can be malicious, which seems like a perfectly acceptable compromise for the simplicity of this solution.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://lambda-the-ultimate.org/node/2569#comment-43807"&gt;third implementation&lt;/a&gt; is a more object-oriented style encoding, reminiscent of Smalltalk. Each core type is now its own class, and operations on those types are methods on that class. For instance, an if-statement is the If method on the Bool class.&lt;br /&gt;&lt;br /&gt;The language interface consists only of constructors for the core language types. This approach is more verbose than the functional-style implementation, but it permits a more natural embedding of program terms in C#, because we can now use operators and ordinary method calls to operate on program terms. Like the second implementation, this too can throw runtime errors based on client-injected Exp types, so the same caveats apply.&lt;br /&gt;&lt;br /&gt;It's not yet clear which of the last two implementations is preferable, but I prefer the functional-style for its brevity if you're writing a real interpreter and/or compiler for a language. Functional-style programs excel at symbol manipulation. 
If you're constructing embedded programs from within C# only, then the last implementation is probably preferable, since the availability of operators can make program terms less verbose.&lt;br /&gt;&lt;br /&gt;All three implementations abstract over the type of "interpreter" for program terms. It could be a straightforward interpreter, or it could be a code generator backend (like System.Reflection.Emit). For instance, in the post for the third implementation, I briefly describe a JavaScript-generating backend for an embedded language I'm playing with. It would allow an embedded program to transparently execute server-side via an interpreter or client-side via JavaScript depending on a client's capabilities.&lt;br /&gt;&lt;br /&gt;Consider the possible uses for transparently displaying AJAX pages server-side for thin HTML-only clients, and client-side for real browsers. In other words, you write this code only once. Witness the power of DSLs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7023721495966416410?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7023721495966416410/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7023721495966416410' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7023721495966416410'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7023721495966416410'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/09/mostly-tagless-interpreters-in-c.html' title='Tagless Interpreters in C#'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2871681624726648118</id><published>2008-08-18T10:36:00.000-04:00</published><updated>2011-09-26T01:54:04.997-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='web programming'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='mobile code'/><title type='text'>Mobile Continuations for the Web</title><content type='html'>A continuation is basically the state of a program at the time the continuation was captured. For a common example, consider a blocking system call for any operating system. If the system call blocks the process, the continuation of that process is placed on a blocked queue until the system call completes. When the operation completes, the continuation is removed from the queue and resumed with the result of the system call.&lt;br /&gt;&lt;br /&gt;There has been plenty of well-deserved &lt;a href="http://www.seaside.st/"&gt;hoopla&lt;/a&gt; around &lt;a href="http://okmij.org/ftp/Computation/Continuations.html#shift-cgi"&gt;continuations for web interaction&lt;/a&gt;. Navigating a website is a great deal like operating on a program's continuations. A page generated from a server-side program contains references to a number of program continuations in its links and forms. Clicking a link or submitting a form invokes the captured continuation which the server then executes. 
The result is yet another set of continuations on the resulting page, ad infinitum.&lt;br /&gt;&lt;br /&gt;Unfortunately, most continuation frameworks are designed around server-side continuations, where the continuations are saved on the server, and the client receives only a secure token naming the continuation. The &lt;a href="http://waterken.sourceforge.net/"&gt;Waterken server&lt;/a&gt; is a particularly beautiful example of this for Java. A while ago &lt;a href="https://www2897.ssldomain.com/higherlogics/www/Wiki.ashx/About"&gt;I ported an older version of the Waterken server to ASP.NET&lt;/a&gt;. Here is a public continuation for the &lt;a href="https://www2897.ssldomain.com/higherlogics/www/Wiki.ashx/Comments"&gt;comments&lt;/a&gt; page, and embedded in the comments page is the &lt;a href="https://www2897.ssldomain.com/higherlogics/www/Wiki.ashx/?key=54vu-qn4f-mqja"&gt;editing authority&lt;/a&gt;, which is an unguessable capability URL for the continuation to edit the comments page.&lt;br /&gt;&lt;br /&gt;The Waterken "web-calculus" is actually a full-blown persistent, distributed object system with a binding for the web.&lt;br /&gt;&lt;br /&gt;However, maintaining persistent server-side state significantly complicates many aspects of web systems. It makes load balancing servers difficult, and persistent continuations are single-threaded, so only one invocation can proceed at a time, which hurts scalability. Furthermore, if the continuations are persistent, as they are in Waterken, which provides orthogonal persistence, then we must also contend with schema upgrade.&lt;br /&gt;&lt;br /&gt;Two recent articles prompted me to re-examine these issues and devise a solution. First, Tommy McGuire posted a link to his &lt;a href="http://www.crsr.net/Software/InfernalDevice.html"&gt;"Infernal Device"&lt;/a&gt; paper on LtU. 
His paper describes a set of structures which can be used to develop an application by reifying the application's states as transitions in a finite state machine. The Transition objects are essentially application-specific partial continuations. The paper also details all of the different kinds of storage available in a browser, and how they can be exploited by a server to save state on the client.&lt;br /&gt;&lt;br /&gt;For page-local state, the browser URL can hold up to 2kB on IE (other browsers can store more), and we have embedded form fields for POST requests. For site-local state, we have cookies, which can hold several kB.&lt;br /&gt;&lt;br /&gt;Then I came across &lt;a href="http://gemstonesoup.wordpress.com/2008/08/07/making-_k-and-_s-optional-a-seaside-heresy/"&gt;this blog post&lt;/a&gt; about scaling Seaside by reducing the need to capture server-side state. The best way to optimize expensive operations is by avoiding them completely!&lt;br /&gt;&lt;br /&gt;Indeed, much program state &lt;em&gt;does not need&lt;/em&gt; to reside on the server. We can capture a limited set of program state in the browser itself. In essence, we serialize the program continuation in the browser's URL or in its form fields. We also encrypt the serialized form to maintain integrity.&lt;br /&gt;&lt;br /&gt;And so our program continuations have become mobile objects that migrate between client and server and can survive server restarts as long as any server state the continuations access also survives restart. A set of load balanced machines need only share the encryption keys, and user requests seamlessly migrate between them.&lt;br /&gt;&lt;br /&gt;I've created a small library for mobile continuations written in C#. I also provide an IHttpHandler for use with ASP.NET. A web application using this library doesn't need to use any web controls or .aspx pages. An application is structured as a set of continuations which implement two operations: Show and Apply. 
Show outputs the continuation via a display-independent set of interfaces. Apply is invoked when a continuation receives a POST request. It takes a NameValueCollection containing all of the form's fields. This takes the place of a continuation's named arguments. I could implement a more elaborate system as found in the Waterken server where the fields are implicitly typed and named, but I felt a more limited scope is preferable for the time being.&lt;br /&gt;&lt;br /&gt;By convention, continuations in the research literature are generally named "k", so the continuation interface is named K. &lt;a href="https://www2897.ssldomain.com/higherlogics/cont/K.ashx"&gt;This simple HelloWorld class&lt;/a&gt; implements the K interface and it embeds a continuation to an Adder object in its display:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;    public struct HelloWorld : K&lt;br /&gt;    {&lt;br /&gt;        public void Show(IWindow v)&lt;br /&gt;        {&lt;br /&gt;            v.Title("Hello world!");&lt;br /&gt;            v.Body(this);&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public void Show(IBody v)&lt;br /&gt;        {&lt;br /&gt;            v.Const("Hello", "Hello world!");&lt;br /&gt;            v.Promise("Adder", new Adder());&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public void Show(IInput v)&lt;br /&gt;        {&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public K Apply(NameValueCollection args)&lt;br /&gt;        {&lt;br /&gt;            return this;&lt;br /&gt;        }&lt;br /&gt;    }&lt;/pre&gt;&lt;br /&gt;The &lt;a href="https://www2897.ssldomain.com/higherlogics/cont/K.ashx?k=0QfkLqRasB/3xiE2lTqbdEGIO0LGjrNqt7Ai3KLHzn/PVXBH/M/sA2jCPYQGBRGI"&gt;Adder class&lt;/a&gt; is equally simple, consisting of a text input and a submit button. This continuation builds a form which submits to itself. 
When applying the continuation, it adds its currently stored value to the value specified in the form and returns a new instance of Adder with the stored value.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;    public struct Adder : K&lt;br /&gt;    {&lt;br /&gt;        int n;&lt;br /&gt;        string e;&lt;br /&gt;        public Adder(int n)&lt;br /&gt;        {&lt;br /&gt;            this.n = n;&lt;br /&gt;            this.e = "";&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public void Show(IWindow v)&lt;br /&gt;        {&lt;br /&gt;            v.Title("Adder!");&lt;br /&gt;            v.Body(this);&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public void Show(IBody v)&lt;br /&gt;        {&lt;br /&gt;            if (!string.IsNullOrEmpty(e))&lt;br /&gt;            {&lt;br /&gt;                v.Const("error", "Error: " + e);&lt;br /&gt;            }&lt;br /&gt;            v.Literal(n);&lt;br /&gt;            v.Literal(" + ");&lt;br /&gt;            v.K("y", "add", this);&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public void Show(IInput v)&lt;br /&gt;        {&lt;br /&gt;            v.Text("x", "", "");&lt;br /&gt;        }&lt;br /&gt;&lt;br /&gt;        public K Apply(NameValueCollection args)&lt;br /&gt;        {&lt;br /&gt;            try&lt;br /&gt;            {&lt;br /&gt;                return new Adder(this.n + int.Parse(args["x"]));&lt;br /&gt;            }&lt;br /&gt;            catch&lt;br /&gt;            {&lt;br /&gt;                this.e = "Invalid number specified.";&lt;br /&gt;                return this;&lt;br /&gt;            }&lt;br /&gt;        }&lt;br /&gt;    }&lt;/pre&gt;The library also provides a compact object serializer. HelloWorld serializes to 48 bytes. Adding some simple fields barely increases the size at all. You probably won't be entering any of these URLs by hand of course. There are a few more optimizations that can be made to the serializer as well. 
CrystalTech hosts ASP.NET in a medium trust environment, so the library also works for shared hosts. Some contortions were necessary, and fully general object serialization is not yet available (for instance, serializing delegates will fail).&lt;br /&gt;&lt;br /&gt;By default, the library assumes all continuation objects can be migrated to clients. This encourages more scalable program structures. If some object needs to remain server-side, simply implement the ILocal interface. There are no operations, but the interface marks the class as non-mobile. The default behaviour in this case is to generate a random 64-bit token, and migrate this token instead. When the token migrates back to the server, it re-connects to the original object. If the server restarts, this state is lost, but a state server as found in ASP.NET can be built which will allow the state to survive restarts and be shared between machines.&lt;br /&gt;&lt;br /&gt;Schema upgrade can still pose a problem unfortunately, and unlike with server-side state where the continuations are stored centrally, you can't use a tool to upgrade them. Altering method implementations does not pose an upgrade problem. Adding and removing fields is problematic however.&lt;br /&gt;&lt;br /&gt;Fortunately, applications using the library are already structured for easy upgrade and permit a configurable policy for deprecation. Basically, don't touch existing classes, and only create new classes. Replace the use of old classes in the program with the new classes, then whenever you judge it appropriate, remove the old class entirely. On any serialization errors, the library will invoke an application-specific handler which returns a continuation. I think any conceivable upgrade policy is feasible given these primitives.&lt;br /&gt;&lt;br /&gt;The current source for the library is &lt;a href="http://higherlogics.net/src/DUI-src.zip"&gt;available here&lt;/a&gt;. 
I plan to improve the mechanism for displaying continuations in a medium-independent fashion. Ideally, I'd like to be able to create a binding for Windows.Forms, and for JSON requests, so that the same continuation-structured application can be reused unmodified in different environments.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2871681624726648118?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2871681624726648118/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2871681624726648118' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2871681624726648118'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2871681624726648118'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/08/mobile-continuations-for-web.html' title='Mobile Continuations for the Web'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-137732533251197148</id><published>2008-07-28T13:26:00.000-04:00</published><updated>2011-09-26T02:08:40.976-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='low-level programming'/><title type='text'>Garbage Collection 
Representations</title><content type='html'>I have yet to find a comprehensive description of the various data representations used for garbage collection. This isn't an overview of garbage collection algorithms, but an overview of how to distinguish pointers from other values and how to traverse the data structures, which has seemingly gotten short shrift in the literature. I aim to give an in-depth overview of the techniques I've come across or managed to puzzle out on my own.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;em&gt;Boxing:&lt;/em&gt; all values are allocated on the heap, so basically everything is a pointer, and thus there is no ambiguity between pointers and values.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;Tagged pointers:&lt;/em&gt; pointers are distinguished from ordinary values by some sort of tag. This information is often encoded in the pointer or value itself for a very compact representation.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;Type-passing:&lt;/em&gt; each parameter of a function is associated with a set of operations, usually referred to as a "dictionary", which is similar to an interface in object-oriented languages. This dictionary defines a collect() operation for each type, which will traverse the structure looking for garbage. This dictionary can be passed into every function, or it can be reconstructed from the stack.&lt;/li&gt;&lt;/ol&gt;&lt;h3&gt;Why Are Special Representations Needed?&lt;/h3&gt;&lt;br /&gt;One might first ask why special representations are even needed. At compile-time, it's generally perfectly clear what types are being used, as well as the types of the various data structures.&lt;br /&gt;&lt;br /&gt;While this is true of simple C-like languages, it isn't true of &lt;em&gt;polymorphic&lt;/em&gt; languages. 
C is what's known as &lt;em&gt;monomorphic&lt;/em&gt;, meaning all parameters have a fixed, known type at compile-time.&lt;br /&gt;&lt;br /&gt;By contrast, polymorphic languages can have function parameters that can accept any type, and this type is not fixed at any time. A simple example is the identity function:&lt;br /&gt;&lt;pre&gt;val id: 'a -&gt; 'a&lt;/pre&gt;This function accepts a parameter of any type, and returns a value of the same type. But since you can't know this type ahead of time, you need some way to encode the type in a running program so the garbage collector has enough information to traverse it in its search for garbage.&lt;br /&gt;&lt;br /&gt;This problem only really crops up with the combination of polymorphism and separate compilation, meaning you compile a program to a single machine code file which can then be distributed without prior knowledge of how it will be instantiated and used. The .NET CLR is an example of a platform which supports polymorphism, but &lt;em&gt;not&lt;/em&gt; separate compilation, since it uses the JIT to specialize methods that use generics. A specialized method is GC'd differently depending on whether it's instantiated with a reference type or a value type. In separately compiled languages, this is not true, hence the need for a special representation.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Boxing&lt;/h3&gt;&lt;br /&gt;All boxed values are allocated in the heap, including integers and floats. 
This is the most straightforward representation, and the least efficient of the options both in terms of space and time.&lt;br /&gt;&lt;br /&gt;The extra space use comes from the additional pointer needed to indirect to an int, and the additional allocator headers required to track it.&lt;br /&gt;&lt;br /&gt;The additional runtime overhead is incurred when repeatedly indirecting the pointers to access the values, and the additional garbage collector pressure from having so many small values.&lt;br /&gt;&lt;br /&gt;Allocating many small types like this could also lead to significant memory fragmentation, which balloons memory use, and induces poor locality of reference.&lt;br /&gt;&lt;br /&gt;A boxed version of the identity function would look like this:&lt;br /&gt;&lt;pre&gt;void * id(void * v) {&lt;br /&gt;  return v;&lt;br /&gt;}&lt;/pre&gt;Basically, we don't know anything about the value involved, and the void pointer just points to a heap location where the value is stored.&lt;br /&gt;&lt;br /&gt;There is a single flag stored in the allocator headers to indicate whether this is a compound structure or a value. An int is a value, and a (int,int) pair is a compound. All compounds consist of pointers, so the collector just recursively traverses all of the pointers in the structure, and halts when it reaches values. For compounds, the size of the structure must also be encoded in the header.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Tagged Pointers&lt;/h3&gt;&lt;br /&gt;Tagged pointers have &lt;a href="http://citeseer.ist.psu.edu/306876.html"&gt;a long tradition in functional languages&lt;/a&gt;. Typically, all of the tag bits are encoded within pointers and values themselves, so the representation is very compact. This is important for functional languages in particular, since they encourage the allocation of many small objects.&lt;br /&gt;&lt;br /&gt;Modern memory allocation algorithms generally segregate free space in variously sized buckets. 
The smallest bucket is generally the size of a pointer, or perhaps twice the size of a pointer. This is generally 4 bytes on a 32-bit machine, so at the very least, the pointer will always point to an address aligned on a 4-byte boundary.&lt;br /&gt;&lt;br /&gt;However, most pointers are byte-addressable, meaning they can point to addresses that are aligned on a single byte boundary. This means that the lowest two bits of a pointer are generally wasted space, since they will always be 0. So let's use that wasted space for something!&lt;br /&gt;&lt;br /&gt;Let's say we use the lowest bit to differentiate between pointers and integers on the stack and in data structures. This means an int will no longer be 32-bits, but 31-bits, with 1 bit going to this flag. If the lowest bit is 0, it's an int, and if the low order bit is 1, it's a pointer. This means we can scan a stack frame or a data structure one word at a time and easily distinguish integers from pointers by just checking the lowest bit.&lt;br /&gt;&lt;br /&gt;So now we can have fully unboxed integers, which saves a lot of time and space in programs since we no longer need to allocate them on the heap. Anything larger than a word must still be heap-allocated.&lt;br /&gt;&lt;br /&gt;Field offset calculations now require us to omit the lowest bit, which involves subtracting 1 from any offset. Since most offsets are statically known, this costs nothing at runtime.&lt;br /&gt;&lt;br /&gt;There is one remaining problem with the above representation, namely, we still don't know the size of a structure! When the GC traverses a structure, it will iterate right off of the end since it doesn't know when to stop.&lt;br /&gt;&lt;br /&gt;The paper I linked to above places the size in a header word present in every structure. This header word is split into multiple fields. 
On a 32-bit machine, the lowest 22 bits are used for the size of the structure in words, the next two bits are used by the garbage collector (typically a mark-sweep), and the last 8 bits are used for the constructor tag, ie. Nil or Cons. The constructor tag is typically present in every data structure, but using a whole word for it is somewhat wasteful, so this more compact representation is quite useful.&lt;br /&gt;&lt;br /&gt;This representation may also &lt;a href="http://lambda-the-ultimate.org/node/2699#comment-43040"&gt;permit some interesting optimizations&lt;/a&gt;. Since word-sized values are unboxed, all heap allocated structures will be at least 8-bytes in size, consisting of the header word followed by a number of data words (a commenter pointed out that this doesn't necessarily imply 8-byte alignment however; while true, as mentioned above allocators segregate memory into variously sized buckets, and so we need to simply ensure that these buckets are multiples of 8-bytes). This means the third lowest order bit is also unused, giving us two free bits in every pointer.&lt;br /&gt;&lt;br /&gt;Consider the following type declaration:&lt;br /&gt;&lt;pre&gt;type 'a Foo = Foo1 | Foo2 of int | Foo3 of 'a | Foo4 of 'a * 'a&lt;/pre&gt;Since Foo1 does not contain any embedded values or pointers, we don't need a pointer for it at all and it can just be represented by a simple unboxed integer.&lt;br /&gt;&lt;br /&gt;Foo2 contains an integer, so we must allocate a heap structure to contain this case.&lt;br /&gt;&lt;br /&gt;Foo3 and Foo4 are the interesting cases. 'a is a polymorphic value, and if instantiated with a pointer type, we don't need to allocate a heap structure for it. Basically, we can use the two free bits to encode up to 3 constructors for such values, and remove a level of indirection. 
Pattern matching on such structures is broken down into 3 cases:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;If it's an int, then it's one of the empty constructors.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;If it's a pointer, and the second and third lowest order bits are clear, extract the constructor tag from the header, and the value is just after the header.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Else, extract the constructor tag from the pointer, and the pointer is a pointer to the value.&lt;/li&gt;&lt;/ol&gt;However, to maintain separate compilation, we cannot assume that 'a is a pointer type, as it could be an int. Therefore only Foo4 can benefit from this inlining optimization in this particular case.&lt;br /&gt;&lt;br /&gt;Coming back to the question of identifying structure sizes, an alternative representation which I have not seen described anywhere, is to use the unused third lowest bit as a mark bit to terminate GC scanning, and use the second lowest bit to distinguish structures with nested pointers, and those that are pointer-free.&lt;br /&gt;&lt;br /&gt;A '1' in the second lowest order bit position flags a "value structure" which contains no pointers. In other words, it consists entirely of integer-sized data, so the GC does not need to scan this structure. Alternately, a '0' in this position indicates a "compound structure" which contains pointers, and which must be scanned by the GC.&lt;br /&gt;&lt;br /&gt;A '1' in the third lowest order bit position flags the last pointer in a data structure. This bit is only set in pointers within data structures, and is clear everywhere else.&lt;br /&gt;&lt;br /&gt;Note that only the last &lt;em&gt;pointer&lt;/em&gt; is used for the halt condition, since the collector cares only about pointers and ignores unboxed values. The compiler can optimize garbage collection by placing all pointers at the beginning of a structure. 
Also, when extracting a pointer from a structure or writing one into a structure, it should be masked to clear or set the appropriate bits.&lt;br /&gt;&lt;br /&gt;Using this representation, we free the 22-bits used for the structure size in the header. We can thus reserve a full 24 bits for the GC, which expands our choice of GC algorithms considerably. Even ref-counting becomes feasible with 24-bits in the header.&lt;br /&gt;&lt;br /&gt;One possible downside of this representation is that the size header was also used for array lengths. The length must now be stored in its own field in the structure, so we've effectively grown the array type by one word. Considering the size of arrays compared to a single word, this seems like an acceptable overhead for this representation's uniformity and flexibility.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Type-Passing&lt;/h3&gt;&lt;br /&gt;Type-passing, aka &lt;a href="http://citeseer.ist.psu.edu/goldberg92polymorphic.html"&gt;tagless GC&lt;/a&gt;, is the most sophisticated GC technique. There are two approaches that I'm aware of:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;em&gt;Dictionary-passing:&lt;/em&gt; a dictionary consisting of functions specific to the type is passed along with all values of that type. These functions know how to collect values of that type.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;Type Reconstruction:&lt;/em&gt; basically, the concrete type of a polymorphic function is reconstructed at runtime by inferring the type from previous activation frames. 
Monomorphic functions have all the type information needed to fully specify any polymorphic parameters, so we just search the stack for the type information stored there.&lt;/li&gt;&lt;/ol&gt;The benefits are that pointers are just pointers, integers are full-sized integers, and no special meaning is assigned to their values or any bits contained within the word.&lt;br /&gt;&lt;br /&gt;To my knowledge, only two languages have used this technique, &lt;a href="http://citeseer.ist.psu.edu/32465.html"&gt;Id&lt;/a&gt; and &lt;a href="http://citeseer.ist.psu.edu/henderson02accurate.html"&gt;Mercury&lt;/a&gt;, and only Mercury is still available as far as I know. The .NET CLR's technique can also be considered a form of type-passing.&lt;br /&gt;&lt;br /&gt;The downsides of dictionary-passing are that every polymorphic parameter is paired with a dictionary parameter, thus increasing the parameter lists of all polymorphic functions. The additional pushing and popping can outweigh any advantages that might be gained from dictionary-passing.&lt;br /&gt;&lt;br /&gt;Type-reconstruction can remove the need to pass this parameter at the expense of additional runtime overhead. Basically, we don't need to pass this dictionary to every function call, we just need to get it paired with the parameter in the monomorphic function which calls the first polymorphic function. As we walk the stack we can then extract the dictionary for a given parameter from the last monomorphic function. This becomes more complicated with closures, since the dictionaries for polymorphic parameters must now be saved in the closure environment as well.&lt;br /&gt;&lt;br /&gt;This process is essentially a form of type inference performed at runtime. Unfortunately, type inference often has pathological corner cases with exponential running time. 
These cases are unlikely, but inference in general has substantial overhead (see the Id paper).&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Acknowledgments&lt;/h3&gt;&lt;br /&gt;Many thanks to Andreas Rossberg, of &lt;a href="http://www.ps.uni-sb.de/alice/"&gt;Alice ML fame&lt;/a&gt;, for &lt;a href="http://lambda-the-ultimate.org/node/2699#comment-40506"&gt;patiently wading through my sometimes confused ramblings&lt;/a&gt;. :-)&lt;br /&gt;&lt;br /&gt;[Edit: GHC's recent &lt;a href="http://research.microsoft.com/~simonpj/papers/ptr-tag/index.htm"&gt;pointer tagging paper&lt;/a&gt; actually discusses the constructor inlining optimization I describe above, along with some hard numbers to quantify how many constructors can be eliminated. Turns out that &gt;95% of constructors contain 3 cases or less, so the technique could amount to a substantial savings. Theoretically, it  seems that every second nested data type could be unboxed in this fashion.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-137732533251197148?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/137732533251197148/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=137732533251197148' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/137732533251197148'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/137732533251197148'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/07/garbage-collection-representations.html' title='Garbage Collection Representations'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8588717654291980761</id><published>2008-07-18T10:38:00.000-04:00</published><updated>2011-09-26T01:53:08.368-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='C'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><title type='text'>Coroutines in C Redux</title><content type='html'>&lt;a href="http://higherlogics.blogspot.com/2008/07/coroutines-in-c.html"&gt;My last post&lt;/a&gt; generated a bit of discussion here and &lt;a href="http://www.reddit.com/info/6s5tt/comments/"&gt;on Reddit&lt;/a&gt;, where after further testing, I discovered that stack management didn't work properly on Windows. It would seem that embedded frame pointers on the stack are used to ensure that we are on the same stack, at least in DEBUG mode, and so growing or shrinking the stack generated cryptic errors.&lt;br /&gt;&lt;br /&gt;These problems motivated me to explore alternate methods of implementing coroutines. Fundamentally, the current state of a computation is encapsulated in the stack. So switching between two concurrent computations means modifying the stack, either by swapping out the old stack for an entirely new one, or by copying the relevant portion of the stack to a save area, and restoring it when switching back. 
Coroutines and threads typically use stack switching, and continuations have traditionally used some sort of copying.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://code.google.com/p/libconcurrency/source/browse/trunk"&gt;initial coroutine implementation&lt;/a&gt; used stack switching, which is fairly efficient, incurring only a register save, a pointer update, and a register restore. Stack copying involves a register save, a copy to a save area, a copy from a save area, and a register restore. The additional copies incur a certain overhead.&lt;br /&gt;&lt;br /&gt;The benefits of copying are that it's virtually guaranteed to work across all platforms, and that all stack data are restored to their original locations so the full semantics of C are available. Stack switching restricts certain C operations, such as taking the address of a local variable.&lt;br /&gt;&lt;br /&gt;My &lt;a href="http://code.google.com/p/libconcurrency/source/browse/branches/copying"&gt;naive copying implementation&lt;/a&gt; was roughly 40 times slower than stack switching. It performed about 300,000 context switches/sec on my Core 2 Duo, vs stack switching at ~12,500,000 context switches/sec. I reduced this overhead by a factor of about five &lt;a href="http://code.google.com/p/libconcurrency/source/browse/branches/copying-cache-stacks"&gt;by caching stacks&lt;/a&gt; (~1,500,000 context switches/sec). Caching wastes part of the stack save area, but at least we're not allocating and freeing on each context switch.&lt;br /&gt;&lt;br /&gt;The copying implementation can be optimized further using lazy stack copying. Basically, only a portion of the stack is restored, and an underflow handler is set up as the return address of the last stack frame. The underflow handler then restores the next portion of the stack and resumes the computation. 
If the underflow handler is never invoked, then we only need to copy that small portion of the stack back to the save area, so this ends up being a huge win.&lt;br /&gt;&lt;br /&gt;Unfortunately, while establishing an underflow handler is trivial to do in assembly, I haven't been able to figure out how to accomplish this in C. It would be a neat trick though, and I'd estimate that it would bring the performance of a copying implementation very close to a stack switching implementation.&lt;br /&gt;&lt;br /&gt;I've also started playing with an implementation of &lt;a href="http://code.google.com/p/libconcurrency/source/browse/branches/delimited-cont"&gt;delimited continuations&lt;/a&gt; for C, since this has the potential to reduce the amount of copying needed. It's tricky to do in C though, so I don't have anything solid yet.&lt;br /&gt;&lt;br /&gt;For expediency, I have also implemented coroutines using Windows Fibers. Unfortunately, Fibers don't support any sort of copy or clone operation, so I had to retire coro_clone for the time being. This means my coroutine library can no longer be used to implement multishot continuations. Of course, this capability is trivial in a stack copying implementation. 
Hopefully I'll be able to figure out a way to improve the performance of copying coroutines/continuations in a portable way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8588717654291980761?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8588717654291980761/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8588717654291980761' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8588717654291980761'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8588717654291980761'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/07/coroutines-in-c-redux.html' title='Coroutines in C Redux'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8961789776228184357</id><published>2008-07-16T20:00:00.001-04:00</published><updated>2011-09-26T02:06:01.148-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='C'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><title type='text'>Coroutines in C</title><content type='html'>I've just uploaded a functional coroutine library for C, called &lt;a 
href="http://code.google.com/p/libconcurrency/"&gt;libconcurrency&lt;/a&gt;. It's available under the LGPL. I think it's the most complete, most flexible, and simplest coroutine implementation I've seen, so hopefully it will find some use.&lt;br /&gt;&lt;br /&gt;Next on the todo list are some more rigorous tests of the corner cases, and then extending libconcurrency to scale across CPUs. This will make it the C equivalent of &lt;a href="http://manticore.cs.uchicago.edu/"&gt;Manticore&lt;/a&gt; for ML.&lt;br /&gt;&lt;br /&gt;There is a rich opportunity for scalable concurrency in C. Of course, I only built this library to serve as a core component of a virtual machine I'm building, and that's all I'm going to say about that. ;-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8961789776228184357?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8961789776228184357/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8961789776228184357' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8961789776228184357'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8961789776228184357'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/07/coroutines-in-c.html' title='Coroutines in C'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2417886234792074831</id><published>2008-05-26T19:18:00.000-04:00</published><updated>2011-09-26T02:09:40.431-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='logic puzzles'/><title type='text'>The Chicken and the Egg - An Inductive Analysis</title><content type='html'>In a slight departure from my usual computer science focused musings, I'm going to analyze another logical conundrum that has raged for centuries. Which came first, the chicken or the egg?&lt;br /&gt;&lt;br /&gt;One of my many online debates clued me into the fact that there is a widespread belief that the egg came first. I even found &lt;a href="http://radicalpedagogy.icaap.org/content/issue5_2/04_garner.html"&gt;a "paper"&lt;/a&gt; providing an in-depth analysis concluding the same. Unfortunately, the analysis appears flawed. An e-mail to the author bounced, so I figured I might as well post my brief analysis here.&lt;br /&gt;&lt;br /&gt;A simple inductive analysis suffices to conclusively determine the causal chain.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Base case:&lt;/em&gt; single celled organism, asexual reproduction via mitosis (given our current knowledge)&lt;br /&gt;&lt;em&gt;n implies n+1:&lt;/em&gt; species n produces, by its own reproduction mechanism R&lt;sub&gt;n&lt;/sub&gt;, an offspring n+1 with a slightly different reproduction mechanism, R&lt;sub&gt;n+1&lt;/sub&gt;.&lt;br /&gt;&lt;em&gt;Conclusion:&lt;/em&gt; the "chicken" came first.&lt;br /&gt;&lt;br /&gt;In this case, our "chickens" were not hatched from "chicken eggs", but were instead hatched from the "chicken's" progenitor's egg type. 
The authors of the paper attempted to disregard this semantic interpretation under "Chicken Precision", but as this is a metaphor, a "chicken" is merely a stand-in for a particular species to be substituted at will.&lt;br /&gt;&lt;br /&gt;Thus, the only universally quantifiable proposition is that the "chicken" came first, since the first egg-bearing species, S&lt;sub&gt;n+1&lt;/sub&gt; and R&lt;sub&gt;n+1&lt;/sub&gt;, was produced from some non-egg-based reproduction mechanism, R&lt;sub&gt;n&lt;/sub&gt;. The contrary proposition that the egg came first contradicts the base case of the above inductive argument where we know reproduction was based on asexual mitosis.&lt;br /&gt;&lt;br /&gt;Unless our understanding of early biology changes radically, metaphorically and literally speaking, "chickens" came first.&lt;br /&gt;&lt;br /&gt;[Edit: I &lt;a href="http://higherlogics.blogspot.com/2008/11/chicken-and-egg-redux.html"&gt;posted an update to elaborate&lt;/a&gt; on why I believe my interpretation is more faithful to the original intent of the question, as first formulated in ancient Greece.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2417886234792074831?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2417886234792074831/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2417886234792074831' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2417886234792074831'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2417886234792074831'/><link rel='alternate' type='text/html' 
href='http://higherlogics.blogspot.com/2008/05/chicken-and-egg-inductive-analysis.html' title='The Chicken and the Egg - An Inductive Analysis'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-3705754907225124429</id><published>2008-04-17T15:07:00.000-04:00</published><updated>2011-09-26T01:59:25.709-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='NHibernate'/><category scheme='http://www.blogger.com/atom/ns#' term='object oriented programming'/><category scheme='http://www.blogger.com/atom/ns#' term='relational programming'/><title type='text'>Object/Relational Mapping and Factories</title><content type='html'>My day job has stuck me with C#, but I try to make the best of it, as evidenced by my FP# and Sasa C# libraries.&lt;br /&gt;&lt;br /&gt;One thing that still gets in my way more than it should is O/R mapping. No other mapper I've come across encourages a true object-oriented application structure. Granted, I've only really used NHibernate, and I had built my own mapper before that was even available, but I've read up quite a bit on the other mappers.&lt;br /&gt;&lt;br /&gt;By true OO structure, I mean that all application objects are only constructed from other application objects, which doesn't involve dependencies on environment-specific code (i.e. whether you're running under ASP.NET, Windows Forms, Swing, etc.). 
A pure structure encourages a proper separation between core application code, and display and controller code, which allows more flexible application evolution.&lt;br /&gt;&lt;br /&gt;In practice, controller logic often manually constructs application objects, passing in default arguments to properly initialize the required fields. This means constructor and initialization code must be duplicated when running in another environment, or tedious refactoring is needed when changing the constructor interface. Further, the defaults are hardcoded, which means changes in defaults require an application upgrade.&lt;br /&gt;&lt;br /&gt;Instead, O/R mappers should promote a factory pattern for constructing application objects. Factories themselves are constructed when the application is initialized, and are henceforth singletons within a given application instance. O/R mappers don't support or encourage factories or singletons in this manner, however, as they always map a key/identifier to an object instance. Factories are slightly different as they are generally singletons.&lt;br /&gt;&lt;br /&gt;For example, let's assume we have a simple Product class:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public abstract class Product&lt;br /&gt;{&lt;br /&gt;  int productId;&lt;br /&gt;  decimal price;&lt;br /&gt;  protected Product()&lt;br /&gt;  {&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;Here we have a protected, parameterless constructor that initializes the base Product object. 
You can't sell 'abstract' products, so we need a concrete product, like a Table:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public class Table : Product&lt;br /&gt;{&lt;br /&gt;  int length;&lt;br /&gt;  int width;&lt;br /&gt;  public Table(int length, int width) : base()&lt;br /&gt;  {&lt;br /&gt;    this.length = length;&lt;br /&gt;    this.width = width;&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;Of course, a Table with dimensions of 0'x0' is invalid, so we need to ensure that a Table is initialized with a proper length and width. We can pass in a pair of default dimensions when constructing a Table instance in a controller, but chances are the default values will be the same every time you construct an instance of Table. So why duplicate all that code?&lt;br /&gt;&lt;br /&gt;For instance, suppose we have another class "DiningSet" which consists of a Table and a set of Chairs. Do we call the Table constructor with the same default values within the DiningSet constructor?&lt;br /&gt;&lt;br /&gt;Of course, many of you might now be thinking, "just create a parameterless constructor which invokes the parameterized constructor with the default values; done". All well and good because your language likely supports the int type very well. Now suppose that constructor needs an object that cannot be just constructed at will from within application code, such as an existing object in the database.&lt;br /&gt;&lt;br /&gt;Enter factories:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public interface IProductFactory&lt;br /&gt;{&lt;br /&gt;  Product Make();&lt;br /&gt;}&lt;br /&gt;public sealed class TableFactory : IProductFactory&lt;br /&gt;{&lt;br /&gt;  // Default dimensions, loaded from the database by the O/R mapper.&lt;br /&gt;  int defaultLength;&lt;br /&gt;  int defaultWidth;&lt;br /&gt;  public Product Make()&lt;br /&gt;  {&lt;br /&gt;    return new Table(defaultLength, defaultWidth);&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;The IProductFactory interface abstracts over all factories that construct products. 
Any parameters that the base Product class accepts in its constructor are passed to the Make() method, as this is shared across all product factories. TableFactory is mapped to a table with a single record containing the default length and width values. If the constructor requires an existing database object, this can be referenced via a foreign key constraint, and the O/R mapper will load the object reference and its dependencies for you.&lt;br /&gt;&lt;br /&gt;Since factories are generally singletons, it would be nice if O/R mappers provided special loading functions:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public interface ISession&lt;br /&gt;{&lt;br /&gt;  T Load&amp;lt;T&amp;gt;(object id);&lt;br /&gt;  T Singleton&amp;lt;T&amp;gt;();&lt;br /&gt;}&lt;/pre&gt;This models an O/R mapper session interface after the one in NHibernate. Note that a special Singleton() method simply loads the singleton of the given type without needing an object identifier.&lt;br /&gt;&lt;br /&gt;Our controller code is thus reduced to:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;...&lt;br /&gt;Product table = session.Singleton&amp;lt;TableFactory&amp;gt;().Make();&lt;br /&gt;...&lt;/pre&gt;This encapsulates all the constructor details in application objects, does not hardcode any default values since they live in the database and can be updated on the fly, isolates refactorings which alter the Table constructor interface to the TableFactory alone, and simplifies controller code as we don't need to load any objects. This is a "pure" object-oriented design, in that the application can almost bootstrap itself, instead of relying on its environment to properly endow it with "god-given" defaults.&lt;br /&gt;&lt;br /&gt;This approach also enables another useful application pattern which I may describe in a future post.&lt;br /&gt;&lt;br /&gt;[Edit: I've just realized that the above is misleading in some parts, so I'll amend soon. 
Singletons aren't needed as much as I suggest above.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-3705754907225124429?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/3705754907225124429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=3705754907225124429' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/3705754907225124429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/3705754907225124429'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/04/objectrelational-mapping-and-factories.html' title='Object/Relational Mapping and Factories'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8683914230986557558</id><published>2008-04-01T13:02:00.000-04:00</published><updated>2011-09-26T02:13:30.841-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='C'/><category scheme='http://www.blogger.com/atom/ns#' term='libraries'/><title type='text'>Permutations with Duplicates in C</title><content type='html'>Calculating permutations has some fairly nifty algorithms out there. I recently ran into a permutation problem for which I couldn't find an existing algorithm. I admit that I didn't look too hard though. 
Basically, I needed the permutations of the elements of a set of size N over K slots. However, the permutations should include duplicate elements from the set, as K &amp;gt; N is a valid configuration. This corresponds to N&lt;sup&gt;K&lt;/sup&gt; permutations. Most algorithms I found did not permit duplicate elements.&lt;br /&gt;&lt;br /&gt;As an example of an application for such a permutation algorithm, imagine the set of all function signatures of arity K-1 over N types. This corresponds to K slots (K-1 parameters plus a return type) with N possible choices for each slot.&lt;br /&gt;&lt;br /&gt;I devised a fairly simple implementation of such a permutation algorithm. Essentially, N forms the base of a K-digit integer. In other words, we have an array of K elements, each in the range 0 to N-1, which index our set. To permute, we simply increment the first element and propagate any overflow across the rest of the array. If the carry is still 1 after iterating over the whole array, the counter has wrapped around and we're done generating permutations.&lt;br /&gt;&lt;br /&gt;Calculating permutations has been reduced to counting! 
Here is the C code:&lt;br /&gt;&lt;pre class="brush: c"&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;br /&gt;&lt;br /&gt;// Print the K slots, lowest-order digit first.&lt;br /&gt;void print(const unsigned *slots, const unsigned K)&lt;br /&gt;{&lt;br /&gt;  unsigned i;&lt;br /&gt;  for (i = 0; i &amp;lt; K; ++i) {&lt;br /&gt;    printf("%4u", slots[i] );&lt;br /&gt;  }&lt;br /&gt;  printf("\n");&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// Print the current permutation, then add 1 to the base-N counter.&lt;br /&gt;// Returns 0 once the counter wraps around, i.e. once every&lt;br /&gt;// permutation has been printed.&lt;br /&gt;unsigned incr(unsigned *slots, const unsigned K, const unsigned N) {&lt;br /&gt;  unsigned i, carry;&lt;br /&gt;  print(slots, K);&lt;br /&gt;  for (i=0, carry=1; i &amp;lt; K; ++i) {&lt;br /&gt;    unsigned b = slots[i] + carry;&lt;br /&gt;    carry = b/N;&lt;br /&gt;    slots[i] = b % N;&lt;br /&gt;  }&lt;br /&gt;  return !carry;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;void count(const unsigned N, const unsigned K) {&lt;br /&gt;  unsigned *slots = calloc(K, sizeof(unsigned));&lt;br /&gt;  if (slots == NULL) return;&lt;br /&gt;  while(incr(slots, K, N)) {&lt;br /&gt;  }&lt;br /&gt;  free(slots);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;//output:&lt;br /&gt;//   0   0   0&lt;br /&gt;//   1   0   0&lt;br /&gt;//   0   1   0&lt;br /&gt;//   1   1   0&lt;br /&gt;//   0   0   1&lt;br /&gt;//   1   0   1&lt;br /&gt;//   0   1   1&lt;br /&gt;//   1   1   1&lt;br /&gt;int main(int argc, char ** argv) {&lt;br /&gt;  count(2, 3);&lt;br /&gt;  getchar();&lt;br /&gt;  return 0;&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The only assumptions are that N is nonzero and less than UINT_MAX. Also, for some insane reason that I cannot fathom, I can't get the slots to print in reverse order. 
The trivial reversal of the print loop induces an access violation under Windows:&lt;br /&gt;&lt;pre class="brush: c"&gt;void print(const unsigned *slots, const unsigned K)&lt;br /&gt;{&lt;br /&gt;  unsigned i;&lt;br /&gt;  for (i = K-1; i &amp;gt;= 0; --i) {&lt;br /&gt;    printf("%4d", slots[i] );&lt;br /&gt;  }&lt;br /&gt;  printf("\n");&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;The mind boggles.&lt;br /&gt;&lt;br /&gt;[Edit: a commenter pointed out the problem: i is unsigned, so the condition i &amp;gt;= 0 is always true, and decrementing past zero wraps around to UINT_MAX, which indexes far outside the array. The solution can be found in the comments.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8683914230986557558?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8683914230986557558/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8683914230986557558' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8683914230986557558'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8683914230986557558'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/04/permutations-with-duplicates-in-c.html' title='Permutations with Duplicates in C'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-537957412787411152</id><published>2008-02-11T13:53:00.000-05:00</published><updated>2011-09-26T02:09:51.162-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='logic puzzles'/><title type='text'>Blue-eyed Islander Puzzle - an analysis</title><content type='html'>Many people find themselves stumped by the so-called &lt;a href="http://terrytao.wordpress.com/2008/02/05/the-blue-eyed-islanders-puzzle/"&gt;Blue-Eyed Islanders puzzle&lt;/a&gt;. There is also much controversy over its supposed solution. I'm going to analyze the problem and the solution, and in the process, explain why the solution works.&lt;br /&gt;&lt;br /&gt;To begin, let's modify the problem slightly and say that there's only 1 blue-eyed islander. When the foreigner makes his pronouncement, the blue-eyed islander looks around and sees no other blue eyes, and being logical, correctly deduces that his own eyes must be blue in order for the foreigner's statement to make sense. The lone blue-eyed islander thus commits suicide the following day at noon.&lt;br /&gt;&lt;br /&gt;Now comes the tricky part, and the source of much confusion. Let's say there are 2 blue-eyed islanders, Mort and Bob. When the foreigner makes his pronouncement, Mort and Bob look around and see only each other. Mort and Bob thus both temporarily assume that the other will commit suicide the following day at noon. Imagine their chagrin when they gather in the village square at noon the next day, and Mort and Bob both look at each other in surprise because Mort thought Bob was going to commit suicide, and Bob thought the same of Mort! 
Now both Mort AND Bob know their own eye colour is blue, and they will both commit suicide on day 2.&lt;br /&gt;&lt;br /&gt;The very same argument can be extended to 3 blue-eyed islanders, Mort, Bob and Sue, who will commit suicide on the third day at noon. The day of the pronouncement, the three of them see each other, and Sue assumes Mort and Bob see only each other. Being logical, she thus deduces that they will commit suicide on the second day, by the above argument. Mort and Bob each see the same number of blue eyes as Sue, and thus reach the very same conclusions.&lt;br /&gt;&lt;br /&gt;Imagine Sue's chagrin when she gathers in the village square on the second day, and Mort and Bob are both surprised that she's not committing suicide! She now knows that her eye colour is also blue, and all three of them will kill themselves on the third day.&lt;br /&gt;&lt;br /&gt;This inductive argument can be generalized to N blue-eyed islanders: each of them sees N-1 others with blue eyes, expects those N-1 to commit suicide on day N-1, and, when nothing happens, concludes that his own eyes are blue too. All N of them will thus commit suicide on the Nth day after the pronouncement. 
QED.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-537957412787411152?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/537957412787411152/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=537957412787411152' title='22 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/537957412787411152'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/537957412787411152'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/02/blue-eyed-islander-puzzle-analysis.html' title='Blue-eyed Islander Puzzle - an analysis'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>22</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-4371384877022838709</id><published>2008-01-19T14:27:00.000-05:00</published><updated>2011-09-26T01:49:34.043-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>An Almost Type-Safe General Monad in C#, aka how to Abstract over Type Constructors using Dynamics</title><content type='html'>Extending the work in &lt;a href="http://higherlogics.blogspot.com/2008/01/worst-monad-tutorial-except-for-all.html"&gt;my 
 last post&lt;/a&gt;, I've developed a way to express &lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Monad/Alternate/Monad.cs?view=markup"&gt;an almost type-safe, general monad in C#&lt;/a&gt;. Similar to my module translation, the single monad object becomes a pair of co-operating objects, only one of which the monad implementor must define. Since C# cannot abstract over type constructors, I had to exploit the only feature that could accommodate the flexibility I needed: C#'s dynamic typing.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;// The Monad object, indexed by a singleton type that implements the&lt;br /&gt;// monad operations.&lt;br /&gt;public sealed class Monad&amp;lt;M, T&amp;gt;&lt;br /&gt;        where M : struct, IMonadOps&amp;lt;M&amp;gt; { ... }&lt;br /&gt;&lt;br /&gt;// An object that implements operations on the monad's encapsulated&lt;br /&gt;// state.&lt;br /&gt;public interface IMonadOps&amp;lt;M&amp;gt;&lt;br /&gt;   where M : struct, IMonadOps&amp;lt;M&amp;gt;&lt;br /&gt;{&lt;br /&gt;   // Return the encapsulated state for the monad's zero value.&lt;br /&gt;   object Zero&amp;lt;T&amp;gt;();&lt;br /&gt;&lt;br /&gt;   // Return the encapsulated state for the 'unit' operation.&lt;br /&gt;   object Unit&amp;lt;T&amp;gt;(T t);&lt;br /&gt;&lt;br /&gt;   // Perform a bind operation given the monad's encapsulated state,&lt;br /&gt;   // and the binding function f.&lt;br /&gt;   Monad&amp;lt;M, R&amp;gt; Bind&amp;lt;T, R&amp;gt;(&lt;br /&gt;     object state,&lt;br /&gt;     Fun&amp;lt;T, Monad&amp;lt;M, R&amp;gt;&amp;gt; f,&lt;br /&gt;     Fun&amp;lt;Monad&amp;lt;M, R&amp;gt;, object&amp;gt; from,&lt;br /&gt;     Fun&amp;lt;object, Monad&amp;lt;M, R&amp;gt;&amp;gt; to);&lt;br /&gt;}&lt;/pre&gt;Looks a bit complicated, I know, but every piece is well-motivated. IMonadOps is fortunately the only interface that new monads must implement. Note the type constraints. 
The interface and the general monad both have a "phantom" or "witness" type M, constraining the type of the monad operations to the same IMonadOps type that created the monad. This means that the monad is entirely closed to extension and inspection.&lt;br /&gt;&lt;br /&gt;Each IMonadOps is effectively a stateless singleton. The constraint to a struct is merely an optimization. Every IMonadOps implementor is actually very similar to the body of an ML module. If I want to invoke Identity.Zero, I can do it thusly:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;default(Identity).Zero&amp;lt;int&amp;gt;();&lt;/pre&gt;This reveals the magic I used to define the Monad type. Whenever the Monad&amp;lt;M,T&amp;gt; type needs to invoke the underlying monad operations, it invokes the operation on the corresponding singleton M just like the above. Indexing the Monad by an IMonadOps implementation M, is like a linking step between Monad and M.&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public sealed class Monad&amp;lt;M, T&amp;gt;&lt;br /&gt;    where M : struct, IMonadOps&amp;lt;M&amp;gt;&lt;br /&gt;{&lt;br /&gt;    object state;&lt;br /&gt;&lt;br /&gt;    Monad(object state)&lt;br /&gt;    {&lt;br /&gt;        this.state = state;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // The projection function.&lt;br /&gt;    static object get&amp;lt;R&amp;gt;(Monad&amp;lt;M, R&amp;gt; m)&lt;br /&gt;    {&lt;br /&gt;        return m.state;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // The injection function.&lt;br /&gt;    static Monad&amp;lt;M, R&amp;gt; ctor&amp;lt;R&amp;gt;(object state)&lt;br /&gt;    {&lt;br /&gt;        return new Monad&amp;lt;M, R&amp;gt;(state);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // The standard 'bind' operation. 
It dispatches to the&lt;br /&gt;    // type-indexed 'bind', defined by M.&lt;br /&gt;    public Monad&amp;lt;M, R&amp;gt; Bind&amp;lt;R&amp;gt;(Fun&amp;lt;T, Monad&amp;lt;M, R&amp;gt;&amp;gt; f)&lt;br /&gt;    {&lt;br /&gt;        return default(M).Bind&amp;lt;T, R&amp;gt;(state, f, get&amp;lt;R&amp;gt;, ctor&amp;lt;R&amp;gt;);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // The standard 'unit' operation, injecting a value into&lt;br /&gt;    // the monad. It dispatches to the type-indexed unit&lt;br /&gt;    // function defined by M.&lt;br /&gt;    public static Monad&amp;lt;M, T&amp;gt; Unit(T t)&lt;br /&gt;    {&lt;br /&gt;        return new Monad&amp;lt;M, T&amp;gt;(default(M).Unit&amp;lt;T&amp;gt;(t));&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public static Monad&amp;lt;M, T&amp;gt; Zero()&lt;br /&gt;    {&lt;br /&gt;        return new Monad&amp;lt;M, T&amp;gt;(default(M).Zero&amp;lt;T&amp;gt;());&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;You can see the dispatching at work here. All monad operations are dispatched to the "linked" methods of M. In a sense, we have succeeded in abstracting over the concrete implementation of Monad.&lt;br /&gt;&lt;br /&gt;Now comes the catch: it's not fully type-safe, because the encapsulated state of the monad must be stored as 'object'. This means that each monad body must ensure it properly casts to and from the appropriate type. This is again due to the type constructor abstraction limitation.&lt;br /&gt;&lt;br /&gt;Your first thought might be, why can't the encapsulated state simply be 'T'? Well, if all you wanted was an Identity monad, then that would be fine. 
But consider the Maybe monad:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public struct Maybe : IMonadOps&amp;lt;Maybe&amp;gt;&lt;br /&gt;{&lt;br /&gt;    public object Unit&amp;lt;T&amp;gt;(T t)&lt;br /&gt;    {&lt;br /&gt;        return new Option&amp;lt;T&amp;gt;(t);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public object Zero&amp;lt;T&amp;gt;()&lt;br /&gt;    {&lt;br /&gt;        return new Option&amp;lt;T&amp;gt;();&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public Monad&amp;lt;Maybe, R&amp;gt; Bind&amp;lt;T, R&amp;gt;(&lt;br /&gt;      object state,&lt;br /&gt;      Fun&amp;lt;T, Monad&amp;lt;Maybe, R&amp;gt;&amp;gt; f,&lt;br /&gt;      Fun&amp;lt;Monad&amp;lt;Maybe, R&amp;gt;, object&amp;gt; from,&lt;br /&gt;      Fun&amp;lt;object, Monad&amp;lt;Maybe, R&amp;gt;&amp;gt; to)&lt;br /&gt;    {&lt;br /&gt;        Option&amp;lt;T&amp;gt; value = (Option&amp;lt;T&amp;gt;)state;&lt;br /&gt;        return (value.HasValue) ? f(value.Value) : Fail&amp;lt;R&amp;gt;();&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public static Monad&amp;lt;Maybe, T&amp;gt; Fail&amp;lt;T&amp;gt;()&lt;br /&gt;    {&lt;br /&gt;        return Monad&amp;lt;Maybe, T&amp;gt;.Zero();&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;The encapsulated state is an Option of type T, not a T. As another example, the List monad encapsulates a list of T's. Since we can't abstract over the type constructor for the encapsulated state, we thus need to resort to dynamic typing.&lt;br /&gt;&lt;br /&gt;Now comes a slightly bizarre part: what are the injection and projection functions for? Well, despite the fact that IMonadOps is the "internal implementation" of Monad, it doesn't have direct access to the monad's internals. Unfortunately, sometimes that access is needed. 
Consider the List monad:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public struct ListM : IMonadOps&amp;lt;ListM&amp;gt;&lt;br /&gt;{&lt;br /&gt;    public object Unit&amp;lt;T&amp;gt;(T t)&lt;br /&gt;    {&lt;br /&gt;        return List.Cons&amp;lt;T&amp;gt;(t, List.Nil&amp;lt;T&amp;gt;());&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public object Zero&amp;lt;T&amp;gt;()&lt;br /&gt;    {&lt;br /&gt;        return List.Nil&amp;lt;T&amp;gt;();&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public Monad&amp;lt;ListM, R&amp;gt; Bind&amp;lt;T, R&amp;gt;(&lt;br /&gt;      object state,&lt;br /&gt;      Fun&amp;lt;T, Monad&amp;lt;ListM, R&amp;gt;&amp;gt; f,&lt;br /&gt;      Fun&amp;lt;Monad&amp;lt;ListM, R&amp;gt;, object&amp;gt; from,&lt;br /&gt;      Fun&amp;lt;object, Monad&amp;lt;ListM, R&amp;gt;&amp;gt; to)&lt;br /&gt;    {&lt;br /&gt;        return to(&lt;br /&gt;            List.MapFlat&amp;lt;T, R&amp;gt;(&lt;br /&gt;                state as List.t&amp;lt;T&amp;gt;, delegate(T t)&lt;br /&gt;                {&lt;br /&gt;                    return from(f(t)) as List.t&amp;lt;R&amp;gt;;&lt;br /&gt;                }));&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;ListM needs access to the private state of the returned list of Monads in order to flatten the list, but that access is not permitted since Monad is a separate, encapsulated type. There is no way to make this state available using inheritance or access modifiers, without also permitting the state to escape inadvertently.&lt;br /&gt;&lt;br /&gt;Instead, the Monad provides an injection/projection pair, which are used to construct monad instances when given private state, or read out the private state of a monad instance, respectively. 
Note that encapsulation is maintained since this ability is granted only to the implementor of a given monad, which is already trusted with its own state.&lt;br /&gt;&lt;br /&gt;I suspect there's a more efficient way to share the monad's state, but I'm a little tired from standing on my head all day for C#, so if anyone has any ideas, I welcome them. :-)&lt;br /&gt;&lt;br /&gt;While this encoding is less efficient than the one described in my previous post, it's safer in some ways for users and monad implementors alike, and I proved to myself that C#'s type system is powerful enough to encode the monad interface if you contort yourself appropriately. This technique for abstracting over type constructors might even be usable in &lt;a href="http://higherlogics.blogspot.com/2008/01/ml-modules-in-c-sorely-missing.html"&gt;my tagless Orc implementation&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-4371384877022838709?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/4371384877022838709/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=4371384877022838709' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4371384877022838709'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/4371384877022838709'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/01/almost-type-safe-general-monad-in-c-aka.html' title='An Almost Type-Safe General Monad in C#, aka how to Abstract over Type Constructors using Dynamics'/><author><name>Sandro 
Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-5509086916233568776</id><published>2008-01-18T19:33:00.000-05:00</published><updated>2011-09-26T01:59:05.238-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>The Worst Monad Tutorial... Except For All Those Others.</title><content type='html'>I've found other monad tutorials very frustrating. They are typically written in expressive languages with type inference, which permits concise descriptions, but obscures the underlying type structure. I've been struggling with writing something close to a monad in C# for quite some time, simply because none of these tutorials give a sufficiently complete description of a monad's structure. Surprisingly, the &lt;a href="http://en.wikipedia.org/wiki/Monads_in_functional_programming"&gt;Wikipedia page on monads helped clarify&lt;/a&gt; what I was missing.&lt;br /&gt;&lt;br /&gt;Here is &lt;a href="http://www.haskell.org/all_about_monads/html/meet.html#maybe"&gt;the general structure of a monad all these tutorials use&lt;/a&gt;:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;-- the type of monad m&lt;br /&gt;type m a = ... 
&lt;br /&gt;&lt;br /&gt;-- return is a function that creates monad instances &lt;br /&gt;return :: a &amp;rarr; m a&lt;br /&gt;&lt;br /&gt;-- bind is a function that combines a monad instance m a with a&lt;br /&gt;-- computation that produces another monad instance m b from a's&lt;br /&gt;-- to produce a new monad instance m b&lt;br /&gt;bind :: m a &amp;rarr; (a &amp;rarr; m b) &amp;rarr; m b&lt;/pre&gt;&lt;br /&gt;So the monad type 'm' has a function 'return' that constructs instances of that type, and a function 'bind' that converts an instance of that monad into another instance of a monad by passing its private state to the provided function. For instance, meet Haskell's &lt;a href="http://www.haskell.org/all_about_monads/html/identitymonad.html"&gt;Identity monad&lt;/a&gt;:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;instance Monad Identity where&lt;br /&gt;    return a           = Identity a   -- i.e. return = id &lt;br /&gt;    (Identity x) &amp;gt;&amp;gt;= f = f x          -- i.e. x &amp;gt;&amp;gt;= f = f x&lt;/pre&gt;&lt;br /&gt;Looks simple enough. Now meet &lt;a href="http://www.haskell.org/all_about_monads/html/listmonad.html"&gt;Haskell's List monad&lt;/a&gt;:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;instance Monad [] where&lt;br /&gt;    bind m f  = concatMap f m&lt;br /&gt;    return x = [x]&lt;br /&gt;    fail s   = []&lt;/pre&gt;&lt;br /&gt;What's not at all obvious from any of the above signatures, or any of the existing tutorials, is that the monad that the function f returns must be &lt;strong&gt;&lt;em&gt;the same monad type&lt;/em&gt;&lt;/strong&gt;. If you're in the list monad, f must return a new instance of the list monad. If you're in the identity monad, f must return a new instance of the identity monad. This simple fact eluded me for quite some time, and explains why the monad interface cannot be expressed in C#. 
We can still program using monads, but the interface can't be enforced by the type system.&lt;br /&gt;&lt;br /&gt;So without further ado, here is the &lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Monad/Monads/Identity.cs?view=markup"&gt;Identity monad in C#&lt;/a&gt;:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public class Identity&amp;lt;T&amp;gt; : Monad&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  T value;&lt;br /&gt;&lt;br /&gt;  public Identity(T t)&lt;br /&gt;  {&lt;br /&gt;    value = t;&lt;br /&gt;  }&lt;br /&gt;  public Identity&amp;lt;B&amp;gt; Bind&amp;lt;B&amp;gt;(Fun&amp;lt;T, Identity&amp;lt;B&amp;gt;&amp;gt; f)&lt;br /&gt;  {&lt;br /&gt;    return f(value);&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;The &lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Monad/Monad.cs?view=markup"&gt;Monad&amp;lt;T&amp;gt; base class is actually empty&lt;/a&gt;, so it's just a marker interface. For those of you unfamiliar with the "Fun" delegate, it's just one of the many standard delegates I use for function signatures in &lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Types/Fn.cs?view=markup"&gt;my FP# library&lt;/a&gt;. 
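&lt;br /&gt;&lt;br /&gt;To see how Bind chains computations, here is a minimal, self-contained usage sketch. It assumes Fun is declared as a one-argument delegate (the actual FP# declaration may differ); the Value property and the Demo class are hypothetical additions, present only so the result can be observed:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;&lt;br /&gt;&lt;br /&gt;// Assumed shape of the Fun delegate: one argument, one result.&lt;br /&gt;public delegate R Fun&amp;lt;T, R&amp;gt;(T arg);&lt;br /&gt;&lt;br /&gt;// Empty marker base class, as described above.&lt;br /&gt;public class Monad&amp;lt;T&amp;gt; { }&lt;br /&gt;&lt;br /&gt;public class Identity&amp;lt;T&amp;gt; : Monad&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  T value;&lt;br /&gt;&lt;br /&gt;  public Identity(T t) { value = t; }&lt;br /&gt;&lt;br /&gt;  public Identity&amp;lt;B&amp;gt; Bind&amp;lt;B&amp;gt;(Fun&amp;lt;T, Identity&amp;lt;B&amp;gt;&amp;gt; f) { return f(value); }&lt;br /&gt;&lt;br /&gt;  // Hypothetical accessor, not part of the class above.&lt;br /&gt;  public T Value { get { return this.value; } }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static class Demo&lt;br /&gt;{&lt;br /&gt;  public static void Main()&lt;br /&gt;  {&lt;br /&gt;    // Each step receives the wrapped value and returns a new Identity.&lt;br /&gt;    Identity&amp;lt;string&amp;gt; result = new Identity&amp;lt;int&amp;gt;(20)&lt;br /&gt;      .Bind&amp;lt;int&amp;gt;(delegate(int x) { return new Identity&amp;lt;int&amp;gt;(x + 1); })&lt;br /&gt;      .Bind&amp;lt;string&amp;gt;(delegate(int x) { return new Identity&amp;lt;string&amp;gt;("n=" + x); });&lt;br /&gt;    Console.WriteLine(result.Value); // prints "n=21"&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;Both functions passed to Bind return an Identity, which is exactly the same-monad requirement described earlier. The compiler enforces it here only because Bind is declared on the concrete Identity class, not through any shared interface.&lt;br /&gt;&lt;br /&gt;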
Here is the &lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Monad/Monads/List.cs?view=markup"&gt;List monad in C#&lt;/a&gt;:&lt;pre class="brush: csharp"&gt;public class ListMonad&amp;lt;T&amp;gt; : Monad&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;    protected List.t&amp;lt;T&amp;gt; l;&lt;br /&gt;&lt;br /&gt;    public ListMonad(T t)&lt;br /&gt;    {&lt;br /&gt;        // Wrap the single value in a one-element list.&lt;br /&gt;        l = List.Cons&amp;lt;T&amp;gt;(t, List.Nil&amp;lt;T&amp;gt;());&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public ListMonad()&lt;br /&gt;    {&lt;br /&gt;        l = List.Nil&amp;lt;T&amp;gt;();&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // Wrap an existing list; used by Bind.&lt;br /&gt;    protected ListMonad(List.t&amp;lt;T&amp;gt; list)&lt;br /&gt;    {&lt;br /&gt;        l = list;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public ListMonad&amp;lt;B&amp;gt; Bind&amp;lt;B&amp;gt;(Fun&amp;lt;T, ListMonad&amp;lt;B&amp;gt;&amp;gt; f)&lt;br /&gt;    {&lt;br /&gt;        return new ListMonad&amp;lt;B&amp;gt;(&lt;br /&gt;            List.MapFlat&amp;lt;T, B&amp;gt;(&lt;br /&gt;                l, delegate(T t) { return f(t).l; }));&lt;br /&gt;    }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Note that all of these monads in C# share a common structure. They have at least one constructor used to wrap a value (called 'return' earlier), and they all have a Bind method which operates on the value encapsulated in the monad, and maps it to a new instance of the monad. Since we have such a similar structure, is there a way to declare an interface or an abstract base class declaring the signature for the Bind method?&lt;br /&gt;&lt;br /&gt;Unfortunately not, because &lt;a href="http://higherlogics.blogspot.com/2008/01/ml-modules-in-c-sorely-missing.html"&gt;C# cannot abstract over type constructors&lt;/a&gt;. 
If it could, the abstract monad class and the Identity monad would look something like:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public abstract class Monad&amp;lt;M,T&amp;gt; where &lt;em&gt;&lt;strong&gt;M : Monad&lt;/strong&gt;&lt;/em&gt;&lt;br /&gt;{&lt;br /&gt;  public abstract M&amp;lt;R&amp;gt; Bind&amp;lt;R&amp;gt;(Fun&amp;lt;T, M&amp;lt;R&amp;gt;&amp;gt; f);&lt;br /&gt;}&lt;br /&gt;public sealed class Identity&amp;lt;T&amp;gt; : Monad&amp;lt;&lt;em&gt;&lt;strong&gt;Identity&lt;/strong&gt;&lt;/em&gt;,T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  ...&lt;br /&gt;  public override Identity&amp;lt;R&amp;gt; Bind&amp;lt;R&amp;gt;(Fun&amp;lt;T, Identity&amp;lt;R&amp;gt;&amp;gt; f)&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Note the two emphasized sections: the class constraint on Monad, and the type parameter Identity provides to Monad when inheriting from it. They are both used without a type argument. This is illegal in C#/.NET, but it's perfectly legal in languages with more powerful type systems, such as those with "kinds". Identity&amp;lt;int&amp;gt; has kind *, while Identity without a type argument has kind *&amp;rArr;*, i.e. a function that constructs a type of kind * when given a type of kind *. 
This is why monads are not so easily translated into languages like C#.&lt;br /&gt;&lt;br /&gt;Coming soon, a real example of using monads in C#?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-5509086916233568776?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/5509086916233568776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=5509086916233568776' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5509086916233568776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/5509086916233568776'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/01/worst-monad-tutorial-except-for-all.html' title='The Worst Monad Tutorial... Except For All Those Others.'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7664526892055997839</id><published>2008-01-16T19:09:00.000-05:00</published><updated>2009-11-16T09:26:05.973-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Towards the best collection API... in C#. 
And some partial applications too.</title><content type='html'>The venerable Oleg Kiselyov &lt;a href="http://okmij.org/ftp/papers/LL3-collections-enumerators.txt"&gt;once posted about the "best" collection traversal API&lt;/a&gt;. Let's call this ideal iterator a "SuperFold". &lt;a href="http://lambda-the-ultimate.org/node/1224"&gt;LTU also covered his article&lt;/a&gt;. Essentially, a SuperFold is a left fold with early termination support. Any cursor can then be automatically derived from the SuperFold. The converse is not true.&lt;br /&gt;&lt;br /&gt;Additional arguments are made in the above paper and in the LTU thread, so I won't repeat them here. Without further ado, I present the &lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.List/List.cs?revision=39&amp;amp;view=markup"&gt;SuperFold for my purely functional list in C#&lt;/a&gt;:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;//OCaml signature: ((T &amp;rarr; B &amp;rarr; bool * B) &amp;rarr; B &amp;rarr; B) &amp;rarr; (T &amp;rarr; B &amp;rarr; bool * B) &amp;rarr; B &amp;rarr; B&lt;br /&gt;B SuperFold&amp;lt;B&amp;gt;(&lt;br /&gt;  Fun&amp;lt;Fun&amp;lt;T, B, Pair&amp;lt;bool, B&amp;gt;&amp;gt;, B, B&amp;gt; self,&lt;br /&gt;  Fun&amp;lt;T, B, Pair&amp;lt;bool, B&amp;gt;&amp;gt; proc,&lt;br /&gt;  B seed)&lt;br /&gt;{&lt;br /&gt;  bool cont;&lt;br /&gt;  proc(head, seed).Bind(out cont, out seed);&lt;br /&gt;  return cont ? self(proc, seed) : seed;&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;While quite simple, it's not as efficient as it should be since C#/.NET doesn't support proper tail calls. You can see in that source file that I derived FoldLeftDerived from SuperFold. Deriving FoldRight is a bit trickier, so I have to think about it. 
The simple, inefficient answer is to reverse the list.&lt;br /&gt;&lt;br /&gt;I've also enhanced the FP# library with:&lt;ol&gt;&lt;li&gt; A number of tuple types (up to 10),&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Types/Tuple.cs?view=markup"&gt;Tuple inference&lt;/a&gt;, just call: Tuple._(a,b,c,...) &lt;/li&gt;&lt;li&gt;&lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.List/List.cs?revision=39&amp;amp;view=markup"&gt;streamed versions of Map/Filter/FoldRight over IEnumerable&lt;/a&gt; which don't build intermediate lists,&lt;/li&gt;&lt;li&gt;FoldLeft over IEnumerable including support for early termination,&lt;/li&gt;&lt;li&gt;FoldLeft over my purely functional list including support for early termination,&lt;/li&gt;&lt;li&gt;Option type now looks more like System.Nullable, with an overload for the | operator to choose between an empty option, or a default value [1],&lt;/li&gt;&lt;li&gt;&lt;a href="http://fpsharp.svn.sourceforge.net/viewvc/fpsharp/trunk/FP.Types/Partial.cs?view=markup"&gt;Partial application for almost all defined function types&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;I'll probably remove SuperFold from my purely functional list, since I don't think it will end up being useful in C#. The iterated FoldLeft I defined is more efficient because C#/.NET lacks proper tail calls, and the iterated version also supports early termination.&lt;br /&gt;&lt;br /&gt;[1] It's very frustrating to see how close MS gets to a truly general and useful abstraction, only to lock it down for no apparent reason. What good reason is there for Nullable to be restricted to struct types? If you give it more than a second's thought, I think you'll realize that there is no good reason. 
Nullable is the option type if it weren't for this restriction!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7664526892055997839?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7664526892055997839/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7664526892055997839' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7664526892055997839'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7664526892055997839'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/01/towards-best-collection-api-in-c-and.html' title='Towards the best collection API... in C#. 
And some partial applications too.'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8909700212889074048</id><published>2008-01-08T11:43:00.000-05:00</published><updated>2011-09-26T01:58:40.275-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='tagless interpreters'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>ML Modules in C# - Sorely Missing Polymorphic Type Constructors</title><content type='html'>As &lt;a href="http://lambda-the-ultimate.org/node/2569#comment-39031"&gt;Chung-chieh Shan pointed out&lt;/a&gt;, my encoding of modules in C# is somewhat limited. In particular, I cannot abstract over type constructors, which is to say, C# is missing generics over generics. Consider the Orc.NET interpreter:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;class Orc {&lt;br /&gt;  class Exp&amp;lt;T&amp;gt; { ... }&lt;br /&gt;&lt;br /&gt;  public Exp&amp;lt;U&amp;gt; Seq&amp;lt;T,U&amp;gt;(Exp&amp;lt;T&amp;gt; e1, Exp&amp;lt;U&amp;gt; e2) { ... }&lt;br /&gt;  public Exp&amp;lt;T&amp;gt; Par&amp;lt;T&amp;gt;(Exp&amp;lt;T&amp;gt; e1, Exp&amp;lt;T&amp;gt; e2) { ... }&lt;br /&gt;  public Exp&amp;lt;T&amp;gt; Where&amp;lt;T&amp;gt;(Exp&amp;lt;T&amp;gt; e1, Exp&amp;lt;Promise&amp;lt;T&amp;gt;&amp;gt; e2) { ... }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;This is the result of my translation, which was necessitated by the "Where" method. 
Where introduces a dependency which currently cannot be expressed with ordinary C# constraints, so the module encoding is necessary.&lt;br /&gt;&lt;br /&gt;The above interface is a direct, faithful implementation of the Orc semantics. The implementation I currently have is an interpreter for those semantics. What if I want to provide a compiler instead? &lt;a href="http://lambda-the-ultimate.org/node/2438"&gt;The interface should remain the same&lt;/a&gt;, but the implementation should differ. This is a classic abstraction problem.&lt;br /&gt;&lt;br /&gt;It's clear that the implementations can be changed at compile-time, but providing a different Orc implementation at runtime, something very natural for OO languages, does not seem possible. The reason is that C# lacks the requisite polymorphism.&lt;br /&gt;&lt;br /&gt;A generic type is a type constructor, which means that from an abstract type definition, Exp&amp;lt;T&amp;gt;, you can construct a concrete type by providing a concrete type T, such as Exp&amp;lt;int&amp;gt;. But, if Exp&amp;lt;T&amp;gt; is not a concrete type until you provide it a concrete type T, what is the type of the abstract definition Exp&amp;lt;T&amp;gt;?&lt;br /&gt;&lt;br /&gt;Reasoning about type constructors requires the introduction of something called "kinds". As types classify values, so kinds classify types. The set of values, or concrete types, is of kind *. The set of type constructors is of kind *&amp;rArr;*, which is to say they are "compile-time functions" accepting a concrete type and producing a concrete type.&lt;br /&gt;&lt;br /&gt;Now consider that we have multiple Exp&amp;lt;T&amp;gt; implementations, say Interpreter.Orc.Exp and Compiler.Orc.Exp, all with different innards, and all define the same operations and thus are theoretically interchangeable. We would somehow like to abstract over the different implementations of Exp&amp;lt;T&amp;gt; so that we can use whichever one is most appropriate. 
In other words, we want to make our code polymorphic over a set of generic types Exp&amp;lt;T&amp;gt;.&lt;br /&gt;&lt;br /&gt;This necessitates type constructor constructors, the kind *&amp;rArr;*&amp;rArr;*, which accepts a type such as Orc.Compiler.Exp, and produces a type constructor Exp&amp;lt;T&amp;gt;.&lt;br /&gt;&lt;br /&gt;At the moment, I can't think of a way to encode such polymorphism in C#. Functional languages provide this level of polymorphism, and some even provide higher-order kinds, meaning type constructors can be arguments to other type constructors, thus achieving entirely new levels of expressiveness.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8909700212889074048?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8909700212889074048/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8909700212889074048' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8909700212889074048'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8909700212889074048'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2008/01/ml-modules-in-c-sorely-missing.html' title='ML Modules in C# - Sorely Missing Polymorphic Type Constructors'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8041367605490434924</id><published>2007-12-28T10:53:00.000-05:00</published><updated>2011-09-26T02:10:41.966-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>ML Modules in C#</title><content type='html'>I've &lt;a href="http://higherlogics.blogspot.com/2007/11/generalizing-c-generics.html"&gt;written about the limitations of C#'s equational constraints before&lt;/a&gt;. Truth is, I now believe that any such limits can be circumvented by a relatively simple translation. The result is less "object-oriented", as it requires a set of cooperating objects instead of being encapsulated in a single object.&lt;br /&gt;&lt;br /&gt;Let's take a simple, unsafe list flattening operation described in &lt;a href="http://higherlogics.blogspot.com/2007/04/general-pattern-matching-with-visitors.html"&gt;Generalized Algebraic Data Types and Object-Oriented Programming (GADTOOP)&lt;/a&gt;. This can be expressed in OCaml as:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;module List = struct &lt;br /&gt;  type 'a t = Nil | Cons of 'a * 'a t&lt;br /&gt;  let rec append l1 l2 = match l1 with&lt;br /&gt;    | Nil -&amp;gt; l2&lt;br /&gt;    | Cons(a, l) -&amp;gt; Cons(a, append l l2)&lt;br /&gt;  let rec flatten la = match la with&lt;br /&gt;    | Cons(a, l) -&amp;gt; append a (flatten l)&lt;br /&gt;    | Nil -&amp;gt; Nil&lt;br /&gt;end&lt;/pre&gt;&lt;br /&gt;The argument to flatten, la, is a list of lists of type 'a. However, there is no way to express this in C# without unrestricted equational constraints as I described earlier. 
Here is the translation to C# from GADTOOP:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public abstract class List&amp;lt;T&amp;gt; {...&lt;br /&gt;  public abstract List&amp;lt;T&amp;gt; Append(List&amp;lt;T&amp;gt; that);&lt;br /&gt;  public abstract List&amp;lt;U&amp;gt; Flatten&amp;lt;U&amp;gt;();&lt;br /&gt;}&lt;br /&gt;public class Nil&amp;lt;A&amp;gt; : List&amp;lt;A&amp;gt; {...&lt;br /&gt;  public override List&amp;lt;U&amp;gt; Flatten&amp;lt;U&amp;gt;()&lt;br /&gt;  { return new Nil&amp;lt;U&amp;gt;(); }&lt;br /&gt;}&lt;br /&gt;public class Cons&amp;lt;A&amp;gt; : List&amp;lt;A&amp;gt; {...&lt;br /&gt;  A head; List&amp;lt;A&amp;gt; tail;&lt;br /&gt;  public override List&amp;lt;U&amp;gt; Flatten&amp;lt;U&amp;gt;()&lt;br /&gt;  { Cons&amp;lt;List&amp;lt;U&amp;gt;&amp;gt; This = (Cons&amp;lt;List&amp;lt;U&amp;gt;&amp;gt;) (object) this;&lt;br /&gt;    return This.head.Append(This.tail.Flatten&amp;lt;U&amp;gt;()); }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Note how Flatten requires an unsafe, ugly cast to and from object to get around C#'s type limitations. However, there is a safe way. Here is the translation:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;static class List&lt;br /&gt;{&lt;br /&gt;  public abstract class t&amp;lt;A&amp;gt; {}&lt;br /&gt;  public sealed class Nil&amp;lt;A&amp;gt; : t&amp;lt;A&amp;gt; {}&lt;br /&gt;  public sealed class Cons&amp;lt;A&amp;gt; : t&amp;lt;A&amp;gt; {&lt;br /&gt;    internal A head; internal t&amp;lt;A&amp;gt; tail;&lt;br /&gt;  }&lt;br /&gt;  public static t&amp;lt;A&amp;gt; Append(t&amp;lt;A&amp;gt; l, A a)&lt;br /&gt;  {  return new Cons&amp;lt;A&amp;gt;(a, l);  }&lt;br /&gt;  public static t&amp;lt;A&amp;gt; Flatten&amp;lt;A&amp;gt;(t&amp;lt;List&amp;lt;A&amp;gt;&amp;gt; l)&lt;br /&gt;  { return l.head is Nil ? 
new Nil&amp;lt;A&amp;gt;() : Append&amp;lt;A&amp;gt;(Flatten&amp;lt;A&amp;gt;(l.tail), l.head); }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Basically, the step-wise translation I propose:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;move the generic class I into a nested generic class O.I&lt;br /&gt;&lt;pre class="brush: csharp"&gt;class I&amp;lt;T&amp;gt; {}&lt;br /&gt;&lt;/pre&gt;becomes&lt;pre class="brush: csharp"&gt;class O {&lt;br /&gt;  class I&amp;lt;T&amp;gt; {}&lt;br /&gt;}&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;move the instance methods of that generic class into generic methods of the enclosing class, I.m -&amp;gt; O.m&lt;br /&gt;&lt;pre class="brush: csharp"&gt;class I&amp;lt;T&amp;gt; {&lt;br /&gt;  I&amp;lt;T&amp;gt; m&amp;lt;U&amp;gt;(U u) { ... }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;becomes&lt;pre class="brush: csharp"&gt;class O {&lt;br /&gt;  class I&amp;lt;T&amp;gt; {}&lt;br /&gt;  I&amp;lt;T&amp;gt; m&amp;lt;T, U&amp;gt;(I&amp;lt;T&amp;gt; i, U u) { ... }&lt;br /&gt;}&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;mark all relevant state of I "internal", so the outer class O can access it.&lt;pre class="brush: csharp"&gt;class I&amp;lt;T&amp;gt; {&lt;br /&gt;  private T value;&lt;br /&gt;  public T get() { return value; }&lt;br /&gt;}&lt;/pre&gt;becomes&lt;pre class="brush: csharp"&gt;class O {&lt;br /&gt;  class I&amp;lt;T&amp;gt; { internal T value; }&lt;br /&gt;  public T get(I&amp;lt;T&amp;gt; i) { return i.value; }&lt;br /&gt;}&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;now you can express arbitrary constraints on the type structure in the methods of O.&lt;pre class="brush: csharp"&gt;class O {&lt;br /&gt;  class I&amp;lt;T&amp;gt; {}&lt;br /&gt;  I&amp;lt;T&amp;gt; m2&amp;lt;T, U&amp;gt;(I&amp;lt;I&amp;lt;T&amp;gt;&amp;gt; i, I&amp;lt;I&amp;lt;U&amp;gt;&amp;gt; u) { ... 
}&lt;br /&gt;  ...&lt;br /&gt;}&lt;/pre&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Because the external methods do not depend on an implicit 'this' argument, we no longer need equational constraint refinements to express the required type structure. Ordinary C# types suffice!&lt;br /&gt;&lt;br /&gt;If you squint a little more closely, you'll also note a much closer symmetry in this version of the code and the OCaml code. In fact, this translation essentially builds ML modules in C#!&lt;br /&gt;&lt;br /&gt;Like all widely-used module systems, it's pretty clear that these modules are "second-class". Only rebinding the name is possible, via C#'s "using" directive, and everything is resolved and fixed at compile-time. By this, I mean that you can import different list implementations and it will compile cleanly as long as the other list implementation defines the same operations with the same type signatures:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using L = List;&lt;br /&gt;...&lt;br /&gt;L.Flatten(l);&lt;br /&gt;...&lt;/pre&gt;Or:&lt;pre class="brush: csharp"&gt;using L = AnotherList;&lt;br /&gt;...&lt;br /&gt;L.Flatten(l);&lt;br /&gt;...&lt;/pre&gt;&lt;br /&gt;The compile-time restriction is primarily due to the declaration as a static class. Making the class non-static permits runtime binding of concrete implementations to signatures, so it's a little more flexible, and just as safe. Loosening this restriction may also make runtime "functors" possible.&lt;br /&gt;&lt;br /&gt;I've used this pattern to complete &lt;a href="http://sourceforge.net/projects/Orc-dotnet"&gt;Orc.NET&lt;/a&gt;, because the 'where' combinator required an inexpressible dependency. You can see my use in &lt;a href="http://orc-dotnet.svn.sourceforge.net/viewvc/orc-dotnet/branches/tagless/Interp.cs?view=markup"&gt;the Orc interpreter&lt;/a&gt;. 
The "Orc" object in Interp.cs is essentially an "expression builder", and I suspect that all such "builder" implementations are really ML modules at their core.&lt;br /&gt;&lt;br /&gt;An open question is the interaction of inheritance with such modules. Seems like inheritance is a particular type of functor from structure S to S.&lt;br /&gt;&lt;br /&gt;In any case, if you need type constraints which are inexpressible in C#, then make them ML modules using the above translation, and add object-orientedness back in incrementally. On a final note, I find it amusing that OO languages must resort to functional techniques to resolve fundamental OO limitations. I'd much prefer if we could just use functional languages instead and forgo all the hassle. ;-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8041367605490434924?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8041367605490434924/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8041367605490434924' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8041367605490434924'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8041367605490434924'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/12/ml-modules-in-c.html' title='ML Modules in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8963364956488185606</id><published>2007-11-12T11:40:00.000-05:00</published><updated>2011-09-26T01:50:10.790-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Generalizing C# Generics</title><content type='html'>In previous posts, I had commented on certain &lt;a href="http://higherlogics.blogspot.com/2007/05/gexl-lives.html"&gt;non-sensical limitations in the C# type system, particularly with regard to equational constraints on generic type parameters&lt;/a&gt;; these unfortunate limitations significantly reduce the expressiveness of well-typed solutions.&lt;br /&gt;&lt;br /&gt;Microsoft Research had actually already tackled the problem in their 2006 paper &lt;a href="http://research.microsoft.com/research/pubs/view.aspx?type=inproceedings&amp;id=1215"&gt;Variance and Generalized Constraints for C# Generics&lt;/a&gt;. Taking inspiration from Scala, they generalize class and method parameter constraints with arbitrary subtyping relations, and they further add use-constraints on generic methods. This increased expressiveness should address the problems I alluded to in my previous posts; if only the changes were integrated into the .NET VM and C#... 
:-)&lt;br /&gt;&lt;br /&gt;[Edit: figures that &lt;a href="http://lambda-the-ultimate.org/node/1573"&gt;LTU already covered this paper&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8963364956488185606?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8963364956488185606/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8963364956488185606' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8963364956488185606'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8963364956488185606'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/11/generalizing-c-generics.html' title='Generalizing C# Generics'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7614874435939962730</id><published>2007-10-24T09:00:00.000-04:00</published><updated>2011-09-26T01:37:37.571-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>On the Importance of Purity</title><content type='html'>The benefits of advanced programming languages are sometimes 
difficult to grasp for everyday programmers. The features of such languages and how they relate to industrial software development can be hard to understand, especially since the arguments are couched in terms such as "referential transparency", "totality", "side-effect-free", "monads", "non-determinism", "strong static typing", "algebraic data types", "higher-order functions", "laziness/call-by-need", and so on.&lt;br /&gt;&lt;br /&gt;Many of these features are attributed to "pure" languages, but purity is also a nebulous concept. I will explain the importance of a number of these features and how they impact the everyday programmer's life.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The Benefits of Referential Transparency&lt;/h2&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Referential_transparency"&gt;Referential transparency (RT)&lt;/a&gt; is a simple principle with profound consequences. Essentially, RT dictates that functions may access only the parameters they're given, and that the only effect a function may have is to return a value.&lt;br /&gt;&lt;br /&gt;This may not sound very significant, but permitting functions which are not RT is a nightmare for software developers. This fact is simply not obvious because most mainstream languages are not referentially transparent.&lt;br /&gt;&lt;br /&gt;Consider a fairly benign scenario where you're in a loop and you call a function, but the function suddenly changes your loop index variable even though you didn't pass it to the function! This generally doesn't happen, but it could happen in languages like C/C++. 
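&lt;br /&gt;&lt;br /&gt;The same kind of interference is easy to reproduce in any language with shared mutable state. As a minimal sketch in C#, assuming a loop counter kept in a static field (the names Counter, Index, Helper and CountIterations are hypothetical, invented for this illustration), a callee can change the counter without ever receiving it as a parameter:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;&lt;br /&gt;&lt;br /&gt;static class Counter&lt;br /&gt;{&lt;br /&gt;  // Shared mutable state: any method can change this behind the caller's back.&lt;br /&gt;  public static int Index;&lt;br /&gt;&lt;br /&gt;  // Not referentially transparent: reads and writes state it was never passed.&lt;br /&gt;  public static void Helper()&lt;br /&gt;  {&lt;br /&gt;    if (Index == 1) Index = 3; // silently skips ahead&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  // Counts how many times the loop body actually runs.&lt;br /&gt;  public static int CountIterations()&lt;br /&gt;  {&lt;br /&gt;    int iterations = 0;&lt;br /&gt;    for (Index = 0; Index &amp;lt; 5; Index++)&lt;br /&gt;    {&lt;br /&gt;      Helper();&lt;br /&gt;      iterations++;&lt;br /&gt;    }&lt;br /&gt;    return iterations;&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;static class Program&lt;br /&gt;{&lt;br /&gt;  static void Main()&lt;br /&gt;  {&lt;br /&gt;    Console.WriteLine(Counter.CountIterations());&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;The loop is written to run five times, but this sketch prints 3, because Helper moves Index from 1 to 3 behind the loop's back.&lt;br /&gt;&lt;br /&gt;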
The only reason this isn't prevalent is that functions are RT regarding local variables allocated on the stack, and most languages other than C/C++ enforce this property.&lt;br /&gt;&lt;br /&gt;I'm sure it's not difficult to imagine the horrors if any function you call could modify any of your local variables: you could no longer rely on the variables holding correct values at any time. How could you rely on anything? Global variables are a good example of this mess. It's common wisdom that one should avoid globals. Why? Because they're not RT, and since their values could change at any time, they're unreliable.&lt;br /&gt;&lt;br /&gt;If a client calls you with a problem in your program written in an RT language, you immediately know that the problem lies in the module B that provides the affected feature, and perhaps even in the particular function B.F performing the given operation. By contrast, in non-RT languages a completely different module C could be interfering with module B by changing its state behind the scenes. Debugging is thus much easier with RT.&lt;br /&gt;&lt;br /&gt;Consider a more malicious scenario, where a function does some useful computation, but also deletes all of your files behind your back. You download this library because it seems useful, only to lose your entire machine. Even worse, it may only delete files when deployed to a server. Or perhaps it installs some adware. This situation is only possible because the function is not referentially transparent. 
If it were, the only way it could have deleted your files is if you had given it a handle to a directory.&lt;br /&gt;&lt;br /&gt;Those of you well-versed in the security field might recognize this constraint: it is the same authority propagation constraint underlying &lt;a href="http://www.erights.org/elib/capability/ode/ode-capabilities.html"&gt;capability security&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;However, the properties of full referential transparency are stronger than those of capability security: where the latter permits non-determinism as long as the facility providing it is accessed via a capability, full referential transparency requires "pure functions", which are fully deterministic.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The Benefits of Determinism&lt;/h2&gt;&lt;br /&gt;So why is determinism important? Because everything in your program should be &lt;em&gt;reproducible&lt;/em&gt;, and thus testable. No arguments with that statement, I'm sure.&lt;br /&gt;&lt;br /&gt;Consider a function that returns a completely different value every time it is called with the same parameters. How could you ever produce a useful program if every function were like that? How could you even test it?&lt;br /&gt;&lt;br /&gt;Most developers have experienced the problems of non-determinism when a client calls in with a problem, but the problem is not reproducible even by retracing the exact same steps. If the program were deterministic, then retracing those exact steps would &lt;em&gt;always&lt;/em&gt; produce the error, no exception. I can't overstate how essential reproducibility is for testing and quality assurance purposes.&lt;br /&gt;&lt;br /&gt;However, non-deterministic functions do exist. We've all used such functions at one time or another. Consider a function that returns the value of the clock, or consider the random() function. If we want to use an RT, deterministic language, we &lt;em&gt;need&lt;/em&gt; some way to use non-deterministic functions. 
How do we reconcile these two conflicting ends? There are a number of different ways, but all of them involve &lt;em&gt;controlling and isolating&lt;/em&gt; non-determinism in some way.&lt;br /&gt;&lt;br /&gt;For instance, most capability languages permit non-determinism, but the source of non-determinism can only be accessed via a capability which must be explicitly granted. Thus, you know that only the modules that were granted access to a source of non-determinism can behave non-deterministically and every other module in the program is completely deterministic. What a relief when debugging!&lt;br /&gt;&lt;br /&gt;Essentially, this means that any source of non-determinism cannot be globally accessible, or &lt;em&gt;ambient&lt;/em&gt;. So for you C/C++ programmers, rand() and getTime() are not global functions, they are function pointers that must be passed explicitly to the modules that need them. Only main() has access to all sources of non-determinism, and main() will pass on the appropriate pointers to the authorized modules.&lt;br /&gt;&lt;br /&gt;Purely functional languages like Haskell take a similar approach to capability languages. RT is absolutely essential to purely functional languages, and any source of non-determinism violates RT. Such languages have a construct which was mathematically proven to isolate and control non-determinism: monads. It's not important what a monad is, just consider it as a design pattern for purely functional languages that preserves RT in the presence of non-determinism.&lt;br /&gt;&lt;br /&gt;I would say that determinism is strictly more important than RT, but that RT is currently the best known method of achieving determinism. Now if your client calls you with an irreproducible problem, at least you've narrowed the field to only the modules that use sources of non-determinism.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The Benefits of Totality&lt;/h2&gt;&lt;br /&gt;What is a total function? 
A function is total if it is defined for &lt;em&gt;all possible&lt;/em&gt; values of its inputs, no exception. Integer addition is a total function: regardless of the values of the two input integers, adding them produces a valid result.&lt;br /&gt;&lt;br /&gt;By contrast, many functions are &lt;a href="http://en.wikipedia.org/wiki/Total_function"&gt;partial functions&lt;/a&gt;. Here, "partial" means that not all values passed to the function are valid. For instance, integer division is defined for all integers &lt;em&gt;except a divisor of zero&lt;/em&gt;. Dividing by zero is considered an error and its behaviour is undefined. Division is thus a partial function.&lt;br /&gt;&lt;br /&gt;Using total functions always results in defined behaviour. Using partial functions sometimes results in &lt;em&gt;undefined behaviour&lt;/em&gt;, if they are applied to invalid values. Thus, the more total functions you use, the more likely your program will run without generating errors.&lt;br /&gt;&lt;br /&gt;If a language &lt;em&gt;forces&lt;/em&gt; all of your functions to be total, then your program will have &lt;em&gt;no undefined behaviour at all&lt;/em&gt;. You are forced to consider and handle all possible error conditions and edge cases, in addition to the program's expected normal operating mode. &lt;em&gt;No undefined or unknown behaviour is possible&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;As you can imagine, totality inevitably produces more reliable software. In C you are free to ignore errors and segfault your program, but with total functions you can't ignore those errors.&lt;br /&gt;&lt;br /&gt;Unfortunately, totality can be a serious burden on the developer, which is why partial programming is more prevalent. Exceptions were invented to help deal with invalid inputs to partial functions: don't segfault, throw an exception! They do permit more modular code to be written, since errors can propagate and be handled at higher levels. 
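&lt;br /&gt;&lt;br /&gt;As a minimal C# sketch of that propagation (Divide and Average are hypothetical names used only for illustration), the partial function below throws on a zero divisor, the intermediate caller does no error handling of its own, and the failure is dealt with at the top level:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;using System;&lt;br /&gt;&lt;br /&gt;static class Program&lt;br /&gt;{&lt;br /&gt;  // Partial function: integer division is not defined for d == 0,&lt;br /&gt;  // so the runtime throws DivideByZeroException in that case.&lt;br /&gt;  static int Divide(int n, int d) { return n / d; }&lt;br /&gt;&lt;br /&gt;  // Intermediate layer: no error handling here, the exception passes through.&lt;br /&gt;  static int Average(int total, int count) { return Divide(total, count); }&lt;br /&gt;&lt;br /&gt;  static void Main()&lt;br /&gt;  {&lt;br /&gt;    try&lt;br /&gt;    {&lt;br /&gt;      Console.WriteLine(Average(10, 0));&lt;br /&gt;    }&lt;br /&gt;    catch (DivideByZeroException)&lt;br /&gt;    {&lt;br /&gt;      // The error propagated up through Average and is handled here.&lt;br /&gt;      Console.WriteLine("cannot average zero items");&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;C# integer division by zero raises System.DivideByZeroException, so this sketch takes the catch branch instead of crashing.&lt;br /&gt;&lt;br /&gt;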
Unfortunately, unchecked exceptions, where the exceptions a function can throw are not defined in the function's signature, just bring us back to square one: uncaught exceptions  result in undefined behaviour, but the language doesn't help us by telling us what exceptions we need to catch.&lt;br /&gt;&lt;br /&gt;Correct software inevitably requires totality of some sort. How do we transform a partial function into a total function? Here's how we can make division total:&lt;br /&gt;&lt;pre&gt;fun divide(n,d) : int * int -&gt; int&lt;/pre&gt;&lt;br /&gt;This snippet is the signature for the division function. It takes two integers, n and d, and returns an integer. No mention is made of the error condition in the signature when d=0. To make divide total, we transform it to the following:&lt;br /&gt;&lt;pre&gt;data Result = Defined(x) | Undefined&lt;br /&gt;fun divide(n,d) : int * int -&gt; Result = &lt;br /&gt; if d == 0 then&lt;br /&gt;   return Undefined&lt;br /&gt; else&lt;br /&gt;   return Defined( n/d )&lt;/pre&gt;&lt;br /&gt;So now, any code that calls divide must deconstruct the return value of divide into either a Defined result x, or an Undefined result indicating an invalid input:&lt;br /&gt;&lt;pre&gt;fun half(x) =&lt;br /&gt;  match divide(x,2) with&lt;br /&gt;    Undefined -&gt; print "divide by zero!"&lt;br /&gt;  | Defined(y) -&gt; print "x/2=" + y&lt;/pre&gt;&lt;br /&gt;As you can see, a total divide function &lt;em&gt;forces&lt;/em&gt; you to handle all possible cases, and so any program using it will never have undefined behaviour when dividing by zero. It will have whatever behaviour you specify on the Undefined branch. No efficiency is lost as any decent compiler will inline the above code, so the cost is just an integer compare against 0.&lt;br /&gt;&lt;br /&gt;A similar strategy can be used for errors that come from outside the program as well. Consider the file open() function. 
It's a partial function:&lt;br /&gt;&lt;pre&gt;fun open(name) : string -&gt; FILE&lt;/pre&gt;&lt;br /&gt;It can be transformed into a total function as follows:&lt;br /&gt;&lt;pre&gt;data FileOpenResult = File f | PermissionDenied | FileNotFound | ...&lt;br /&gt;fun open(name) : string -&gt; FileOpenResult&lt;/pre&gt;&lt;br /&gt;If all partial functions are made total using the above technique, then all of your software will have fully defined behaviour. No more surprises! To a certain extent, totality even mitigates some problems with non-determinism: after all, who cares if the output is not reproducible, since you are forced to handle every possible case anyway?&lt;br /&gt;&lt;br /&gt;At first glance, totality is not appealing to the lazy programmer. But if you've ever had to develop and maintain complex software, you'll soon appreciate your language forcing you to deal with all possible cases at development time, instead of dealing with irate customers after deployment.&lt;br /&gt;&lt;br /&gt;For all the lazy programmers out there, consider this: totality almost eliminates the need for testing. As a lazy programmer myself, that's a selling point I can support. ;-)&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;The Benefits of Strong Static Typing&lt;/h2&gt;&lt;br /&gt;What is strong static typing? Static typing is an analysis a compiler performs to ensure that all the uses of data and functions in a program are consistent. In other words, you're not saying X in one place, and then saying Y which contradicts X in another place.&lt;br /&gt;&lt;br /&gt;I take &lt;em&gt;strong&lt;/em&gt; static typing to mean a static type analysis that the developer &lt;em&gt;cannot circumvent&lt;/em&gt;. C/C++ are statically typed, but their static type systems are not strong because you can freely cast between types, even when doing so will generate an error. 
C#, Java, and similar languages are statically typed, but they are on the borderline of strong typing: there is no way to defeat the type system, but the static analysis is weak since you can still cast. Casting just introduces dynamic runtime checks and runtime errors.&lt;br /&gt;&lt;br /&gt;There are many arguments for and against strong static typing, but in combination with the previously discussed features, static typing enables developers to write software that is "almost correct by construction". By this I mean that if your program compiles, it will &lt;em&gt;run without errors&lt;/em&gt;, and it has a high probability of actually being &lt;em&gt;correct&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;The reason for this is some computer science black magic called &lt;a href="http://en.wikipedia.org/wiki/Curry-Howard_isomorphism"&gt;the Curry-Howard isomorphism&lt;/a&gt;. What that is exactly isn't important. Just consider a static type system to be a built-in logic machine: when you declare and use types, this logic machine makes sure that everything you're telling it is logically consistent. If you give it a contradiction, it will produce a type error. Thus, if your whole program type checks, it is &lt;em&gt;logically consistent&lt;/em&gt;: it contains &lt;em&gt;no contradictions or logical errors&lt;/em&gt; in the statements the logic machine understands, and the logic machine constructed a &lt;em&gt;proof&lt;/em&gt; of this fact.&lt;br /&gt;&lt;br /&gt;The power of the static type system dictates how intelligent the logic machine is: the more powerful the type system, the more of your program the machine can test for consistency, and the closer your program is to being correct. Your program essentially becomes a logical specification of the problem you are solving. 
You can still make errors in the specification, but those errors must not contradict anything else in the specification for it to type check.&lt;br /&gt;&lt;br /&gt;The downside of strong static typing is that the analysis is necessarily &lt;em&gt;conservative&lt;/em&gt;, meaning some legitimate programs will produce a type error even though they would not generate an error at runtime.&lt;br /&gt;&lt;br /&gt;Such is the price to pay for the additional safety gained from static typing. As a lazy programmer I'm willing to pay that price.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;br /&gt;These are some of the less common idioms found in advanced programming languages. Other idioms are either more well known and have already been adopted into other languages (algebraic datatypes, higher-order functions), or are simply less relevant for constructing reliable software (laziness).&lt;br /&gt;&lt;br /&gt;Absolute purity may ultimately prove to be unachievable, but its pursuit has given us a number of powerful tools which significantly aid in the development and maintenance of reliable, secure software.&lt;br /&gt;&lt;br /&gt;[Edit: clarified some statements based on &lt;a href="http://lambda-the-ultimate.org/node/2510"&gt;some feedback from LTU&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7614874435939962730?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7614874435939962730/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7614874435939962730' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7614874435939962730'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7614874435939962730'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/10/on-importance-of-purity.html' title='On the Importance of Purity'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-932724835182463661</id><published>2007-10-08T15:29:00.000-04:00</published><updated>2007-10-08T15:51:27.115-04:00</updated><title type='text'>Reading PDFs on the iPhone the moderately difficult way</title><content type='html'>Taking a break from my programming language blogging, I thought I'd describe a recent adventure with my new iPhone: reading PDFs. The iPhone &lt;strong&gt;can&lt;/strong&gt; read PDFs, but insists on doing so only when reading the PDF from the network in some way, ie. via the mail client or Safari.&lt;br /&gt;&lt;br /&gt;Being the programming language enthusiast I am, I have plenty of papers stored locally on my machine which I'd like to read on the go, and network access is less than desirable for obvious reasons. Unfortunately, Apple disabled the obvious answer to local browsing: using the "file://" URI scheme. Very stupid of them IMO.&lt;br /&gt;&lt;br /&gt;Fortunately, I'm an "unscrupulous" person, because I jailbroke my iPhone so I could install third-party apps. So if network access is required to view PDFs in Safari, then I'll just have to access local files over the network! 
The way that's been done for over 20 years is available on the iPhone: a web server.&lt;br /&gt;&lt;br /&gt;Both Apache and lighttpd are available in Installer.app, and I chose the latter; I just find the configuration less obtuse than Apache's. You will need OpenSSH installed as well. I also recommend UICtl so you can load/unload the web server only when you need it.&lt;br /&gt;&lt;br /&gt;Once everything was installed, I ssh'd to the iPhone, opened the lighttpd config file at /usr/local/etc/lighttpd.conf, changed the root directory to point to /var/root/PDFs (or place it wherever you like), and added the following config line to enable directory browsing:&lt;br /&gt;&lt;code&gt;dir-listing.activate = "enable"&lt;/code&gt;&lt;br /&gt;Then, using UICtl, I unloaded and reloaded lighttpd.&lt;br /&gt;&lt;br /&gt;Finally, I opened up Safari on the iPhone and bookmarked http://127.0.0.1/&lt;br /&gt;&lt;br /&gt;Voila! I have access to all my local PDFs via Safari. :-)&lt;br /&gt;&lt;br /&gt;Browsing and viewing PDFs is very easy, unlike other schemes using the "data:" URI scheme. Of course, the setup is moderately difficult for anybody who isn't versed in the basics of Unix and the command shell.&lt;br /&gt;&lt;br /&gt;Naturally, I'd much prefer a native app, or at the very least, local browsing via the file:// URI scheme in Safari. I'm keeping my fingers crossed. 
:-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-932724835182463661?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/932724835182463661/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=932724835182463661' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/932724835182463661'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/932724835182463661'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/10/reading-pdfs-on-iphone-moderately.html' title='Reading PDFs on the iPhone the moderately difficult way'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-7921469817153773613</id><published>2007-09-28T21:35:00.000-04:00</published><updated>2011-09-26T01:44:23.080-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern matching'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Visitor Pattern Deprecated: First-Class Messages Are All You Need!</title><content type='html'>In previous posts [&lt;a href="http://higherlogics.blogspot.com/2007/04/visitor-pattern-considered-harmful.html"&gt;1&lt;/a&gt;,&lt;a 
href="http://higherlogics.blogspot.com/2007/04/general-pattern-matching-with-visitors.html"&gt;2&lt;/a&gt;], I argued that the visitor pattern was a verbose, object-oriented equivalent of a pattern matching function. The verbosity stems from the need to add the dispatching code to each data class in the hierarchy, and because the encapsulation inherent to objects is awkward when dealing with pure data.&lt;br /&gt;&lt;br /&gt;A single addition to an OOP language could completely do away with the need for the dispatching code and make OO pattern matching simple and concise: first-class messages (FCM). By this I mean, messages sent to an object are themselves 'objects' that can be passed around as parameters.&lt;br /&gt;&lt;br /&gt;To recap, functional languages reify the &lt;strong&gt;structure of data&lt;/strong&gt; (algebraic data types), and they abstract operations (functions). OOP languages reify &lt;strong&gt;operations&lt;/strong&gt; (interfaces), but they abstract the structure of data (encapsulation). 
They are duals of one another.&lt;br /&gt;&lt;br /&gt;All the Exp data classes created in [&lt;a href="http://higherlogics.blogspot.com/2007/04/visitor-pattern-considered-harmful.html"&gt;1&lt;/a&gt;,&lt;a href="http://higherlogics.blogspot.com/2007/04/general-pattern-matching-with-visitors.html"&gt;2&lt;/a&gt;] shouldn't be classes at all, they should be messages that are sent to the IVisitor object implementing the pattern matching:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;&lt;br /&gt;data Exp = App(e,e) | Int(i) | ...&lt;br /&gt;&lt;br /&gt;IVisitor iv = ...&lt;br /&gt;&lt;br /&gt;//the expression: 1 + 2&lt;br /&gt;Exp e = App(Plus(Int(1), Int(2)));&lt;br /&gt;&lt;br /&gt;//'e' is now the "App" operation placed in a variable&lt;br /&gt;iv.e  //send "App" to visitor&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;FCM removes the inefficient double-dispatch inherent to the visitor pattern, while retaining object encapsulation for when you really need it; the best of both worlds.&lt;br /&gt;&lt;br /&gt;Note also how the above message declaration looks a great deal like a variant in OCaml? This is because &lt;a href="http://www.cs.jhu.edu/~pari/papers/fool2004/first-class_FOOL2004.pdf"&gt;first-class messages are variants, and pattern-matching functions are objects&lt;/a&gt;. I'm actually implementing the DV language in that paper, first as an interpreter, then hopefully as a compiler for .NET. To be actually useful as a real language, DV will require some extensions though, so stay tuned. 
:-)&lt;br /&gt;&lt;br /&gt;[Edit: made some clarifications to avoid confusion]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-7921469817153773613?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/7921469817153773613/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=7921469817153773613' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7921469817153773613'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/7921469817153773613'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/09/visitor-pattern-deprecated-first-class.html' title='Visitor Pattern Deprecated: First-Class Messages Are All You Need!'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-6911216472639367489</id><published>2007-06-15T20:07:00.000-04:00</published><updated>2011-09-26T01:51:17.390-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='object oriented programming'/><title type='text'>Objects are Existential Packages</title><content 
type='html'>There's a long-running debate on the power of functional languages over object-oriented languages. In truth, now that C# has full generics, aka parametric polymorphism, it's almost equivalent in typing power to most typical statically typed functional languages. In fact, in terms of typing power, generic objects are universal and existential types, and can be used for all the fancy static typing trickery that those entail (see my previous reference to the &lt;a href="http://research.microsoft.com/~akenn/generics/gadtoop.pdf"&gt;GADTs in C#&lt;/a&gt; to see what I mean).&lt;br /&gt;&lt;br /&gt;As &lt;a href="http://higherlogics.blogspot.com/2007/05/gexl-lives.html"&gt;a prior post&lt;/a&gt; explained, C#'s type system still lacks some of the flexible constraint refinement available in more powerful functional type systems, but in general C# is powerful enough to encode most interesting functional abstractions.&lt;br /&gt;&lt;br /&gt;And I started a new project to demonstrate this: &lt;a href="http://sourceforge.net/projects/fpsharp"&gt;FP#&lt;/a&gt;. It provides a number of widely used functional abstractions, like the option type, a lazy type, lists, lazy lists, etc. and map, filter, and fold over all the collection types, including the standard .NET collections API. Each of these abstractions will be available as a separate DLL, so instead of linking to a large library, you can just pick those abstractions you're interested in using.&lt;br /&gt;&lt;br /&gt;Besides more flexible typing as in GADTs, expressiveness is the only advantage functional languages still have over C#. 
Compare the verbosity of the C# option type:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;//T is the type variable&lt;br /&gt;public abstract class Option&amp;lt;T&amp;gt; { }&lt;br /&gt;public sealed class None&amp;lt;T&amp;gt; : Option&amp;lt;T&amp;gt; { }&lt;br /&gt;public sealed class Some&amp;lt;T&amp;gt; : Option&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  T value;&lt;br /&gt;  public Some(T v) { value = v; }&lt;br /&gt;&lt;br /&gt;  public T Value&lt;br /&gt;  {&lt;br /&gt;    get { return value; }&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;with the equivalent O'Caml definition:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;type 'a option = None | Some of 'a    (* 'a is the type variable *)&lt;/pre&gt;This contrast highlights &lt;a href="http://higherlogics.blogspot.com/2007/05/expressiveness-whats-all-fuss.html"&gt;my previous argument in favour of expressiveness&lt;/a&gt;; just think of it as 1 line of O'Caml generating 12 lines of C#, since the efficiency of both definitions is equivalent.&lt;br /&gt;&lt;br /&gt;To demonstrate the power of existential packages and universal types in C#, I'll be including a number of statically typed abstractions that have only been found in O'Caml and Haskell to date; types like statically sized lists and arrays, number-parameterized types, and other type wizardry resulting in strong partial correctness properties (see: &lt;a href="http://lambda-the-ultimate.org/node/1635"&gt;Lightweight Static Capabilities&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;It will be particularly interesting to compare the efficiency of a sized type, like a list, to its unsized counterpart, because .NET does not erase types like Java and O'Caml do.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-6911216472639367489?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://higherlogics.blogspot.com/feeds/6911216472639367489/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=6911216472639367489' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6911216472639367489'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/6911216472639367489'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/06/objects-are-existential-packages.html' title='Objects are Existential Packages'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-8607338501423739821</id><published>2007-05-28T12:16:00.000-04:00</published><updated>2011-09-26T02:07:55.517-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='Ruby'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><title type='text'>Expressiveness: what's all the fuss?</title><content type='html'>I just &lt;a href="http://www.hanselman.com/blog/ProgrammerIntentOrWhatYoureNotGettingAboutRubyAndWhyItsTheTits.aspx"&gt;read a blog post&lt;/a&gt; recommending that developers broaden their language horizons, with a particular emphasis on Ruby. 
The author attempts to explain why expressiveness is an important metric for a language:&lt;br /&gt;&lt;blockquote&gt;&lt;blockquote&gt;What is this obssesion[sic] with "expressiveness"? Go write poertry [sic] if you want to be expressvive.[sic] &lt;/blockquote&gt;Remember that ultimately our jobs are (usually) to solve some kind of business problem. We're aiming for a finish line, a goal. The programmer's job is translate the language of the business person to the language of the computer.&lt;br /&gt;&lt;br /&gt;The whole point of compilers, interpreters, layers of abstraction and what-not are to shorten the semantic distance between our intent and the way the computer thinks of things.&lt;/blockquote&gt;&lt;br /&gt;To be honest, this is not very convincing; the moment you mention "semantics" is the moment many developers will close your blog and go do something "productive".&lt;br /&gt;&lt;br /&gt;The argument for expressiveness is ultimately quite simple: the more expressive the language, the more you can do in fewer lines of code. This means that 3 lines of Ruby code might take 12 lines in C#, and 15 lines of C# could be compressed to 2 lines of Ruby &lt;span style="font-weight: bold;"&gt;while retaining readability and maintainability&lt;/span&gt; [1].&lt;br /&gt;&lt;br /&gt;Consider viewing Ruby as a C# code generator, where 2 lines of Ruby code can &lt;span style="font-weight:bold;"&gt;generate&lt;/span&gt; 12 lines of C#. That actually sounds like a pretty good idea, doesn't it? You would never write a generator to go the other way around though, would you?&lt;br /&gt;&lt;br /&gt;More expressive languages also tend to be simpler and more coherent. There are all sorts of little ad-hoc rules in Java and C# that you would not find in most functional languages.&lt;br /&gt;&lt;br /&gt;You can readily see the differences in expressiveness at the &lt;a href="http://shootout.alioth.debian.org/"&gt;Alioth language shootout&lt;/a&gt;. 
Set the metrics to CPU:0, memory:0, GZIP:1, which means we only care about the GZIP'd size of the source code. You'll see that functional languages tend to come out on top. Ironically, Ruby is first, lending weight to the above blog post on Ruby's expressiveness.&lt;br /&gt;&lt;br /&gt;Expressiveness is the whole driving force behind DSLs: a DSL is more expressive in solving the problems it was designed for. For instance, a relational database DSL specific to managing blogs could generate perhaps 100 lines of SQL per single line of DSL code. It would take you far more code to emulate that 1 DSL line in C#.&lt;br /&gt;&lt;br /&gt;So the expressiveness argument is quite compelling if you simply view it as a code generation metric: the more expressive the language, the more code it can generate for you for the same number of characters typed. That means you do your job faster, and more importantly, with fewer errors.&lt;br /&gt;&lt;br /&gt;Why fewer errors? Studies have shown that developers generally write about 20-30 correct lines of code per day. That means 20 lines of correct C# code or 20 lines of correct Ruby code. It just so happens that those 20 lines of correct Ruby can generate 100 lines of correct C#, which means you're now 5x more productive than the C#-only developer next door. 
Do you see the advantages now?&lt;br /&gt;&lt;br /&gt;[1] These numbers are completely bogus and are used purely for illustrative purposes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-8607338501423739821?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/8607338501423739821/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=8607338501423739821' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8607338501423739821'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/8607338501423739821'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/05/expressiveness-whats-all-fuss.html' title='Expressiveness: what&apos;s all the fuss?'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-1454336250830110853</id><published>2007-05-09T21:56:00.000-04:00</published><updated>2011-09-26T01:40:43.964-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern matching'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='expression problem'/><title type='text'>GEXL Lives!! 
Solving the Expression Problem in C#</title><content type='html'>&lt;p&gt;&lt;a href="http://sourceforge.net/projects/gexl"&gt;GEXL&lt;/a&gt; is a general expression library providing core primitives for building and processing expression trees. I had abandoned it a while ago because I quickly ran into a serious, well-known issue: &lt;a href="http://www.daimi.au.dk/%7Emadst/tool/papers/expression.txt"&gt;The Expression Problem&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Basically, the expression problem boils down to the fact that typical programming solutions are extensible in only a single direction: either the data types or the operations defined on them, but not both, can be extended without modifying existing code. It takes some sophisticated type machinery to safely solve the expression problem, and various languages, including &lt;a href="http://lambda-the-ultimate.org/node/1518#comment-17566"&gt;OCaml&lt;/a&gt;, &lt;a href="http://lambda-the-ultimate.org/node/1453"&gt;Haskell&lt;/a&gt;, &lt;a href="http://lambda-the-ultimate.org/node/2231"&gt;Beta/gBeta&lt;/a&gt;, and &lt;a href="http://lambda-the-ultimate.org/node/1551#comment-18495"&gt;Scala&lt;/a&gt;, have acceptable solutions. Languages supporting multimethods support &lt;a href="http://lambda-the-ultimate.org/node/2232#comment-31083"&gt;less type safe extension&lt;/a&gt;, so I exclude them here (although &lt;a href="http://nice.sourceforge.net/"&gt;Nice&lt;/a&gt; might do it properly).&lt;/p&gt;&lt;p&gt;So GEXL, which aims to provide both data and their operations as data and visitors respectively, runs smack into the expression problem. Fortunately, Mads Torgersen's paper, &lt;a href="http://lambda-the-ultimate.org/node/2232"&gt;The Expression Problem Revisited&lt;/a&gt;, provides 4 new solutions by exploiting the generics of C# and Java. 
The first 3 solutions are still biased towards data-centric or operation-centric extensions, but the last solution provides full extensibility in both directions in ordinary C# with generics. GEXL is saved! There's just one catch: it's not type safe since it requires 2 casts. Fortunately, these casts are encapsulated in the core data and visitor classes, which are parameterized by the types they must check and cast to. The developer extending the core will also need to be a bit careful to make sure he's parameterizing the visitor and expression types at the same level as the extension he's writing.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;These core classes are &lt;a href="http://gexl.svn.sourceforge.net/viewvc/gexl/trunk/"&gt;now implemented in GEXL&lt;/a&gt;, together with two interpreters and pretty printers written for the untyped lambda calculus, one implemented as a &lt;a href="http://gexl.svn.sourceforge.net/viewvc/gexl/trunk/LambdaCalculus/LambdaCalculus/DataCentric.cs?view=log"&gt;data-centric extension&lt;/a&gt; of the core library, and the other as an &lt;a href="http://gexl.svn.sourceforge.net/viewvc/gexl/trunk/LambdaCalculus/LambdaCalculus/OpCentric.cs?view=log"&gt;operation-centric extension&lt;/a&gt;; the two total about 300 lines of excessively verbose C#, including comments. 
The paper provides a mixed example of data and operation extensions if you're interested in extending in both directions.&lt;/p&gt;&lt;p&gt;There is a disadvantage of this solution however: visitors can &lt;a href="http://gexl.svn.sourceforge.net/viewvc/gexl/trunk/IVisitor.cs?view=markup"&gt;no longer return the results of their computation&lt;/a&gt;, due to a type dependency between the &lt;a href="http://gexl.svn.sourceforge.net/viewvc/gexl/trunk/Data.cs?view=markup"&gt;Data&lt;/a&gt; and &lt;a href="http://gexl.svn.sourceforge.net/viewvc/gexl/trunk/Op.cs?view=markup"&gt;Op&lt;/a&gt;; if we were to parameterize IVisitor with a return type, IVisitor&amp;lt;R&amp;gt;, &lt;a href="http://higherlogics.blogspot.com/2007/04/general-pattern-matching-with-visitors.html"&gt;as I described earlier&lt;/a&gt;, Op must also be parameterized by R, which forces Data to &lt;span style="font-weight: bold;"&gt;also&lt;/span&gt; be parameterized by R since Data's type is constrained by IVisitor&amp;lt;R&amp;gt;.&lt;/p&gt;&lt;p&gt;This means that the entire expression tree will be parameterized by the return type of the operation that we wish to perform, R, which means it can only be traversed by operations that return R! No more typechecking returning bools and tree transforms returning new trees on the same expression tree. Type safety has backed us into a corner; this is in fact a GADT problem, and &lt;a href="http://higherlogics.blogspot.com/2007/04/general-pattern-matching-with-visitors.html"&gt;as covered previously&lt;/a&gt;, C# generics fall just short of GADTs. 
If C# were to be &lt;a href="http://lambda-the-ultimate.org/node/1134"&gt;augmented by full equational constraints&lt;/a&gt;, instead of the limited form they have now, this problem would be solvable, since we could simply parameterize the methods themselves with the appropriate constraints.&lt;/p&gt;&lt;p&gt;In fact, full equational constraints aren't needed; there are only two limitations of C#'s type system impeding this solution:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Limitations of type constraints to derivations: a constraint placed on a generic parameter must be a non-sealed class or an interface, so a constraint limiting a type parameter to a string is not possible. This should be relaxed so that the type constraint is a full subtype relation. The current design is simply a consequence of #2.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Lack of type constraint refinements: a constraint cannot be refined to a more specific type when subtyping. For instance, the declaration "class Foo&amp;lt;T&amp;gt; where T : S", cannot be refined to "class Bar&amp;lt;T&amp;gt; : Foo&amp;lt;T&amp;gt; where T : U" where U is a subtype of S.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;As it stands, the least offensive solution is for each visitor to retain its computed value as a private field.&lt;/p&gt;&lt;p&gt;So now that GEXL lives, perhaps we'll also see the eventual revival of &lt;a href="http://sourceforge.net/projects/orc-dotnet"&gt;Orc.NET&lt;/a&gt;. In the meantime, I'll try porting the interpreter for my language to GEXL and see how painful it turns out to be. :-)&lt;br /&gt;&lt;/p&gt;&lt;p&gt;P.S. 
There's a further &lt;a href="http://lambda-the-ultimate.org/node/2232#comment-31278"&gt;interesting post&lt;/a&gt; in the above LTU thread: it provides a solution to the expression problem using just sum and recursive types; it requires significant boilerplate, but the extensions are still possible, which is impressive.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-1454336250830110853?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/1454336250830110853/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=1454336250830110853' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1454336250830110853'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/1454336250830110853'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/05/gexl-lives.html' title='GEXL Lives!! 
Solving the Expression Problem in C#'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-2366242379041836776</id><published>2007-05-03T22:55:00.000-04:00</published><updated>2011-09-26T01:52:31.585-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtual machines'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='concurrency'/><category scheme='http://www.blogger.com/atom/ns#' term='CLR'/><title type='text'>What's wrong with .NET</title><content type='html'>&lt;p&gt;I work with .NET every day, and it's a decent virtual machine from the developer's perspective, and a slight overall improvement over Java in a number of respects.&lt;/p&gt;&lt;p&gt;But for a language implementor, .NET continues a tradition of mistakes that began with Java, which inhibit it from achieving its much touted goal as the Common Language Runtime (CLR). These issues are divided into two categories: blatant mistakes, and nice to haves. The former require no cutting edge features or research to implement, and the CLR should have had them from the get-go. The latter might have required some research to get right (like they did with generics), but are ultimately required for a truly secure and flexible virtual machine.&lt;/p&gt;&lt;h3&gt;Blatant Mistakes&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Insufficiently powerful primitives for execution and control flow.&lt;/span&gt; The most glaring of these omissions are first class function pointers and verifiable indirect calls. 
On .NET a function pointer is reified as an IntPtr, which is an opaque struct representing a native pointer reified as a native integer. calli is the VM instruction that indirectly invokes a function via such an IntPtr. Because an IntPtr is opaque, the VM can't really know whether it's pointing to a function, or a data blob held outside the VM heap, and calli is consequently an unverifiable instruction. calli is thus available only under full trust since the VM will crash if it calli's into a data blob. Many truly interesting languages would require calli for efficient execution, and the full trust requirement means that partial trust environments, such as cheap web hosts, won't be able to run assemblies generated from these languages. Microsoft consulted with a number of language researchers in the design for .NET and they responded enthusiastically with a "verifiable calli" proposal for just the above reasons. Unfortunately, MS ignored them.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Stack walking was a huge mistake inherited from Java.&lt;/span&gt; Stack walking prevents a number of useful optimizations, such as efficient tail calls, and alternative security architectures and execution models from being implemented. As it is, we're stuck with the horrible Code Access Security (CAS), when a far simpler model was already built into the VM. MS's new &lt;a href="http://silverlight.net/"&gt;Silverlight&lt;/a&gt; supposedly introduces a new security framework based solely on named "trusted" assemblies, so perhaps we can expect this to be partly fixed. 
Efficient tail calls in particular are essential for functional languages.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Continuations.&lt;/span&gt; I was never sold on continuations until I started considering scheduling and scalability issues, and subsequently read a number of papers on concurrency; in summary, continuations are essential for scalable, schedulable execution subject to configurable user-level policies. Actually, continuations are not strictly necessary, as an expensive global CPS-transform of a program coupled with efficient tail calls can achieve the same effect; however, CPS has a number of performance side-effects that may not be acceptable for certain languages. Thus, continuations or something equally expressive, like delimited continuations, are essential for a true CLR. Coupled with efficient tail calls, we can easily achieve highly scalable architectures with low memory use, as &lt;a href="http://www.eecs.harvard.edu/%7Emdw/proj/seda/"&gt;SEDA&lt;/a&gt; and &lt;a href="http://erlang.org/"&gt;Erlang&lt;/a&gt; have shown. Many people believe that threads subsume continuations, and some believe the contrary, but in fact, both abstractions are necessary; a thread is a "virtual processor", and a continuation is the processor context. If .NET had continuations, we could build operating system-like execution models that, in theory, can achieve optimal CPU utilization and parallelism with minimal resource use via non-blocking I/O, dataflow variables, and a number of threads equal to the number of real CPUs on the machine. This is achievable only with continuations or their equivalent, a global CPS transform.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Poor security architecture.&lt;/span&gt; CAS is a hack bolted on to a VM that, had it omitted a single feature and tweaked another, wouldn't have needed any security architecture at all. 
If static class fields were not available, then the native execution model of the .NET VM reduces to &lt;a href="http://www.erights.org/elib/capability/overview.html"&gt;capability-based security&lt;/a&gt;. It's possible to include static fields if the fields have certain properties (transitive immutability for instance), but conceptually, just imagine a VM whereby the only way to obtain access to an object is to be given access to it, such as in a method call. There is a further condition to satisfy capability security that has consequences for P/Invoke: all functionality accessible via P/Invoke must be reified as a reference (ie. an object). This means that static, globally accessible operations such as "FileStream File.Open(string path, FileMode mode)" would not be permitted since they subvert the security model by calling out to the OS to convert a string into authority to a File; instead, at the very least File.Open would be reified as a FileOpener object, and this object is "closely held" by trusted code. In effect, this forces all P/Invoke functions to be minimally capability secure. There is also an alternative approach to security with which we can have our fully mutable static fields, and have our isolation and security too; in fact, it's a widely used abstraction invented decades ago which provides full fault isolation and a number of other useful properties: lightweight language processes. Like OS processes, they provide full isolation, but they exist only in the VM; in effect, they are similar to processes as found in the pi-calculus. Static fields are thus scoped to the lightweight process. Processes and their properties are explained further under "nice to haves".&lt;/li&gt;&lt;/ol&gt;&lt;h3&gt;Nice to Haves&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Lightweight process model.&lt;/span&gt; Yes, I'm aware that .NET has the AppDomain which is sort of like a lightweight process, but its isolation properties are inadequate. 
&lt;a href="http://research.microsoft.com/os/singularity/"&gt;Microsoft's Singularity operating system&lt;/a&gt; takes the base AppDomain abstraction, and extends it into a full "software isolated process" (SIP). Processes should be single-threaded, and should additionally permit fault handling and resumption via &lt;a href="http://cap-lore.com/CapTheory/KK/Keeper.html"&gt;Keepers&lt;/a&gt;. As in OSs, portions of a process can potentially be paged in and out, serialized, persisted, or otherwise manipulated, all transparently. Call them &lt;a href="http://www.erights.org/"&gt;Vats as in E&lt;/a&gt;, processes as in Erlang, or SIPs as in Singularity, but the process abstraction is critical for full VM isolation. AppDomains are a fairly good start, however, and I'm researching ways to exploit AppDomains to enhance .NET's security.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Resource accounting&lt;/span&gt;. Resource accounting includes managing memory, scheduling concurrent execution, etc. Without properly reifying these abstractions as explicit objects subject to configurable policies, the VM is vulnerable to denial of service (DoS) attacks from malicious code. As it stands, .NET cannot load and run potentially malicious code in the same VM as benign code. For instance, it should be possible to set a quota for memory use, and to schedule and interleave execution so that code cannot exhaust VM memory and "steal the CPU". However, you can imagine that passing around quotas in code would be quite unwieldy; fortunately, this requirement interacts nicely with another abstraction we already need: processes. Thus, the process is also the unit of resource accounting, and spawning a new process starts a new heap, potentially with a quota. Processes can then be scheduled via a VM-level scheduler, and each VM can also schedule its own execution as it sees fit (threaded, event-loops, etc.). 
Interprocess communication (IPC) can be managed with an exchange heap as in Singularity, or via copying as is done in most microkernels.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Concurrency model&lt;/span&gt;. This has more of an effect on the class library than the base VM design, but &lt;a href="http://www.erights.org/elib/concurrency/event-loop.html"&gt;plan interference&lt;/a&gt; is a huge source of bugs in concurrent code, and it should be tackled by every language. VM processes provide a minimal, tried and tested concurrency and isolation abstraction, but within a process there is a wide range of options available, including event-loop concurrency, or language enforced concurrency models. Controlled sharing of state between processes needs to be tackled, such as the exchange heap in Singularity, though my current inclination is to prefer copying for safety reasons.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;As you can see, .NET doesn't need much to turn it into a flexible, industrial strength, high security VM. 
I'm going to follow this post with a detailed description of what the ideal minimal VM should have, and I will even describe how I believe it can be implemented safely and efficiently.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-2366242379041836776?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/2366242379041836776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=2366242379041836776' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2366242379041836776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/2366242379041836776'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/05/whats-wrong-with-net.html' title='What&apos;s wrong with .NET'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-9170791310660296139</id><published>2007-04-07T14:10:00.000-04:00</published><updated>2011-09-26T01:51:32.280-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='type theory'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern matching'/><category scheme='http://www.blogger.com/atom/ns#' term='object oriented 
programming'/><title type='text'>General Pattern Matching with Visitors</title><content type='html'>My previous definition of IVisitor hard-coded the return value as a type of Val, but one can generalize a visitor to any kind of pattern-matching function by adding a generic type parameter to the interface:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;//the pattern matching visitor&lt;br /&gt;public interface IVisitor&amp;lt;T&amp;gt;&lt;br /&gt;{&lt;br /&gt;  //return type is now parameterized&lt;br /&gt;  T App(Exp e0, Exp e1);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;This of course requires a modification to the class variants as well:&lt;br /&gt;&lt;pre class="brush: csharp"&gt;//the type representing an expression application&lt;br /&gt;public sealed class App : Exp&lt;br /&gt;{&lt;br /&gt;  Exp e0;&lt;br /&gt;  Exp e1;&lt;br /&gt;&lt;br /&gt;  public App(Exp e0, Exp e1)&lt;br /&gt;  {&lt;br /&gt;      this.e0 = e0;&lt;br /&gt;      this.e1 = e1;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  //add a generic type parameter to the Visit method, so the client can&lt;br /&gt;  //specify the return type&lt;br /&gt;&lt;br /&gt;  public override T Visit&amp;lt;T&amp;gt;(IVisitor&amp;lt;T&amp;gt; v)&lt;br /&gt;  {&lt;br /&gt;      return v.App(e0, e1);&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;So instead of IVisitor being hard-coded as an evaluator, I can now write an IVisitor that performs transformations on the expression tree (such as rewriting optimizations), or a visitor that implements some arbitrary predicate over the expression tree (such as type-checking, type inference, etc.).&lt;br /&gt;&lt;br /&gt;Like pattern matching functions over ordinary parameterized algebraic data types, the return value of the visitor must be T or a subtype of T.&lt;br /&gt;&lt;br /&gt;In fact, this visitor pattern, minus the decomposition as in pattern matching, is featured in &lt;a href="http://research.microsoft.com/~akenn/generics/gadtoop.pdf"&gt;Generalized Algebraic Data Types and 
Object-Oriented Programming&lt;/a&gt;; see also the &lt;a href="http://lambda-the-ultimate.org/node/1134"&gt;LTU discussion&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;According to the paper, generics as in C# are almost capable of encoding GADTs. Unfortunately, they lack some flexibility in declaring type dependencies between class type parameters, and method type parameters. The paper provides some very interesting C# examples of type-safe type checking, evaluation, phantom types, and representation types. This generics black magic warrants further study for the advanced C# student.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2744072865491516720-9170791310660296139?l=higherlogics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/9170791310660296139/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=9170791310660296139' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/9170791310660296139'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/9170791310660296139'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/04/general-pattern-matching-with-visitors.html' title='General Pattern Matching with Visitors'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2744072865491516720.post-786228734065190818</id><published>2007-04-03T21:38:00.000-04:00</published><updated>2011-09-26T01:45:52.517-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='pattern matching'/><category scheme='http://www.blogger.com/atom/ns#' term='object oriented programming'/><title type='text'>Visitor pattern considered harmful (unless functionalized)</title><content type='html'>To kick off my first post, I thought I'd cover a post on another &lt;a href="http://etymon.blogspot.com/2006/04/visitor-pattern-and-trees-considered.html"&gt;blog that I found interesting&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Basically, the gist of the post is that the object-oriented visitor pattern does not lend itself to walking trees or to general symbolic manipulation, despite that being its claim to fame. In general, algebraic data types and pattern matching are much simpler and more concise.&lt;br /&gt;&lt;br /&gt;I agree; however, I think that post goes a bit too far in saying that visitors simply can't do this elegantly, because in fact there's a straightforward translation of a pattern matching function into a visitor, as long as you keep your objects "functional", i.e. 
treat them as variants.&lt;br /&gt;&lt;br /&gt;For instance, let's say we are constructing an evaluator:&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush: csharp"&gt;//the Expression type&lt;br /&gt;public abstract class Exp&lt;br /&gt;{&lt;br /&gt;   public abstract Val Visit(IVisitor v);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;//the type of Values&lt;br /&gt;public abstract class Val : Exp&lt;br /&gt;{&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;//the type representing an application&lt;br /&gt;public sealed class App : Exp&lt;br /&gt;{&lt;br /&gt;   Exp e0;&lt;br /&gt;   Exp e1;&lt;br /&gt;&lt;br /&gt;   public App(Exp e0, Exp e1)&lt;br /&gt;   {&lt;br /&gt;       this.e0 = e0;&lt;br /&gt;       this.e1 = e1;&lt;br /&gt;   }&lt;br /&gt;&lt;br /&gt;   public override Val Visit(IVisitor v)&lt;br /&gt;   {&lt;br /&gt;       return v.App(e0, e1);&lt;br /&gt;   }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;//the pattern matching visitor&lt;br /&gt;public interface IVisitor&lt;br /&gt;{&lt;br /&gt; Val App(Exp e0, Exp e1);&lt;br /&gt; //...&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This is slightly different from the typical visitor pattern you see in books, where the object itself would be passed in to IVisitor.App(). In fact, this looks almost exactly like a pattern matching function in a language like OCaml:&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush: csharp"&gt;type exp = App of exp * exp | Val   (* | ... *)&lt;br /&gt;&lt;br /&gt;let rec eval exp =&lt;br /&gt;  match exp with&lt;br /&gt;  | App (e0, e1) -&gt;   (* ... evaluate the application *);;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Thus, properly designed, a visitor is simply a pattern matching function. By convention, we use the class name as the visitor's method name, so it matches the convention in functional languages (i.e. the constructor name). The visitor method's parameters are the object's private fields, just like a variant. 
I'm using this approach for an interpreter I'm implementing and it works quite well (unfortunately no source is publicly available at the moment).&lt;br /&gt;&lt;br /&gt;In terms of efficiency, the pattern matching function is probably faster. After all, it's simply a tag check plus a branch, whereas the visitor must perform a double dispatch (two method lookups incurring two indirections and two indirect branches).&lt;br /&gt;&lt;br /&gt;Clearly, the visitor is also far more verbose: you must create the "variant classes" and manually add a "Visit" method to each. And since there is no catch-all case as in pattern matching (analogous to the "default" case in a switch statement), you have to handle every variant explicitly in each visitor, even if the case isn't relevant. You can mitigate the latter problem by implementing a "Default" abstract base class with dummy implementations for all cases, and then overriding only the cases you are interested in:&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush: csharp"&gt;public abstract class Default : IVisitor&lt;br /&gt;{&lt;br /&gt; //overridden in my Interpreter visitor&lt;br /&gt; public virtual Val App(Exp e0, Exp e1)&lt;br /&gt; {&lt;br /&gt;   throw new NotSupportedException();&lt;br /&gt; }&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;As a potentially interesting extension, the variant classes may not even have to be in the same inheritance hierarchy; they only have to implement a marker &lt;span style="font-weight: bold;"&gt;interface&lt;/span&gt; with the appropriate Visit() method. 
This may increase extensibility along similar lines to polymorphic variants; only the visitor has to be kept up to date with all the relevant cases, and it can evolve separately from the core.&lt;br /&gt;&lt;br /&gt;Summary:&lt;br /&gt;&lt;br /&gt;So if you're doing symbolic manipulation and are forced to operate in an OO language like C#, you can still approximate the elegance of algebraic data types and pattern matching; it just takes a little more work on your part. Pretty much par for the course in the OO world. :-)</content><link rel='replies' type='application/atom+xml' href='http://higherlogics.blogspot.com/feeds/786228734065190818/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2744072865491516720&amp;postID=786228734065190818' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/786228734065190818'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2744072865491516720/posts/default/786228734065190818'/><link rel='alternate' type='text/html' href='http://higherlogics.blogspot.com/2007/04/visitor-pattern-considered-harmful.html' title='Visitor pattern considered harmful (unless functionalized)'/><author><name>Sandro Magi</name><uri>https://profiles.google.com/104695796131521685857</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-dA5Kfd0V1eA/AAAAAAAAAAI/AAAAAAAAHDI/dOX3uTBge-g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry></feed>
