Wednesday, November 18, 2009

Easy File System Path Manipulation in C#

I came across this type-safe module for handling file paths in the Haskell subreddit this week, and thought it looked kind of neat. Handling paths as strings, even with System.IO.Path always bugged me.

So I created a close C# equivalent of the Haskell type and added it to the Sasa library, to be available in the upcoming v0.9.3 release:
public struct FsPath : IEnumerable<string>, IEnumerable
{
public FsPath(IEnumerable<string> parts);
public FsPath(string path);

public static FsPath operator /(FsPath path1, FsPath path2);
public static FsPath operator /(FsPath path, IEnumerable<string> parts);
public static FsPath operator /(FsPath path, string part);
public static FsPath operator /(FsPath path, string[] parts);
public static FsPath operator /(IEnumerable<string> parts, FsPath path);
public static FsPath operator /(string part, FsPath path);
public static FsPath operator /(string[] parts, FsPath path);
public static implicit operator FsPath(string path);

public FsPath Combine(FsPath file);
public IEnumerator<string> GetEnumerator();
public static FsPath Root(string root);
public override string ToString();
}

The above is just the interface, minus the comments which you can see at the above svn link. This struct tracks path components for you and C#'s operator overloading lets you specify paths declaratively without worrying about combining paths with proper separator characters, etc.:
FsPath root = "/foo/bar";
var baz = root / "blam" / "baz";
var etc = FsPath.Root("/etc/");
var passwd = etc / "passwd";

The library will generate path strings using the platform's directory separator.

One significant departure from the Haskell library is the lack of phantom types used to distinguish the various combinations of relative/absolute and file/directory paths. C# can express these same constraints, but I intentionally left them out for two reasons:
  1. The type safety provided by the file/directory phantom type is rather limited, since it's only an alleged file/directory path; you have to consult the file system to determine whether the alleged type is actually true. The only minor advantage to specifying this in a path type, is as a form of documentation to users of your library that you expect a file path in a particular method, and not a directory path. I could be convinced to add it for this reason.

  2. The relative/absolute phantom type seems a bit useless to me, though the reason may be less obvious to others. IMO, the set of file and directory info classes should not have GetParent() or support ".." directory navigation operations, nor should they support static privilege escalation operations like File.GetAbsolutePath() which ambiently converts a string into an authority on a file/directory, since this inhibits subsystem isolation; without GetParent() or privilege escalation functions, any subsystem you hand a directory object is automatically chroot jailed to operate only in that directory. This has long been known to the capability-security community, and is how they structure all of their IO libraries (see Joe-E and the E programming language). Thus, in a capability-secure file system library, all paths are relative paths interpreted relative to a given directory object. As a result, FsPath also strips all "." and resolve all ".." to within the provided path to supply this isolation.

It goes without saying that .NET does not currently handle files and directories this way, but I plan to handle this eventually as well. FsPath will play a significant role in ensuring that paths cannot escape the chroot jail.

As an interim step along that path, the FsPath interface is a good first step.

[Edit: the reddit thread for this post has some good discussion.]

No comments: