
Sasa.IO.FilePath - Easy and Safe Path Manipulations

This is the fifteenth post in my ongoing series covering the abstractions in Sasa.

One persistent difficulty in dealing with IO in .NET is path handling. .NET exposes a platform's directory separator characters, but really this sort of thing should be automated. Furthermore, paths are treated as plain strings, so concatenating fragments can leave you with a path string that goes up and down directories with no clear final result, i.e. resolving the final path string is left to the OS.

This means that you can't easily reason about the constructed paths without consulting the OS, which is a relatively expensive operation. Furthermore, Path.Combine has a number of corner cases that make constructing paths non-compositional, requiring numerous argument validations to ensure a correct result.
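To make the corner cases concrete, here is a small sketch using only System.IO.Path.Combine. The class and method names are mine, chosen for illustration. The first method shows that a rooted second argument silently replaces the first, and the second shows that ".." segments are passed through without being resolved:

```csharp
using System;
using System.IO;

static class CombineCornerCases
{
    // A rooted second argument makes Path.Combine discard the first argument,
    // so this returns "/etc/passwd" rather than a path under "uploads".
    public static string Rooted() => Path.Combine("uploads", "/etc/passwd");

    // ".." segments are kept verbatim: the result still contains "..",
    // and only the OS decides where it finally points.
    public static string Unresolved() => Path.Combine("uploads", "../secret.txt");
}
```

This is why code built on Path.Combine typically has to validate every fragment before combining it.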

Enter Sasa.IO.FilePath. It's a simple struct that encapsulates an underlying path string, so FilePath operations are just as efficient as your current path handling. FilePath fully resolves directory change operations where possible, so the final path string designates the path the OS will actually look up. Null or empty strings are considered references to the current directory, i.e. ".". There is also a "jail" operation which ensures that a provided path cannot escape a particular sub-directory, just like the OS-level chroot jail.

I first posted about this abstraction back in 2009, but the name and implementation have changed a little since then.

Sasa.IO.FilePath Constructor

The constructor takes an arbitrary path string, resolves all the inner directory change operations, and returns a final path:

var foo = "foo/../..";
var path1 = new FilePath(foo);
var path2 = new FilePath("bar/" + foo);
Console.WriteLine(path1);
Console.WriteLine(path2);
Console.WriteLine("root" / path2); // FilePath composition using /
// output:
// ..
// .
// root

Sasa.IO.FilePath.Combine

The Combine method exposes the same semantics as System.IO.Path.Combine, but on FilePath instances:

var foo = new FilePath("foo/../..");
var bar = new FilePath("/bar");
Console.WriteLine(FilePath.Combine(foo, bar));
Console.WriteLine(FilePath.Combine(bar, foo));
// output:
// ..\bar
// .

Sasa.IO.FilePath.IsParentOf

The IsParentOf method checks whether the path it is called on is a parent of the path passed as its argument, i.e. whether the argument lies beneath it:

var foobar = new FilePath("foo/bar");
var bar = new FilePath("bar");
var foo = new FilePath("foo");
Console.WriteLine(bar.IsParentOf(foobar));
Console.WriteLine(foo.IsParentOf(foobar));
Console.WriteLine(foobar.IsParentOf(foo));
// output:
// false
// true
// false
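The results above are consistent with a component-wise prefix test. The following is a minimal sketch of that idea on raw strings, not Sasa's actual implementation. The name ParentSketch is hypothetical, and I'm assuming that a path is not considered its own parent:

```csharp
using System;

static class ParentSketch
{
    static readonly char[] Separators = { '/', '\\' };

    // Hypothetical helper: true when every component of `parent` matches the
    // leading components of `child`, and `child` has at least one more.
    public static bool IsParentOf(string parent, string child)
    {
        var p = (parent ?? "").Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        var c = (child ?? "").Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        if (p.Length >= c.Length) return false;       // assumption: proper prefix only
        for (int i = 0; i < p.Length; i++)
            if (!string.Equals(p[i], c[i], StringComparison.Ordinal)) return false;
        return true;
    }
}
```

Comparing whole components rather than characters matters: a plain string prefix test would wrongly report "foo" as a parent of "foobar".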

Sasa.IO.FilePath.Jail

The Jail methods provide a chroot-jail operation on file paths, to ensure a provided path string cannot escape a certain sub-directory:

var escape = new FilePath("../..");
var bar = new FilePath("/bar");
Console.WriteLine(FilePath.Jail(bar, escape));
// output:
// bar
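For readers who want the intuition behind a jail, here is a rough sketch on raw strings. It is not how Sasa implements Jail, and JailSketch is a hypothetical name. The idea is to resolve the untrusted path on its own, drop any ".." that would climb above its starting point, and only then append what is left to the jail root:

```csharp
using System;
using System.Collections.Generic;

static class JailSketch
{
    static readonly char[] Separators = { '/', '\\' };

    // Hypothetical helper: confines `untrusted` beneath `root`.
    public static string Jail(string root, string untrusted)
    {
        // Assumption: null or empty root means the current directory.
        if (string.IsNullOrEmpty(root)) root = ".";

        var kept = new List<string>();
        foreach (var part in (untrusted ?? "").Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            if (part == ".") continue;
            if (part == "..")
            {
                // Step back only while still inside the jail; a ".." at the
                // jail root is discarded instead of escaping.
                if (kept.Count > 0) kept.RemoveAt(kept.Count - 1);
            }
            else kept.Add(part);
        }

        var tail = string.Join("/", kept);
        return tail.Length == 0 ? root : root.TrimEnd(Separators) + "/" + tail;
    }
}
```

With this sketch, JailSketch.Jail("/bar", "../..") stays at "/bar", and JailSketch.Jail("/bar", "a/../../b") yields "/bar/b". Note that the sketch keeps the root string as given, so its formatting of the result may differ from FilePath's.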

Sasa.IO.FilePath.Resolve

The real workhorse behind FilePath, Sasa.IO.FilePath.Resolve resolves all path change operations in a string and returns a simplified path string. In case you want to stick with raw string paths, you can use this method to perform path simplification:

var path1 = "foo/../..";
var path2 = "bar/" + path1;
Console.WriteLine(FilePath.Resolve(path1));
Console.WriteLine(FilePath.Resolve(path2));
Console.WriteLine(FilePath.Resolve("root/" + path2));
// output:
// ..
// 
// root
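The simplification itself can be done with a stack of path components and no file system access. The sketch below is my own illustration under simplifying assumptions, not Sasa's code: ResolveSketch is a hypothetical name, only relative paths are handled, and the result always uses "/" as the separator:

```csharp
using System;
using System.Collections.Generic;

static class ResolveSketch
{
    static readonly char[] Separators = { '/', '\\' };

    // Hypothetical helper: collapses "." and ".." segments of a relative path.
    public static string Resolve(string path)
    {
        var stack = new List<string>();
        foreach (var part in (path ?? "").Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            if (part == ".") continue;                       // no-op segment
            if (part == ".." && stack.Count > 0 && stack[stack.Count - 1] != "..")
                stack.RemoveAt(stack.Count - 1);             // cancel the previous named segment
            else
                stack.Add(part);                             // keep names and leading ".."
        }
        return string.Join("/", stack);
    }
}
```

On the three inputs used above, this sketch returns "..", the empty string, and "root" respectively. A ".." with nothing left to cancel is kept, which is why "foo/../.." still resolves to "..".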

Sasa.IO.FilePath./ Operator

The division operator on file paths is a convenient shorthand for FilePath.Combine, so paths can be composed inline:

var foo = "foo/../..";
var path1 = new FilePath(foo);
var path2 = "bar" / path1;
Console.WriteLine(path1);
Console.WriteLine(path2);
Console.WriteLine("root" / path2);
// output:
// ..
// .
// root

Sasa.IO.FilePath's Interfaces

  • IEnumerable<string>: enumerates the sequence of path components making up a full path.
  • IEquatable<FilePath>: compares two paths for equality.
  • IComparable<FilePath>: orders two paths alphanumerically.
