Skip to main content

Posts

Showing posts from May, 2020

Dual/Codual numbers for Forward/Reverse Automatic Differentiation

In my last two posts on automatic differentiation (AD), I described some basic primitives that implement the standard approach to forward mode AD using dual numbers, and then a dual representation of dual numbers that can compute in reverse mode. I'm calling these "co-dual " numbers, as they are the categorical dual of dual numbers.
It didn't click at the time that this reverse mode representation seems to be novel. If it's not, please let me know! I haven't seen any equivalent of dual numbers capable of computing in reverse mode.
When reverse mode AD is needed, most introductions to AD go straight to building a graph/DAG representation of the computation in order to improve the sharing properties and run the computation backwards, but that isn't strictly necessary. I aim to show that there's a middle ground between dual numbers and the graph approach, even if it's only suitable for pedagogical purposes. Review: Dual NumbersDual numbers augment stand…

Easy Reverse Mode Automatic Differentiation in C#

Continuing from my last post on implementing forward-mode automatic differentiation (AD) using C# operator overloading, this is just a quick follow-up showing how easy reverse mode is to achieve, and why it's important.Why Reverse Mode Automatic Differentiation?As explained in the last post, the vector representation of forward-mode AD can compute the derivatives of all parameter simultaneously, but it does so with considerable space cost: each operation creates a vector computing the derivative of each parameter. So N parameters with M operations would allocation O(N*M) space. It turns out, this is unnecessary!Reverse mode AD allocates only O(N+M) space to compute the derivatives of N parameters across M operations. In general, forward mode AD is best suited to differentiating functions of type:RRNThat is, functions of 1 parameter that compute multiple outputs. Reverse mode AD is suited to the dual scenario:RN → RThat is, functions of many parameters that return a single real …

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD), and decided to take a crack at distilling its essence in C#.Note that automatic differentiation (AD) is different than numerical differentiation. Math.NET already provides excellent support for numerical differentiation.C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries. Neither of these are suitable for learning how AD works.So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique.What is Automatic Differentiation?Simply put, automatic differentiation is a technique for calculating the derivativ…