Skip to main content

Expressiveness: what's all the fuss?

I just read a blog post recommending that developers broaden their language horizons, with a particular emphasis on Ruby. The author attempts to explain why expressiveness is an important metric for a language:
What is this obssesion[sic] with "expressiveness"? Go write poertry [sic] if you want to be expressvive.[sic]
Remember that ultimately our jobs are (usually) to solve some kind of business problem. We're aiming for a finish line, a goal. The programmer's job is translate the language of the business person to the language of the computer.

The whole point of compilers, interpreters, layers of abstraction and what-not are to shorten the semantic distance between our intent and the way the computer thinks of things.

To be honest, this is not very convincing; the moment you mention "semantics", is the moment many developers will close your blog and go do something "productive".

The argument for expressiveness is ultimately quite simple: the more expressive the language, the more you can do in fewer lines of code. This means that 3 lines of Ruby code might take 12 lines in C#, and 15 lines of C# could be compressed to 2 lines of Ruby while retaining readability and maintainability [1].

Consider viewing Ruby as a C# code generator, where 2 lines of Ruby code can generate 12 lines of C#. That actually sounds like a pretty good idea doesn't it? You would never write a generator to go the other way around though would you?

More expressive languages also tend to be simpler and more coherent. There are all sorts of little ad-hoc rules to Java and C# that you would not find in most functional languages.

You can readily see the differences in expressiveness at the Alioth language shootout. Set the metrics to CPU:0, memory:0, GZIP:1, which means we only care about the GZIP'd size of the source code. You'll see that functional languages tend to come out on top. Ironically, Ruby is first lending weight to the above blog post on Ruby's expressiveness.

Expressiveness is the whole driving force behind DSLs: a DSL is more expressive in solving the problems it was designed for. For instance, a relational database DSL specific to managing blogs could generate perhaps 100 lines of SQL per single line of DSL code. It would take you far more code to emulate that 1 DSL line in C#.

So the expressiveness argument is quite compelling if you simply view it as a code generation metric: the more expressive the language, the more code it can generate for you for the same number of characters typed. That means you do your job faster, and more importantly, with fewer errors.

Why fewer errors? Studies have shown that developers generally write about 20-30 correct lines of code per day. That means 20 lines of correct C# code or 20 lines of correct Ruby code. It just so happens that those 20 lines of correct Ruby can generate 100 lines of correct C#, which means you're now 5x more productive than your C#-only developer next door. Do you see the advantages now?

[1] These numbers are completely bogus and are used purely for illustrative purposes.


Isaac Gouy said…
I see as many "functional" langauges in the bottom half as the top half of that comparison :-)

What's really obvious is that unsurprisingly all the scripting languages pack into the top 10.

Incidentally a direct link would work better: ranking
Sandro Magi said…
Didn't notice your post until now! Seems your link is to a CPU comparison, not the source code expressiveness comparison I was referring to; here's a direct link.

It is interesting that some strong functional languages score so low, ie. Haskell and Clean. Perhaps it has to do with type annotations they need to infer higher-ranked polymorphic types.

I think your point about scripting languages is also noteworthy, and I would hazard that they score well for two reasons:
1. pervasive use of short symbols for operations
2. lack of static typing permits a looser program structure
Isaac Gouy said…
It is interesting to see that "some strong functional languages score so low" given that you wrote:

Set the metrics to CPU:0, memory:0, GZIP:1, which means we only care about the GZIP'd size of the source code. You'll see that functional languages tend to come out on top.


Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD) , and decided to take a crack at distilling its essence in C#. Note that automatic differentiation (AD) is different than numerical differentiation . Math.NET already provides excellent support for numerical differentiation . C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries . Neither of these are suitable for learning how AD works. So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique. What is Automatic Differentiation? Simply put, automatic differentiation is a technique for calcu

Building a Query DSL in C#

I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would filter out the results where either Bar is less than or equal to 99, or Foo is not "some string". This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a limited filtering API : It does not support disjunctions, ie. "OR" clauses. It does not support filtering using inequalities on more than one property. It does not support a not-equal operation. So in this post, I will describe the design which achieves the following goals: A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR"). Implemen