Wednesday 22nd April 2015 12.29am

As part of the day job I’ve started to learn the Rust programming language (http://www.rust-lang.org/) and I thought I’d keep a reflective journal as I worked through the examples and started writing my first Rust code. #Rust #RustLang

Bear in mind in the following discussion that I have just 20 hours of practice with no experience beforehand, and I’m coming from a C/C++ background as a Boost C++ library developer and maintainer. As a result, I certainly don’t have experience with much of Rust and I may be a bit, or a lot, wrong in the following:

Pros:

* Rust is a true modern systems programming language, designed from the ground up as such, and it shows. Unlike C++, which evolved over decades, remains compatible with C and is therefore somewhat constrained by language design choices of the 1970s, Rust’s language design steers you towards code with near-optimal performance almost always. In other words, you don’t need to have read and fully understood the compiler’s source code to write optimal code, as you nearly do with C++ nowadays - there are so many non-obvious performance gotchas in C++, which is why it takes 10,000 hours to master. In particular, C++ always did far too much unnecessary copying around of memory because the language can’t tell when it is safe to be less conservative, and Rust is quite literally the opposite - it never copies memory unnecessarily, for example all types by default move rather than copy. Even mutable references move by default and don’t copy - that’s right, assigning one mutable reference to another makes the original reference unusable unless you ask otherwise. Fantastic.
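To make the move-by-default point concrete, here is a minimal sketch. The helper names `consume` and `bump_first` are my own illustrations, not anything from the standard library. As far as I can tell the move behaviour for references applies to mutable references (`&mut T`); shared references (`&T`) can be copied freely.

```rust
// Hypothetical helper: takes ownership of the vector (a move, not a copy).
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

// Hypothetical helper: shows that a mutable reference is itself moved.
fn bump_first(v: &mut Vec<i32>) {
    let r1 = v;  // the mutable reference is moved out of `v`
    let r2 = r1; // moved again; `r1` can no longer be used
    r2[0] += 1;
}

fn main() {
    let a = vec![1, 2, 3];
    let b = a;                 // the vector is moved, its buffer is not copied
    // println!("{:?}", a);    // compile error: use of moved value `a`
    let mut c = b.clone();     // a deep copy has to be requested explicitly
    assert_eq!(consume(b), 3); // `b` is moved into the function
    bump_first(&mut c);
    println!("{:?}", c);
}
```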

* There’s a ton of language tweaks to C/C++ which really ought to enter C/C++. Stuff like breaking out of multiple nested loops at once, for example. Or that if statements are also expressions - as are loops and indeed every other flow control construct. In other words, all flow control yields a value, something you’d normally emulate in C++ by wrapping flow control inside nested lambdas, all of which is extra typing not necessary in Rust. Great stuff, and all non-breaking changes to C/C++ incidentally, so C++ could add this without breaking existing source code.
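A small sketch of both tweaks, a labelled break out of nested loops and an if used as an expression. The function names `first_pair` and `sign` are hypothetical, chosen only for illustration.

```rust
// Hypothetical helper: find the first pair (i, j) in 1..10 whose product is `target`.
fn first_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for i in 1..10 {
        for j in 1..10 {
            if i * j == target {
                found = Some((i, j));
                break 'outer; // leaves both loops at once
            }
        }
    }
    found
}

// Hypothetical helper: the whole `if` chain is an expression and is the return value.
fn sign(n: i32) -> &'static str {
    if n < 0 { "negative" } else if n == 0 { "zero" } else { "positive" }
}

fn main() {
    println!("{:?} {}", first_pair(12), sign(-3));
}
```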

* Language defaults are always the safest (except for error handling, see below), which is good. For example, everything defaults to const, and you must explicitly ask each time for a mutable reference to something. Types move by default instead of copy, which fits very nicely with the default const semantics. This also makes the optimiser’s life vastly easier than in C++ as it can elide chains of moves, plus const really does mean const in Rust, whereas in C++ const is only a programmer annotation and is otherwise ignored by the compiler’s optimiser (due to const_cast/C casting/mutable existing!). Fantastic stuff this, shame we can’t default to move semantics in C++.
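As a rough illustration of const-by-default, the sketch below shows that both the binding and the reference have to opt in to mutability. `append_total` is a made-up helper for this example.

```rust
// Hypothetical helper: needs an explicitly mutable reference to modify the vector.
fn append_total(values: &mut Vec<i32>) {
    let total: i32 = values.iter().sum();
    values.push(total);
}

fn main() {
    let x = 5;
    // x += 1;            // compile error: `x` is immutable unless declared `let mut`
    let mut v = vec![1, 2, 3];
    // append_total(&v);  // compile error: expected `&mut Vec<i32>`, found `&Vec<i32>`
    append_total(&mut v); // mutability is requested at the call site too
    println!("{} {:?}", x, v);
}
```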

* Modules! Online repository of modules! Automatic dependencies! Seeing as C++ 17 Modules support appears to be going nowhere, I'd say this isn't getting fixed in the next C++ standard either :(.

* Rust makes you explicitly say when ownership of something changes, and the borrowing feature lets you temporarily lend something you own to something else (the closest would be an rvalue reference in C++). The original object is enforced to be unusable so long as a mutable reference could still be open on it, which prevents two actors ever modifying a value concurrently without the compiler noticing. I won’t beat the horse to death on Rust’s ownership semantics as it’s a well discussed feature, and definitely a huge strength for the language over C++, as you get free thread safety plus the elimination of an entire (and expensive to debug) class of race condition which can no longer happen so long as you avoid unsafe regions (Rust lets you optionally turn off the safety). Moreover, unlike Go or Swift, Rust achieves all this safety without needing to fall back onto garbage collection or runtime reference counting. This is an enormous boon for lowering maintenance costs in code expected to last for a decade or more, and it’s no doubt worth the (slightly frustrating) extra time required to write the code in the first place.
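A minimal sketch of lending versus giving away, assuming nothing beyond the standard library. The helpers `total` and `sum_in_thread` are hypothetical names of mine.

```rust
use std::thread;

// Hypothetical helper: borrows the data read-only; the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: takes ownership and moves it into a worker thread.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    handle.join().unwrap()
}

fn main() {
    let mut data = vec![1, 2, 3];
    {
        let r = &mut data; // exclusive, temporary loan of `data`
        // data.push(0);   // compile error here: `data` is already mutably borrowed
        r.push(4);
    }
    let before = total(&data);       // shared, read-only loan
    let after = sum_in_thread(data); // ownership moves to the worker thread
    // data.push(5);                 // compile error: `data` was moved
    println!("{} {}", before, after);
}
```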

* Heap allocation is always explicit ("boxing"). This avoids how C++ can waste a lot of cycles calling operator new for objects which are always destroyed when their scope unwinds, which is sometimes a big overhead.
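A sketch of what explicit boxing looks like in practice. The `List` type and `sum` function are illustrative only; the point is that the heap allocation shows up in the type and at the `Box::new` call.

```rust
// Illustrative recursive type: each node's tail lives in an explicit heap box.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Hypothetical helper: walks the list and adds up the values.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => *value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let on_stack = [0u8; 16];   // plain value, no heap allocation
    let on_heap = Box::new(42); // heap allocation is spelled out
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{} {} {}", on_stack.len(), on_heap, sum(&list));
}
```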

* When working with containers, one usually works with slices which are views into a container. Furthermore, ranges are built into the language, including pattern matching ranges (e.g. everything but first four and last two elements). Great stuff, we won’t get anything like it in C++ until C++ 17 assuming Eric's proposal goes through.
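Here is a small sketch of the "everything but the first four and last two elements" case, written as a plain range slice. `middle` is a hypothetical helper, and the length guard is my own addition, since slicing out of bounds would panic.

```rust
// Hypothetical helper: a view without the first four and last two elements.
// No elements are copied; the result borrows from the input.
fn middle(values: &[i32]) -> &[i32] {
    if values.len() < 6 {
        return &[]; // too short to drop six elements
    }
    &values[4..values.len() - 2]
}

fn main() {
    let v = vec![0, 1, 2, 3, 4, 5, 6, 7];
    println!("{:?}", middle(&v));
}
```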

* Static variant types are in the language! (called enums). And they are a particularly full fat implementation too, plus because they are static there is no runtime overhead, same as almost everything else in Rust and what makes Rust more of a C++ competitor than Swift or Go.
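A minimal sketch of an enum used as a static variant. The `Shape` type and `area` function are made up for illustration. The match has to cover every variant, otherwise the compiler rejects it.

```rust
use std::f64::consts::PI;

// Illustrative variant type: each alternative carries its own data.
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
    Empty,
}

// Hypothetical helper: unpacks whichever alternative is present.
fn area(shape: &Shape) -> f64 {
    match *shape {
        Shape::Circle { radius } => PI * radius * radius,
        Shape::Rect { w, h } => w * h,
        Shape::Empty => 0.0,
    }
}

fn main() {
    let shapes = [Shape::Circle { radius: 1.0 }, Shape::Rect { w: 2.0, h: 3.0 }, Shape::Empty];
    for s in shapes.iter() {
        println!("{}", area(s));
    }
}
```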

* Generics do not require a template keyword, nor much of the manual typing which C++ template syntax demands for very simple operations. C++ templates were never originally intended for anything but cookie cutter class generation, and ended up getting used for a totally different outcome. Indeed, the match keyword in Rust implements something like partial template specialisation in a particularly compact way, and most of your control flow logic in Rust is matching instead of if…else, so most of the time you’re programming flow control in terms of partial template specialisation.

* Traits and predicate type constraints are built in and ready to go. Again, predicate type constraints for C++ (called concepts lite) are only coming in C++ 17, though with lots of extra typing (enable_if et al) they are doable in today’s C++ compilers.
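A rough sketch of a generic function constrained by a trait, with no template keyword in sight. The `Describe` trait, the `Celsius` type and `describe_all` are all hypothetical names for this example.

```rust
// Illustrative trait: the constraint which generic code can demand.
trait Describe {
    fn describe(&self) -> String;
}

struct Celsius(f64);

impl Describe for Celsius {
    fn describe(&self) -> String {
        format!("{:.1} C", self.0)
    }
}

// A trait defined here can also be implemented for an existing type.
impl Describe for i32 {
    fn describe(&self) -> String {
        format!("int {}", self)
    }
}

// Generic over any T satisfying the `Describe` constraint.
fn describe_all<T: Describe>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}

fn main() {
    println!("{:?}", describe_all(&[1, 2]));
    println!("{:?}", describe_all(&[Celsius(21.5)]));
}
```

Passing a type which does not implement `Describe` is rejected at the call site, not somewhere deep inside the generic function.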

Cons:

* As much as everything being an expression saves typing, other parts of Rust 1.0 require a lot more typing unfortunately. It has virtually no automatic conversion semantics nor overloading, and hence requires you to spell out manually every single bloody conversion by hand - it won’t even auto convert a comparison of Option<T> to T. It won’t auto deduce nor unpack return types, you must manually specify those too. It won’t let you compare variants without you unpacking them by hand, which means you end up using type pattern matching all the time to work around this lack of automatic conversion and genericity (and see below why that’s bad).
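To show the kind of spelling-out I mean, here is a small sketch. The function names are hypothetical. The Option has to be compared against a wrapped value, the return types are written by hand, and each numeric conversion is explicit.

```rust
// Hypothetical helper: `value == 5` does not compile because Option<i32>
// and i32 are different types, so the 5 has to be wrapped by hand.
fn is_five(value: Option<i32>) -> bool {
    value == Some(5)
}

// The return type must be stated, and so must the widening conversion.
fn widen(n: i32) -> i64 {
    i64::from(n) // `n as i64` is the other spelling
}

// Mixed integer/float arithmetic needs each operand converted explicitly.
fn average(total: i32, count: i32) -> f64 {
    f64::from(total) / f64::from(count)
}

fn main() {
    println!("{} {} {}", is_five(Some(5)), widen(7), average(3, 2));
}
```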

* At least where control logic is involved one has type pattern matching to save typing, but the lack of overloading means you can only call functions exactly one way. An illustrative example is swapping two values of the same type. You might think this would be:

std::mem::swap(a, b);

But this won’t work because swap() only takes mutable references. So now you try:

let x = &mut a;
let y = &mut b;
std::mem::swap(x, y)

But this still doesn’t work as expected because a and b become unusable so long as a reference is open on them, so one fix is to scope the references:

{
    let x = &mut a;
    let y = &mut b;
    std::mem::swap(x, y);
}

… to kill the references to make a and b usable again after the swap. And this is but a small place where the lack of overloading or inference is a problem, though I must admit surprise as to why the standard swap function isn’t a generic taking any inputs (obviously with some constraints) which does the right thing.
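For what it’s worth, a shorter form appears to work as well. In this minimal sketch the mutable borrows are taken inline at the call site, so they end with the call and no extra block is needed. `swapped` is a hypothetical wrapper of mine, used only so the example is self-contained.

```rust
use std::mem;

// Hypothetical wrapper: swaps its two arguments and hands them back.
fn swapped(mut a: String, mut b: String) -> (String, String) {
    mem::swap(&mut a, &mut b); // temporary borrows, released when the call returns
    (a, b)                     // both values are usable again straight away
}

fn main() {
    let (a, b) = swapped("left".to_string(), "right".to_string());
    println!("{} {}", a, b);
}
```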

This ultimately is a problem of the lack of free functions. In C++ you can modify, from the outside, how a body of code treats some types through externally overloading free functions for those types - this is one of a number of ways of statically supplying external implementations of routines to a library. In Rust 1.0 there is no ability to externally modify how generic code treats types supplied to it - you must supply policy implementations as inputs which is much more rigid, and of course requires much more typing.

Don’t get me wrong - overloading and especially free functions are a bag of unexpected bugs in C++ - not using std::enable_if with sufficiently generic template inputs can break code very far away from your library in ways impossible to work around for end users. My point isn’t that overloading is good and Rust should have it; rather, it’s that Rust 1.0 at present requires you to manually type out a lot of boilerplate almost all of the time instead of letting the compiler infer stuff for you, and it gets tedious.

* Partial type specialisation (called matching) is overused in Rust due to the lack of language support for anything better right now, a bad thing because it is too easy to get surprised by a change in type relations in one place having unexpected partial specialisation matching outcomes in other places. In other words, change ripples have all the same problems as in C++ for type pattern matching, but because Rust implements its switch statements and much of its if…else logic as type pattern matching the consequences are much worse here. I can see this being a real problem for scaling out Rust to very large code bases, because you change some small thing in one place and all sorts of compiler undetectable logic change consequences could result in unpredictable locations.

* Error handling in Rust 1.0 is a real missed opportunity. There is no ability to throw exceptions - I have no problem with that actually, it makes implementing the compiler and exception safe code vastly easier. But the default return type from a function is like in C: it can carry no error state, which is what caused most major users of C to use the inefficient thread local int errno variable, for which there is no counterpart in Rust. This is despite variant types being built in, static and free of runtime cost, so the fact this is missing is astonishing.

Most Rust enthusiasts will now point out that you can use Result<T, E> as your return type, and this is a monadic expected/unexpected value transport. This is what the Rust standard library does, and I have no problem with that either.

Where I do have a problem is that, in my view, all Rust functions and expressions should always return Result<T, E> no matter what, so if you return Ok(U) or even U, that pops out as Result<U, E> where E defaults to some system defined error type which can be specialised per type implementation. If code then tries to use any return value from a function, or any expression, and it actually contains an E, then a try! should be implicitly in there (i.e. instant return from the function, propagating E to the next caller up until somewhere during the unwind someone actually handles E). What you would get is something a bit like structured exception handling, but without the problems of coping with exception throws during unwinds, which is a fatal exit under C++. What it should not be is an instant panic, i.e. fatal exit of the thread, which is exactly what happens now when you try to use a result which contains an E rather than a T.
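For reference, this is roughly what explicit propagation looks like today. The sketch uses the `?` operator, which later Rust versions added as shorthand for try!, and `parse_and_double` is a hypothetical function of mine.

```rust
use std::num::ParseIntError;

// Hypothetical helper: the error type is part of the signature.
fn parse_and_double(text: &str) -> Result<i32, ParseIntError> {
    // On Err, `?` returns the error to the caller immediately.
    let n = text.trim().parse::<i32>()?;
    Ok(n * 2)
}

fn main() {
    // The caller chooses how to handle the error...
    match parse_and_double("21") {
        Ok(value) => println!("ok: {}", value),
        Err(error) => println!("error: {}", error),
    }
    // ...whereas using the value directly panics if it is an Err:
    // parse_and_double("abc").unwrap();
}
```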

My proposed scheme of course breaks an obvious design intention in Rust, namely that all possible error conditions returned are part of the API contract. Under my scheme some error type from some deeply nested function call needs to pop out at the top, and therefore a function’s error contract could no longer be well defined beyond the error type being some unknown implementation of std::error::Error. I understand why they want this - a big strength of POSIX is that it lists all possible error codes which can come out of a system call, and that’s a great thing once you’ve spent some time getting surprised by rare and random failures on Windows not documented as being a possibility. However, Rust is not the operating system, and it should be a choice to tightly define error outputs like that.

I’ll probably get flamed by Rust users that the above is deliberately so for good reasons X, Y and Z related to similar rationales. I don’t care - right now implementing error handling in Rust is very like implementing it in C i.e. tedious. I want sensible default things to occur so most of the time I don’t need to write explicit error handling code every single time. I don’t want to constantly be writing error type conversion code to convert between every single function I call and the errors coming out of my function, because the default otherwise is effectively Rust calling abort() for you. I have to do that in C (which at least doesn’t call abort() on unhandled errors), and that’s why I don’t choose to program in C and if I have to, I accept a +20% time cost overhead for it.
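A sketch of the conversion boilerplate I’m complaining about, assuming a made-up `ConfigError` type and `read_port` function. Each foreign error type needs its own From implementation before propagation will convert it for you.

```rust
use std::num::ParseIntError;

// Illustrative error type for this function's own contract.
#[derive(Debug)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
}

// Hand-written conversion: one of these per error type coming from below.
impl From<ParseIntError> for ConfigError {
    fn from(error: ParseIntError) -> Self {
        ConfigError::BadNumber(error)
    }
}

// Hypothetical helper: reads a port number from an optional setting.
fn read_port(value: Option<&str>) -> Result<u16, ConfigError> {
    let text = value.ok_or(ConfigError::Missing)?;
    let port = text.parse::<u16>()?; // ParseIntError is converted through From
    Ok(port)
}

fn main() {
    println!("{:?}", read_port(Some("8080")));
    println!("{:?}", read_port(Some("eighty")));
    println!("{:?}", read_port(None));
}
```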

Moreover, it’s a known cause of brittleness in big code bases when a language requires mandatory API error contracts because programmers, being lazy, simply either ignore or sink errors from APIs called internally (resulting in fatal application exit or errors getting lost) or type erase them, thus losing the actual cause of the original error and reporting an error upstream that has nothing to do with the actual cause. No, allowing unknown errors to bubble up is the only large code base scalable way to do error handling with average programmers working on it. But don’t misunderstand me, I have no problem with warnings/errors being generated by the compiler when an error state from a called API isn’t being dealt with somewhere in a call stack (i.e. if the error travels right up to the top, the compiler could spot this by emitting metadata with every function and doing a link stage check). So what I want is much better defaults in Rust for error handling, and for it not to feel like a throwback to programming in C.

* The containers and algorithms in the standard library are still very 1.0. For example, I can’t see any way of getting BTreeMap (an always sorted map like std::map in C++) to do a closest find, and that is a huge use case for binary searched sorted containers. Instead you have to use a Vec, and use a slice of the whole container to run sort and run binary_search on the slice :(
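The Vec workaround looks something like the following minimal sketch. `closest` is a hypothetical helper of mine, it assumes the slice is already sorted, and it ignores integer overflow for extreme values.

```rust
// Hypothetical helper: nearest value to `target` in an already sorted slice.
fn closest(sorted: &[i32], target: i32) -> Option<i32> {
    if sorted.is_empty() {
        return None;
    }
    match sorted.binary_search(&target) {
        Ok(index) => Some(sorted[index]), // exact hit
        Err(index) => {
            // `index` is where `target` would be inserted, so the candidates
            // are the neighbours on either side of that position.
            if index == 0 {
                Some(sorted[0])
            } else if index == sorted.len() {
                Some(sorted[index - 1])
            } else {
                let below = sorted[index - 1];
                let above = sorted[index];
                if target - below <= above - target { Some(below) } else { Some(above) }
            }
        }
    }
}

fn main() {
    let mut values = vec![40, 10, 30, 20];
    values.sort(); // binary_search needs sorted input
    println!("{:?}", closest(&values, 26));
}
```

Later Rust releases stabilised `BTreeMap::range`, which as far as I can tell covers much of this use case directly.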

* I am absolutely astonished they undid making Rust a tasklet based runtime i.e. like stackless Python with a language native coroutine M:N threading model, especially with the excellent actor based concurrency model already built in there which should have been perfect. Originally their i/o was based on libuv, probably the leading portable asynchronous i/o library written in C, so all their i/o was also async. A full fat coroutine + all async i/o new systems programming language ticking every box for a C++ replacement - Rust v0.12 sounds great, doesn’t it?

Unfortunately they ended up canning the greenlet support because theirs were slower than kernel threads, which in turn demonstrates someone didn’t understand how to get a language compiler to generate stackless coroutines effectively (not surprising, the number of engineers wired the right way is not many in this world, but see http://www.reddit.com/r/rust/comments/2l0a4b/do_rust_web_servers_use_libuv_through_libgreen_or/ for more detail). And they canned the async i/o because libuv is “slow” (which it is, but only because it is single threaded, forces a malloc + free per async operation as the buffers must last until completion occurs, and imposes a penalty over synchronous i/o, see http://blog.kazuhooku.com/2014/09/the-reasons-why-i-stopped-using-libuv.html). That was a real shame - they should have taken the opportunity to replace libuv with something better (hint: ASIO + AFIO, and yes I know they are both C++, but Rust could do with much better C++ interop than the none it currently has) instead of canning always-async-everything in what could have been an amazing step up from C++ with most of the benefits of Erlang without the disadvantages of Erlang.

A huge missed opportunity I think, and sadly it looks like the ship has sailed on those decisions :(, and both untick pretty major boxes for some people including me. As it is a personal goal of mine to see AFIO become the asynchronous filesystem and file i/o implementation entering the C++ standard library to complement ASIO as the C++ Networking library entering in C++ 17, I can’t complain too much I suppose, it’s just it renders Rust as less likely to make C++ obsolete/encourage C++ to up its game.

Obviously with just 20 hours of experience the above is my opinion only and potentially wrong and misinformed, plus after a few weeks of more experience I may realise many flaws in Rust not realised yet (this usually is the case). Maybe I'll post here again on that.