And especially having performant and actively maintained default choices built in. With C, as described in the post you responded to, you'll typically end up building a personal collection of dusty old libraries that work well enough for most of the time.
I think Rust projects will accumulate their own cruft over time, they are just younger. And the Rust ecosystem's churn (constant breakage, edition migrations, dependency hell in Cargo.lock) creates its own class of problems.
Either way, I would like to reiterate that the comparison is flawed at a more fundamental level because hash tables and B-trees are different data structures with different performance characteristics. O(1) average lookup vs O(log n) with cache-friendly ordered traversal. These are not interchangeable.
If BTreeMap outperformed his hash table, that is either because the hash table implementation was poor, or because the access patterns favored B-tree cache locality. Neither tells you anything about Rust vs C. It is a data structure benchmark.
More importantly, choosing between a hash table and a tree is an architectural decision with real trade-offs. It is not something that should be left to "whatever the standard library defaults to". If you are picking data structures without understanding why, that is on you, not on C's lack of a blessed standard library (BTW one size cannot fit all).
> If BTreeMap outperformed his hash table, that is either because the hash table implementation was poor, or because the access patterns favored B-tree cache locality. Neither tells you anything about Rust vs C. It is a data structure benchmark.
The specific thing it tells you about Rust vs C is that Rust makes using an optimized BTreeMap the default, much-easier thing to do when actually writing code. This is a developer experience feature rather than a raw language performance feature, since you could in principle write an equally-performant BTreeMap in C. But in practice Bryan Cantrill wasn't doing that.
> More importantly, choosing between a hash table and a tree is an architectural decision with real trade-offs. It is not something that should be left to "whatever the standard library defaults to". If you are picking data structures without understanding why, that is on you, not on C's lack of a blessed standard library (BTW one size cannot fit all).
The Rust standard library provides both a hash table and a b-tree map, and it's pretty easy to pull in a library that provides a more specialized map data structure if you need one for something (because in general it's easier to pull in any library for anything in a Rust project set up the default way). Again, a better developer experience that leads to developers making better decisions writing their software, rather than a fundamentally more performant language.
> the Rust ecosystem's churn (constant breakage, edition migrations, dependency hell in Cargo.lock) creates its own class of problems.
What churn? Rust hasn't broken compatibility since 1.0, over a decade ago. These days it feels like rust changes slower than C and C++.
> Either way, I would like to reiterate that the comparison is flawed at a more fundamental level because hash tables and B-trees are different data structures with different performance characteristics. O(1) average lookup vs O(log n) with cache-friendly ordered traversal. These are not interchangeable.
They're mostly interchangeable when used as a map! In rust code, in most cases you can just replace HashMap with BTreeMap. In practice, O(log n) and O(1) are very similar bounds owing to how slowly log(n) grows with respect to n. Cache locality often matters much more than a O(log n) factor in your algorithm.
If you read the actual article, you'll see that Cantrill benchmarked his library using rust's b-tree and hash table implementation. Both maps outperformed his C based hash table implementation.
> Neither tells you anything about Rust vs C.
It tells you rust's standard library has a faster hash map implementation than Bryan Cantrill. If you need a hash table, you're almost certainly better off using rust than rolling your own in C.
One point of clarification: the C version does not have (and never had) a hash table; the C version had a BST (an AVL tree). Moreover, the "Rust hash table implementation" is in fact still B-tree based; the hash table described in the post is a much more nuanced implementation detail. The hash table implementation has really nothing to do with the C/Rust delta -- which is entirely a BST/B-tree delta. As I described in the post, implementing a B-tree in C is arduous -- and implementing a B-tree in C as a library would be absolutely brutal (because a B-tree relies on moving data). As I said in the piece, the memory safety of Rust is very much affecting performance here: it allows for the much more efficient data structure implementation.
I wouldn't consider implementing a B-tree in C any more "arduous" than implementing any other notable container/algorithm in C, nor would making a library be "brutal" as moving data really isn't an issue. Libraries are available if you need them.
Quite frankly, writing the same in Rust seems far, far more "arduous", and you'd only realistically be writing something using BTreeMap because someone else did the work for you.
However, being right there in std makes use much easier than searching around for an equivalent library to pull into your C codebase. That's the benefit.
I don't often do this, but I'm sorry, you don't know what you're talking about. If you bother to try looking for B-tree libraries in C, you will quickly find that they are either (1) the equivalent of undergraduate projects that are not used in production systems or (2) woven pretty deeply into a database implementation. This is because the memory model of C makes a B-tree library nasty: it will either be low performance or a very complicated interface -- and it is because moving data is emphatically an issue.
Can you mention 3 cases of breakage the language has had in the last, let's say, 5 years? I've had colleagues in different companies responsible for updating company-wide language toolchains tell me that in their experience updating Rust was the easiest of their bunch.
> edition migrations
One can write Rust 2015 code today and have access to pretty much every feature from the latest version. Upgrading editions (at your leisure) can be done most of the time just by using rustfix, but even if done by hand, the idea that they are onerous is overstating their effect.
Last time I checked there were <100 checks in the entire compiler for edition gates, with many checks corresponding to the same feature. Adding support for new features that doesn't affect prior editions and by extension existing code (like adding async await keywords, or support for k# and r# tokens) is precisely the point of editions.
And especially having performant and actively maintained default choices built in. With C, as described in the post you responded to, you'll typically end up building a personal collection of dusty old libraries that work well enough for most of the time.