
The most egregious offenders here IMO are C++'s map and unordered_map. The spec constrains the possible implementations, and it's hard to implement them as anything better than a red-black tree or a poorly optimized hash table, respectively. And really the unordered one should be the default with the nice short name.

It looks like Rust is doing well on this. HashMap (the one everybody uses) is essentially a clone of Google's heavily-optimized C++ hash table:

https://abseil.io/blog/20180927-swisstables

... and if you need the collection to be sorted, their BTreeMap has much lower overhead than the C++ std::map, plus a big prominent note in its documentation saying that HashMap is usually the one you want.
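A minimal sketch of the two collections mentioned above (names and values are made up for illustration): HashMap for the common case, BTreeMap when you need the keys back in sorted order.

```rust
use std::collections::{BTreeMap, HashMap};

fn main() {
    // HashMap: the default choice; unordered, O(1) average lookup.
    let mut scores: HashMap<&str, i32> = HashMap::new();
    scores.insert("carol", 3);
    scores.insert("alice", 1);
    scores.insert("bob", 2);
    assert_eq!(scores.get("bob"), Some(&2));

    // BTreeMap: keeps keys sorted, so iteration yields them in order.
    let sorted: BTreeMap<&str, i32> =
        scores.iter().map(|(k, v)| (*k, *v)).collect();
    let keys: Vec<&str> = sorted.keys().copied().collect();
    assert_eq!(keys, ["alice", "bob", "carol"]);
}
```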

I'm impressed. (Haven't looked at Zig yet.)




> It looks like Rust is doing well on this. HashMap (the one everybody uses) is essentially a clone of Google's heavily-optimized C++ hash table

It didn't start out this way, but their spec was not so constrained as C++, so they were able to switch to this implementation in a backward-compatible way. In particular, Rust's std::collections::HashMap doesn't guarantee that keys and values have stable addresses across mutations of unrelated entries, so they can use open addressing rather than chaining. (And they did use open addressing from the beginning; the SwissTables-like "hashbrown" implementation was just a refinement of that.)

I think a lot of this is just that newer languages can learn from the mistakes of previous languages. It will be interesting to see how Rust (and other newer languages) are able to evolve when their standard library is 20+ years old and some parts don't age so well. (Although actually C++'s std::unordered_map only goes back to C++11 and still sucks...)

One part of Rust's answer I think is to keep the standard library small. Less there, less to screw up. Make it easy to pull in crates instead. Crates can supply the same functionality but can bump their major version relatively easily if the interface has to change. There are advantages and disadvantages to this approach...


The default hash map probably doesn't need the iterator invalidation constraints, but they are quite useful in certain cases.


> anything better than a red-black tree

Is there an equivalent data structure that has better worst-case time complexity than red-black tree?


There are plenty of algorithms that are also O(log n) that are much faster in practice, because of how caches work.


Could you name some please? I'm genuinely interested in learning of data structures with this property.


The simplest one is a B tree. It's like a self-balancing binary tree, except that you put more than two children in each node.

Same big-O, but faster by a big linear multiplier.
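A toy sketch of the idea (hand-built tree, invented names, insertion/balancing omitted): each node packs several sorted keys into one contiguous array, so a whole node fits in a few cache lines, and you pay far fewer pointer hops per lookup than in a binary tree.

```rust
// Toy B-tree-style node. A real implementation would use fixed-capacity
// arrays and handle splitting/merging; this only demonstrates search.
struct Node {
    keys: Vec<i32>,      // sorted keys within the node
    children: Vec<Node>, // empty for a leaf, else keys.len() + 1 children
}

fn contains(node: &Node, key: i32) -> bool {
    // Search within the node is cheap: the keys are contiguous in memory.
    match node.keys.binary_search(&key) {
        Ok(_) => true,
        Err(i) => {
            if node.children.is_empty() {
                false // leaf reached without finding the key
            } else {
                contains(&node.children[i], key) // descend into one child
            }
        }
    }
}

fn main() {
    // Root holds [10, 20]; three leaf children cover the gaps.
    let tree = Node {
        keys: vec![10, 20],
        children: vec![
            Node { keys: vec![1, 5], children: vec![] },
            Node { keys: vec![12, 17], children: vec![] },
            Node { keys: vec![25, 30], children: vec![] },
        ],
    };
    assert!(contains(&tree, 17));
    assert!(!contains(&tree, 4));
}
```

Rust's std BTreeMap is built on exactly this layout idea, with nodes sized to amortize the cost of a cache miss per level.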


If the data is small enough (up to roughly n = 100), a linear scan at O(n) often beats asymptotically faster algorithms.
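For example, on a small sorted slice a branch-predictable sequential scan often outruns binary search in practice, even though its big-O is worse. Both return the same answer (this sketch only checks agreement; actually measuring the crossover would need a benchmark):

```rust
// O(n) sequential search; trivially predictable branches, perfect prefetching.
fn linear_find(xs: &[i32], key: i32) -> Option<usize> {
    xs.iter().position(|&x| x == key)
}

fn main() {
    let xs: Vec<i32> = (0..100).map(|i| i * 2).collect(); // sorted evens, n = 100
    for key in [0, 42, 198] {
        // slice::binary_search is O(log n); both must agree on hits...
        assert_eq!(linear_find(&xs, key), xs.binary_search(&key).ok());
    }
    // ...and on misses.
    assert_eq!(linear_find(&xs, 7), None);
}
```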


What, C++'s spec constrains the implementation of standard data structures? This language has hit a new low for me.


Well, it defines the big-O of the data structure APIs. And to meet all the requirements, the implementation is usually very constrained.


My impression is that it's typically the iterator/reference stability requirements that lock implementations down, not the complexity requirements.


Oh yeah, that too.



