Hacker News new | past | comments | ask | show | jobs | submit login

I would make the point that C++ unordered_map is not remotely optimized for this use case, so it is not surprising that it is less efficient. You normally would not use it for something like this. In fact, C++ unordered_map is not tunable enough to offer excellent performance for most use cases. It is a very generic implementation of an unordered_map.

In any code base where performance matters, most important use cases for unordered_map would be replaced with a purpose-built implementation (like with GraphEngine). Not only is it pretty trivial to design and implement custom maps for various use cases, you can usually ensure at least a 2X improvement across all relevant metrics relative to the standard library unordered_map.

Based on the relative metrics for GraphEngine and C++ unordered_map in this article, I would expect a purpose-built C++ implementation to be significantly faster than GraphEngine. I know that is not what is being measured but it in real-world C++ implementations, it is how most properly engineered architectures would do it.




Indeed. std::unordered_* implementations are hogtied by the interface and complexity requirements in the C++ standard (it basically mandates chained buckets), and more generally the need to work with almost any hash function (linear probing would suck with crappy hash functions for example).

Jo Muñoz wrote a cool blog series on the current implementations in circulation, the flaws with the interface, and how his implementation (Boost multi_index) made further optimisations despite the constraints.

Don't miss the other two parts of the series - look under October on the right. He did extensive bookmarks in November as well.

http://bannalia.blogspot.co.uk/2013/10/implementation-of-c-u...


I bookmarked those posts a while ago in the hopes that I'll get a chance to rewrite VC's unordered_map in the future. We've taken the first step towards doing so by deprecating the non-Standard hash_map (which shares its guts with the Standard unordered_map) in 2015, so I can remove it outright at the beginning of the next development cycle. Then the unordered family can be changed without fear of breaking the hash family.


Thanks for what you do STL (For those that don't know, Stephan maintains Microsofts C++ standard library). Even though I don't develop against MSVC at all, improving the state of C++ at Microsoft makes life easier for all of us in terms of portability and bringing nice new things in to common use.

Thank you also for maintaining your GCC distro for Windows[0] and the the many great presentations you have given at CppCon, Channel 9, Going Native etc.

[0] http://nuwen.net/mingw.html


You're welcome! And even if you don't use VC's STL, hopefully make_unique and the other things I've designed will be helpful.


And for making this possible

    vector<int> v = {1,2,3,4,5}; 
    cout << accumulate(begin(v), end(v), 1, multiplies<>());


I'm curious..

Is it possible to use VIsual C++'s STL inside of Linux?


Nope, our sources assume they're running on Windows.


You might also consult the entries given below, which take into account some late improvements in Boost.MultiIndex not present at the beginning of the series:

http://bannalia.blogspot.com/2014/01/a-better-hash-table.htm... http://bannalia.blogspot.com/2014/01/a-better-hash-table-gcc... http://bannalia.blogspot.com/2014/01/a-better-hash-table-cla...


That's an admirable sentiment, but aren't your hands tied by the promises made by the standard? In particular the addresses of elements are guaranteed to be stable, which would seem to prevent an implementor from relocating them to achieve more cache-friendly layout.

(Of course, you would want to do as well as possible within the given bounds.)


Yeah, unordered_meow is always going to be a pretty crummy hash table. But our implementation could be significantly improved while still obeying the Standard.


Fair enough! Good luck with your rewrite.


While Graph Engine definitely looks useful, I agree that the comparison is unfair. GE is an optimized implementation while associative containers (unordered_*) in C++ are more generalized implementations. The above blog provided clearly explains optimized implementations of associative containers in C++. It would be interesting to see how these compare with GE.


Please note that this test also only captures a single profile of Graph Engine. The cell entry lookup table and the MM module are designed to operate in a more complex environment and not optimized towards the direction of this micro benchmark(concurrency, logging, a strong type system, to name a few). Of course, it would be very interesting to test against other options, especially those concurrent dictionaries combined with custom allocators.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: