"An Efficient Representation for Sparse Sets" by Preston Briggs & Linda Torczon dates from Rice in 1993 covers the case where the index can be a direct map (for small key universes - like, say, the range of `char` or `short`, where one can be sure of no collisions/need to handle them).
The idea is simple enough that it could have been a subsection in some seminal early database work, but I also lack a reference for that.
Storing some (or all) of the hash code in the hash index part as this article mentions is a good idea I had independently. This is especially true if you do linear probing on the hash table part and care about worst case latencies. (You will basically never have > 2 full cache misses if the CPU BP/prefetcher can recognize a linear scan.)
It wouldn't surprise me. My memory is of two things:
- I was into E in the 90s and this idea was new to me then.
- I vaguely remember reading someone online mentioning that E was the known prior art to Hettinger's reinvention of this. But this dim memory is not reliable, so I wanted to check it.
And yeah, storing the hashcode is a technique I saw in the wild earlier than I saw the insertion-ordered hashtable.
The idea is simple enough that it could have been a subsection in some seminal early database work, but I also lack a reference for that.
Storing some (or all) of the hash code in the hash index part as this article mentions is a good idea I had independently. This is especially true if you do linear probing on the hash table part and care about worst case latencies. (You will basically never have > 2 full cache misses if the CPU BP/prefetcher can recognize a linear scan.)