Hacker News new | past | comments | ask | show | jobs | submit login

To address the last point, there's a tradeoff, as always. There's two competing forces with a hash function: speed and distribution. We want the hash to compute as quickly as possible, but want as close to equal probability of each bucket being chosen.

Assuming you're looking for a 32-bit hash value, my lazy whiteboard hash function would probably be something like:

- for strings, xor the bytes a word at a time.

- for ints, just use the 32-bit value

- for objects, let them provide their own, and if none is provided just use their memory address (this assumes that the objects don't have equality operators... that gets more complicated. If an object implements an equality operator, it also needs to implement a hash operator.)

These rules are by no means optimal, but they'd probably do as good enough for something that's not super high performance.

It also bears keeping in mind, I learned C twenty years ago, at the tender age of 12. Self-taught from the book that came with the compiler floppy[1]. We wouldn't have Internet at home for another 2 years, so I would occasionally persuade my parents to take me to a local Internet cafe for an hour to slurp up as much as I could at 28.8kbps and scribble it down. Bit manipulation operators are pretty deep in my soul at this point.

---

So going back to the original discussion... that came pretty natural to me. "How would I implement a hash function?" The brain whirrs a little bit and says "hey, here's some things to think about". I am 100% aware that this isn't typical, and do not expect others to have the same experience.

In fact, if I were hiring someone, I would be totally satisfied with an answer of "I'm not sure how to implement a hash function off the top of my head. It takes the object/value that we're going to insert into the hash table and turns it into a number that we can use to index into the appropriate bucket. Can we look one up?"

Hell, I'd probably be satisfied with "Here's when I'd use a hash table, and here's when I'd use an array. I've never implemented a hash table but I'd love to know how it works under the hood."

For me when hiring, the two biggest things I'm looking for are the ability to decompose problems into manageable parts, and a curiosity about how everything works. Those two things together tend to make any other knowledge gaps totally manageable; I'm not omniscient, but my desire to learn as much as I can about anything I don't understand can sometimes give off that illusion :P.

[1] http://www.mixsoftware.com/product/powerc.htm




Well, here's the thing - I've been on a lot of interviews, and the majority of them wouldn't be satisfied with the answers you'd be satisfied with. If interviewers were the way you've described, I don't think the whole thing would be such an issue.

Overall, yeah, interviewers want to see that hash map largely written, often at a white board. Minor errors and bits of pseudocode are ok, but it should pretty much work.

YMMV, of course, but it isn't just me.

http://steve-yegge.blogspot.com/2008/03/get-that-job-at-goog...

"Hashtables: hashtables are arguably the single most important data structure known to mankind. You absolutely have to know how they work. Again, it's like one chapter in one data structures book, so just go read about them. You should be able to implement one using only arrays in your favorite language, in about the space of one interview."

Of course, Steve Yegge isn't the final word on prepping for google interviews, and I only had one experience there myself, but yeah, I'd say he's pretty spot on here (in fact, the google recruiter who arranged my interviews forwarded me the Yegge blog post, though I'd already read it earlier).


for ints, just use the 32-bit value

I assume that this should be value modulo m, where m is actual size of the backing array.

for strings, xor the bytes a word at a time.

Well, for strings the same, that is, final output modulo m.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: