In-memory compression is another technique worth keeping in mind. If you can efficiently compress the in-memory representations of data you would otherwise have to hit disk or the network for, you can dramatically increase how much of it you can cache in memory, especially if you're aware of the structure and general patterns of the data.
Once I had to cache file paths, millions of them. I got a fairly easy 6x improvement by storing them in a Patricia trie, with each node's string stored as a range of UTF-8 encoded bytes and indexed by arrays of bitfields, i.e. bit-packed arrays. Child nodes were similarly indexed using bitfield arrays. Turning what would be a 32-bit (or worse, 64-bit) pointer into, say, a 28-bit integer really starts to add up when you have millions of them.
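To make the bit-packing concrete, here is a minimal C sketch of the kind of packed index array described above. The 28-bit width, the packed_array type, and the helper names are illustrative assumptions, not the commenter's actual code; in a real trie, arrays like these would hold child-node indices and (offset, length) ranges into a shared UTF-8 byte buffer.

```c
/* Minimal sketch of a bit-packed index array, assuming a 28-bit element
 * width: values are stored back to back inside uint64_t words, so each
 * index costs 3.5 bytes instead of a 4- or 8-byte pointer. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define WIDTH 28u                        /* bits per element (illustrative) */
#define MASK  ((1ULL << WIDTH) - 1)

typedef struct {
    uint64_t *words;                     /* packed storage */
    size_t    count;                     /* number of WIDTH-bit elements */
} packed_array;

static packed_array packed_new(size_t count)
{
    packed_array a;
    a.count = count;
    a.words = calloc((count * WIDTH + 63) / 64, sizeof(uint64_t));
    return a;
}

static uint32_t packed_get(const packed_array *a, size_t i)
{
    size_t bit = i * WIDTH, word = bit / 64, off = bit % 64;
    uint64_t v = a->words[word] >> off;
    if (off + WIDTH > 64)                /* element straddles two words */
        v |= a->words[word + 1] << (64 - off);
    return (uint32_t)(v & MASK);
}

static void packed_set(packed_array *a, size_t i, uint32_t value)
{
    size_t bit = i * WIDTH, word = bit / 64, off = bit % 64;
    uint64_t v = (uint64_t)value & MASK;
    a->words[word] = (a->words[word] & ~(MASK << off)) | (v << off);
    if (off + WIDTH > 64) {
        size_t spill = off + WIDTH - 64; /* bits that live in the next word */
        uint64_t hi = (1ULL << spill) - 1;
        a->words[word + 1] = (a->words[word + 1] & ~hi) | (v >> (64 - off));
    }
}

int main(void)
{
    packed_array children = packed_new(1000);  /* 1000 child links */
    packed_set(&children, 42, 123456789u);     /* value must fit in WIDTH bits */
    printf("%u\n", packed_get(&children, 42)); /* prints 123456789 */
    free(children.words);
    return 0;
}
```

At 28 bits per entry, a million child links come to about 3.5 MB of packed words versus 8 MB of 64-bit pointers, which is where the savings start to add up.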
The fact that Google uses these algorithms makes them somewhat special, in that one would assume Google put some thought and experimentation into choosing them. It also means they've had pretty good field-testing.
Any company building enterprise-level products puts thought and experimentation into choosing its algorithms. Google just advertises its research publicly as part of its branding.
Thanks for extracting bmz into a standalone project. I actually went searching for it after reading the article's comments last night and came across your link. One question I have is how up to date this version of bmz is. It says it's from a Hypertable commit dated Jan 25, 2009. Would I be able to find a newer version in Hypertable's source?