
> What's your procedure for adding new nodes to increase capacity? Would you have to take your redis cluster offline to redistribute data from all nodes over the new keyspace?

It's not easy, actually. The short answer is that we don't add capacity (because we've only needed to once, and we have tons of room to grow now). The long answer is that I have a switch I can flip that starts incrementing/adding data to a whole new cluster of Redis nodes while still updating the old ones. We can then backfill all data to the new nodes, and when they're set up, flip a switch to read/write only from/to the new nodes. It may sound a bit weird, mostly because it is. Moving sets of random keys from one node to another while you're expecting live reads/writes is a huge pain, so I just punted on the problem.

(I elaborated a little more in a comment on my post: http://bretthoerner.com/2011/2/21/redis-at-disqus/#comment-1...)
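
Roughly the shape of that switch, if it helps. This is a heavily simplified sketch, not our actual code: the flags would really live in settings, and each "cluster" here is really a set of shards.

    import redis

    OLD = redis.Redis('redis-old')    # stand-in for the old cluster
    NEW = redis.Redis('redis-new')    # stand-in for the new cluster

    DUAL_WRITE = True        # phase 1: start updating both clusters
    READ_FROM_NEW = False    # phase 2: flip once the backfill has caught up

    def incr_stat(key, amount=1):
        OLD.incrby(key, amount)
        if DUAL_WRITE:
            NEW.incrby(key, amount)

    def get_stat(key):
        return (NEW if READ_FROM_NEW else OLD).get(key)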

> wonder if consistent hashing might be a bigger win in the long run.

I'm not sure that it's applicable. Consistent hashing is really handy for caches when you don't want everything to miss as soon as 1 of N servers drops out of the ring. You have to (imo) think of each Redis shard as a "real" DB. If your master PostgreSQL instance dies, you don't just start reading from another random instance and returning "None" for all of your queries. If a shard goes down, you either depend on an up-to-date read slave or nothing at all. I'm not sure how consistent hashing helps when adding nodes to a "real" DB, either. Say Node1 holds all of the data for CNN; you add a new node to the ring and now some % of CNN keys go to that new node. Now all of your writes are updating new/empty keys and all of your reads are reading those new/empty keys. How does consistent hashing help with the migration?

(I'm really asking, because if I'm missing something I'd love to know.)
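
(For the cache case, the property I mean is something like this: drop one of four servers from a toy ring and only that server's slice of the keyspace remaps; everything else still hits. Throwaway sketch, nothing to do with our actual setup:)

    import bisect, hashlib

    def hval(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def build_ring(nodes, vnodes=100):
        return sorted((hval('%s#%d' % (n, i)), n)
                      for n in nodes for i in range(vnodes))

    def owner(ring, key):
        i = bisect.bisect(ring, (hval(key),)) % len(ring)
        return ring[i][1]

    keys = ['cache:%d' % i for i in range(10000)]
    full = build_ring(['a', 'b', 'c', 'd'])
    degraded = build_ring(['a', 'b', 'c'])    # server 'd' dropped out

    moved = sum(1 for k in keys if owner(full, k) != owner(degraded, k))
    print(moved)    # ~2500: only d's slice misses, the rest still hit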




Thanks for the info!

> I'm not sure how consistent hashing helps when adding nodes to a "real" DB, either ... How does consistent hashing help with the migration?

Instead of backfilling all data to an entirely new cluster, you'd only backfill the small amount of data from the keyspace "stolen" by the new node, and expire the keys at the original locations. If you use M replicas of each node around the ring (typically M << N), you only involve M+1 nodes in the migration process.

I'm still experimenting with this idea myself, and would also love to know if anyone's tried something similar with data store sharding (not just with caching).
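
Concretely, the kind of migration I'm imagining looks something like this. It's only a sketch: it hashes on the full key for simplicity (you'd really hash on whatever you shard by), pretends the stats are plain counters rather than zsets, and all of the host names are made up.

    import bisect, hashlib
    import redis

    def hval(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def build_ring(nodes, vnodes=100):
        return sorted((hval('%s#%d' % (n, i)), n)
                      for n in nodes for i in range(vnodes))

    def owner(ring, key):
        i = bisect.bisect(ring, (hval(key),)) % len(ring)
        return ring[i][1]

    old_shards = {'shard1': redis.Redis('redis1'),
                  'shard2': redis.Redis('redis2'),
                  'shard3': redis.Redis('redis3')}
    new_name, new_conn = 'shard4', redis.Redis('redis4')

    old_ring = build_ring(old_shards)
    new_ring = build_ring(list(old_shards) + [new_name])

    GRACE = 7 * 24 * 3600    # keep the old copies around for a week, just in case

    for name, conn in old_shards.items():
        for raw in conn.scan_iter():
            key = raw.decode()
            if owner(new_ring, key) != new_name:
                continue                      # not stolen, leave it alone
            value = conn.get(key)             # pretending stats are plain counters
            if value is not None:
                new_conn.set(key, value)      # backfill onto the new node
            conn.expire(key, GRACE)           # and expire the original copy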


> you'd only backfill the small amount of data from the keyspace "stolen" by the new node

I think this is the part I'm not so sure about.

Say I have 100 stats, and of course each stat is per forum, per day (going back from 1 day to ... 5 years?). How do I know what keys were just "stolen"? Do I have my new-node code hash every possible key (all stats for all forums for all hours for all time) to see which might go to that node? And then reverse each of those keys to figure out what it "means" so I can backfill it? (I need to do that follow-up post, as the way our data 'flows' in is relevant here)


> How do I know what keys were just "stolen"? Do I have my new-node code hash every possible key (all stats for all forums for all hours for all time) to see which might go to that node? And then reverse each of those keys to figure out what it "means" so I can backfill it?

Right, you'd have to iterate through all zset elements on the existing node, applying the consistent hash function to decide whether or not the element will be stolen by the new node.

If the element itself doesn't contain the user id (or whatever you shard on), all bets are off.
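
Something like the sketch below is what I have in mind; it only works because the Redis key itself carries the thing you shard on. The 'stats:<forum>:<stat>:<day>' layout is made up, and ring_owner stands in for whatever consistent-hash lookup you already use.

    import redis

    def stolen_keys(conn, ring_owner, new_node, pattern='stats:*'):
        """Walk one existing shard and collect the keys the new node now owns.

        ring_owner is your consistent-hash lookup (shard key -> node name).
        The whole trick is that the forum id is recoverable from the key.
        """
        stolen = []
        for raw in conn.scan_iter(match=pattern):
            key = raw.decode()
            forum_id = key.split(':')[1]      # 'stats:<forum>:<stat>:<day>'
            if ring_owner(forum_id) == new_node:
                stolen.append(key)
        return stolen

    # e.g.: stolen_keys(redis.Redis('redis-shard1'), my_ring.node_for, 'shard4')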


In a new post from antirez directly addressing your scenario, he suggests starting with a fixed but large number of shard instances on the same machine, then migrating the instances off to other machines using master/slave replication as needed.

http://antirez.com/post/redis-presharding.html
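
The gist, as I read it (rough sketch, hosts/ports invented): pick a fixed, generously large number of instances up front, hash keys across them, and add capacity later by replicating whole instances to new machines and repointing their entries, never by changing the count.

    import hashlib
    import redis

    # Fixed once, forever. Start with e.g. 32 instances on one box; to add
    # capacity, replicate some instances to a new machine (SLAVEOF, then
    # promote) and update their host/port entries here. N never changes.
    SHARDS = [('10.0.0.1', 6379 + i) for i in range(32)]

    _conns = {}

    def shard_for(key):
        n = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(SHARDS)
        host, port = SHARDS[n]
        if (host, port) not in _conns:
            _conns[(host, port)] = redis.Redis(host=host, port=port)
        return _conns[(host, port)]

    key = 'stats:cnn:2011-02-21:comments'
    shard_for(key).incr(key)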





