Hacker News
Redis at Disqus (bretthoerner.com)
153 points by tswicegood on Feb 21, 2011 | 41 comments



Thank you for this post. This is what we need as a community to improve: use cases, and useful criticisms when things don't work well, so that we can find new strategies.

It's cool to see that Redis works well for many things, but it will be even cooler if diskstore, or any other approach, can make Redis more accessible even when the performance gain of being in RAM is not enough to justify the costs for some kinds of applications.

We are also working on cluster and faster .rdb persistence. So there are interesting things going on, and fortunately we will have something new and stable in a few hours, as 2.2.0 stable is going live very soon :)


Yeah, we can't wait for diskstore. If you imagine the analytics use case for a moment: super high speed is great, but I'd imagine 99% of the read requests don't ask for anything older than a month. Older data could easily be pushed out to disk, saving us a lot of RAM. For now we can still operate pretty easily in RAM (we have a few machines dedicated to analytics and they're just storing counters or sets of small values), but it'd be great to know we can grow a lot more without needing to put more of our shards on their own physical machines.

We already run the latest from the 2.2 branch; I can/should go into how easy that is in a followup.


Yes, it can be a good use case, but the new set of design decisions has new limitations; nothing is free :)

Example: with VM there were a number of problems (keys must stay in memory, persistence is super slow, and so forth), but there was no speed penalty if you always wrote against a small working set.

Instead, with diskstore the data lives on disk. RAM is just a cache, and if you configure the store with 'cache-flush-delay 60' you are telling Redis that every dirty key should be flushed to disk within at most 60 seconds.
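In redis.conf terms, that setup might be sketched like this (directive names follow the experimental diskstore branch and may change; values are illustrative):

```
# diskstore sketch: data lives on disk, RAM is only a cache
diskstore-enabled yes
diskstore-path /var/lib/redis/diskstore
cache-max-memory 2gb
cache-flush-delay 60   # any dirty key is flushed to disk within 60 seconds
```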

If there are many writes it is easy to hit the I/O write speed limit, and the system starts to be I/O bound.

So diskstore is surely a solution for big-data problems where writes are rare compared to reads. If writes are very frequent, you have to consider the total I/O.

The ideal solution in your scenario is IMHO to keep the data for the latest N hours in an in-memory Redis instance, and move the historical data into a diskstore-enabled instance. This way you get a full win: the diskstore instance will be used only for reads, so it will provide the maximum benefit, while the in-memory instance will have the predictable, low-latency characteristics of the default Redis configuration.


I'm thinking about doing a second post with some actual code (some parts may be specific to Python, Django, and Celery) if anyone is interested.


This was a great write up. Redis and the community would benefit from having more writeups like this, detailing different successful real world ways to use it. One can read the Redis documentation and imagine many uses for its different data structures and commands, but to read about tried-and-working practices is fantastic. Of particular use, in my opinion, is seeing key naming/organization schemes that people are using effectively.

I personally would love to see the code for these and other use cases.


Here is a blog post (not mine) that I keep returning to that contains a pile of links to different Redis uses/use cases. http://www.paperplanes.de/2010/2/16/a_collection_of_redis_us...


I'm definitely interested.

Also, out of curiosity, what do you use to render the actual charts? I'm working on an analytics package and can't decide on a charting engine that is clientside and reasonably performant.


We went through several different iterations of the line charts. Initially, I tried using SVG via Raphael, but it turned out to be too slow. Because it's possible to have a significant number of data points, manipulating and changing the SVG markup was causing too much of a performance hit.

Eventually we settled on using Flot, which is a canvas-based charting solution. We made some changes to the core Flot code with some plugins to do things like change line color and fill on hover, but overall vanilla Flot served 90% of our needs.

Because canvas is essentially a bitmap, the number of data points has much less impact on the drawing layer.

Of note, we still use Raphael for pie charts because, well, they look better and aren't affected by large numbers of data points.


Thanks for the info!

Flot is what I settled on, too, but performance with a lot of datapoints was not so great. Maybe it was just too much data. Also, the timeline view was really annoying, but JS' Date functions are to blame for that. But with Flash out of the question and Raphael even slower, I suppose it's the best option for now.


How many datapoints are you using? One thing that I've done (though in the end we didn't realistically need it) was use a resolution function to tune the datapoints to fit the canvas width.

If your canvas is only, say, 400 pixels wide, then any time series with more than 400 datapoints will lose some of them -- there simply aren't enough pixels to display them all accurately. As such, you can use a resolution function to reduce them down to 400.
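A resolution function like that can be sketched in a few lines (a hypothetical helper; it averages each bucket, but max, min, or median are equally valid choices depending on what the chart should emphasize):

```python
def downsample(points, width):
    """Reduce a time series to at most `width` points by averaging
    evenly sized buckets.  Illustrative sketch, not production code."""
    if len(points) <= width:
        return points
    bucket = len(points) / width  # fractional bucket size keeps buckets even
    out = []
    for i in range(width):
        chunk = points[int(i * bucket):int((i + 1) * bucket)]
        out.append(sum(chunk) / len(chunk))
    return out
```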


I have way more; I downsample to temporary mongodb collections and display the closest resolution for the available width.

I also found it hard to decide on a resolution function for analytics. Do we show the maximum of a time range? The average? Median? Min and max?


A mixture of (mostly) Flot with a touch of Raphael. I am but a lowly backend developer, I'll try to get my colleague dz to comment on the hows and whys.


Sure, although at least for me, these posts are most useful for:

1. showing the different kinds of applications people are using redis for;

2. as individual anecdotes that are slowly-but-surely adding up to real data.


Agreed, those are the same reasons I was happy to see the posts that came well before mine. People that know me well know I will poke holes in everything. We didn't decide on Redis (over tried and true PostgreSQL) for nothing. (We have also used and do use other NoSQL when truly applicable.) I love seeing smart use cases for new DBs.


What Python client are you using? I have heard performance mentioned a few times.

If I remember correctly, the guys at Bump wrote their own client in a different language.


Yes please, I love reading posts like this... it motivates me to learn more about this software!


Definitely interested in both code and your Redis config files.


The most interesting thing about Redis is that it removes the impedance mismatch between in code data structures and the data store. It is doing for data stores what server side javascript does for AJAX applications. OO persistence was the first step in this direction but Redis nails the real world use cases a lot better.


Absolutely. We don't (currently) use it, but amix's redis_wrap is an awesome example if you use Python. It basically makes interacting with Redis look just like dealing with normal built-in data structures. I'm sure the equivalent exists, or is just as easy to write, in any language: https://github.com/amix/redis_wrap
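The idea can be sketched in a few lines (this is a toy illustration of the same concept, not redis_wrap's actual API; `client` is assumed to be a redis-py-style object):

```python
class RedisList:
    """Toy wrapper that makes a Redis list feel like a Python list.

    Sketch of the redis_wrap idea only, not its actual API; `client`
    must provide rpush/llen/lindex/lrange, like redis-py does."""

    def __init__(self, name, client):
        self.name = name
        self.r = client

    def append(self, value):
        self.r.rpush(self.name, value)

    def __len__(self):
        return self.r.llen(self.name)

    def __getitem__(self, i):
        item = self.r.lindex(self.name, i)
        if item is None:
            raise IndexError(i)
        return item

    def __iter__(self):
        return iter(self.r.lrange(self.name, 0, -1))
```

With a connected client, `tasks = RedisList('tasks', redis.Redis())` then `tasks.append('send-email')` and `len(tasks)` behave like their plain-list counterparts.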


HN folks: would any of you be interested in a 'Getting Started With Redis' screencast?

Edit: if it were non-free :)


Yes, I'd pay. Especially if it was Ruby/Rails based too.


We have been using it for sessions (amongst tons of other stuff) at Shopify for half a year and found that we didn't have problems with increasing memory after we started setting expiration bits on the session keys.


When do you expire your sessions?

If I recall correctly, Django defaults to a two week expiry after the last session change. Remember that Disqus is a rather large network. We have many millions of "active" sessions by the definition above. The load of requests/second and number of sessions in VM really brought out the issues we ran into. I imagine it'd work just fine for sessions for most sites - but I still lean toward projects that specifically aim to be disk-backed k/v stores (membase, etc).


When aggregating stats in this manner (by Day) how do people deal with Time Zones?

For instance, if I have one user in, say, NZST, their "Tuesday, 22 February" is still "Monday, 21 February" in PST - and the real issue is that the buckets are off. So you can't just store in UTC and then move it by whatever timezone offset, as then you are grabbing different "buckets".

I don't think that explanation is very clear (I had to draw a diagram to figure it out myself). Hopefully someone smarter than I am can figure it out anyway.

We've worked around it by just storing hour aggregates, but I'm interested in case someone else has a smart solution :)


You're right, when your finest granularity is "per day" you lose any real sense of what a day means to the user (and when it starts and ends).

Right now we don't have a solution. As it turns out (I guess) our users don't mind. In the future we want to provide certain (or all) stats also in the "last 24 hours" (rolling) time frame, which will help. One benefit of our Redis-powered analytics is that they're live. Even if your idea of a "day" is different than ours, you can see the count/totals/averages/etc updating in realtime (relative to the "day" we decide on). So users can get instant feedback if something is happening, but for the most part only care about day over day stats (which make the TZ matter a lot less).


The simple solution is to store everything in UTC, but that means the smallest resolution you can store is 30 minutes (some time zones have a :30 offset, and that's ignoring the Chatham Islands at :45 - http://en.wikipedia.org/wiki/Time_zones). It's really annoying, and I have wished many times that everyone would just use UTC.


Flickr decided that UTC would be the default for their stats. As long as you stick to it, it's not that big of a problem.


Not if you're trying to save RAM (and the operations to fetch said data) by storing stats per day. You'd need all stats to be per-hour in order for it to work for any timezone.


If you store by day UTC, you'd need two fetches to get a day in some other time zone. But if you store by hour, you need 24 fetches.


They can't in their (Disqus) case because they're aggregating all the stats per day into a single value. I guess they could do it 'per account TZ' since the account name is in the key, but that means TZ calc on each write (not that that will make a huge difference in perf).

For a generic solution with easy TZ calc, you need to aggregate your stats into hourly values instead (or half hourly if you care about those wacky non-aligned timezones). The increased fetches don't matter because you can just mget them.
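The hourly-bucket-plus-MGET approach can be sketched like this (key scheme and helper names are hypothetical, not Disqus's actual naming; `client` is assumed to be a redis-py-style object):

```python
from datetime import datetime, timedelta

def hourly_keys(stat, forum_id, day_start):
    """Build the 24 hourly counter keys covering one 'day' that begins
    at `day_start` (a datetime already shifted into the viewer's TZ).
    Key scheme is illustrative only."""
    return [
        'stats:%s:%s:%s' % (stat, forum_id,
                            (day_start + timedelta(hours=h)).strftime('%Y%m%d%H'))
        for h in range(24)
    ]

def day_total(client, stat, forum_id, day_start):
    """Sum one per-day stat from its hourly buckets with a single MGET;
    missing hours simply come back as None and count as zero."""
    values = client.mget(hourly_keys(stat, forum_id, day_start))
    return sum(int(v) for v in values if v is not None)
```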


Sharding: "We just take the modulo of the owning user's ID against the number of nodes we have to decide which node to read/write from/to."

What's your procedure for adding new nodes to increase capacity? Would you have to take your redis cluster offline to redistribute data from all nodes over the new keyspace?

I like the simplicity of your approach, but wonder if consistent hashing might be a bigger win in the long run.
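For reference, the quoted scheme amounts to a one-liner (the node list here is illustrative, e.g. a list of connected redis-py clients):

```python
def shard_for(user_id, nodes):
    """Modulo sharding as quoted above: the owning user's ID picks the
    node.  `nodes` is illustrative, e.g. a list of redis-py clients."""
    return nodes[user_id % len(nodes)]

# e.g. nodes = [redis.Redis(host=h) for h in ('redis1', 'redis2', 'redis3')]
```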


> What's your procedure for adding new nodes to increase capacity? Would you have to take your redis cluster offline to redistribute data from all nodes over the new keyspace?

It's not easy, actually. The short answer is that we don't add capacity (because we've only needed to once, and we have tons of room to grow now). The long answer is that I have a switch I can flip that starts incrementing/adding data to a whole new cluster of Redis nodes while it still updates the old ones. We can then backfill all data to the new nodes and, when they're set up, flip a switch to read/write only from/to the new nodes. It may sound a bit weird, mostly because it is. Moving sets of random keys from one node to another while you're expecting live reads/writes is a huge pain, so I just punted on the problem.

(I elaborated a little more in a comment on my post: http://bretthoerner.com/2011/2/21/redis-at-disqus/#comment-1...)

> wonder if consistent hashing might be a bigger win in the long run.

I'm not sure that it's applicable. Consistent hashing is really handy for caches when you don't want everything to miss as soon as 1/N servers drops out of the ring. You have to (imo) think of each Redis shard as a "real" DB. If your master PostgreSQL instance dies, you don't just start reading from another random instance and returning "None" for all of your queries. If a shard goes down, you either depend on an up-to-date read slave or nothing at all. I'm not sure how consistent hashing helps when adding nodes to a "real" DB, either. Say Node1 holds all of the data for CNN; you add a new node to the ring and now some % of CNN keys go to that new node. Now all of your writes are updating new/empty keys and all of your reads are reading those new/empty keys. How does consistent hashing help with the migration?

(I'm really asking, because if I'm missing something I'd love to know.)


Thanks for the info!

> I'm not sure how consistent hashing helps when adding nodes to a "real" DB, either ... How does consistent hashing help with the migration?

Instead of backfilling all data to an entirely new cluster, you'd only backfill the small amount of data from the keyspace "stolen" by the new node, and expire the keys at the original locations. If you use M replicas of each node around the ring (typically M << N) you only involve M+1 nodes in the migration process.

I'm still experimenting with this idea myself, and would also love to know if anyone's tried something similar with data store sharding (not just with caching).
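To make the "stolen keyspace" idea concrete, here is a minimal consistent-hash ring with virtual replicas (a toy sketch, not a production implementation): when a node is added, only the keys whose ring point now falls closest to one of the new node's replicas move, and they all move *to* the new node.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual replicas (toy sketch)."""

    def __init__(self, nodes, replicas=16):
        self.replicas = replicas
        self.ring = []  # sorted list of (point, node) tuples
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Each node owns `replicas` points spread around the ring.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash('%s:%d' % (node, i)), node))

    def node_for(self, key):
        # Walk clockwise to the first replica at or after the key's point,
        # wrapping around the end of the ring.
        point = self._hash(key)
        idx = bisect.bisect(self.ring, (point,)) % len(self.ring)
        return self.ring[idx][1]
```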


> you'd only backfill the small amount of data from the keyspace "stolen" by the new node

I think this is the part I'm not so sure about.

Say I have 100 stats, and of course each stat is per forum, per day (going back from 1 day to ... 5 years?). How do I know which keys were just "stolen"? Do I have my new-node code hash every possible key (all stats for all forums for all hours for all time) to see which might go to that node? And then it reverses that key to know what it "means" to backfill it? (I need to do that followup post, as the way our data 'flows' in is applicable here)


> How do I know what keys were just "stolen"? Do I have my new-node code hash every possible key (all stats for all forums for all hours for all time) to see which might go to that node? And then it reverses that key to know what it "means" to backfill it?

Right, you'd have to iterate through all zset elements on the existing node, applying the consistent hash function to decide whether or not each element will be stolen by the new node.

If the element itself doesn't contain user id (or whatever you shard on) all bets are off.


In a new post from antirez directly addressing your scenario, he suggests starting with a fixed but large number of shard instances on the same machine, then migrating the instances off to other machines using master / slave replication as needed.

http://antirez.com/post/redis-presharding.html


> While the VM backend helped, we found that it still wouldn't stay within the bounds we set, and would continually grow no matter what we set. We did report the issue but never came to a good solution in time. For example, we could give Redis an entire 12GB server and set the VM to 4GB, and given enough time (under high load, mind you) it would climb well above 12GB and start to swap, more or less killing our site.

We came across this same issue while implementing a Redis-based solution to improve the scalability of our own systems. Someone filed an issue reporting this: http://code.google.com/p/redis/issues/detail?id=248.

Basically, antirez confirms that Redis does a poor job estimating the amount of memory used, so you'll need to adjust your redis.conf VM settings to take this into account. For anybody relying on Redis's VM, I'd recommend writing a script to load your server with realistic data structures with sizes you expect in production. You can then profile Redis's configured memory usage vs. the actual memory usage point at which swapping starts occurring, and set your redis.conf according to the limitations of your box. For example, we run Redis 2.0.2, and using list structures with ~50 items of moderate size, we found configuring Redis to use 400MB actually resulted in it using up to 1.4GB before swapping. We configure our settings to take this into account. Mind you, this may all change with diskstore, and later versions of Redis which are supposed to be more memory efficient.
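A load script for that kind of profiling might look like the following sketch (key names, sizes, and counts are made up; pass a connected redis-py client and compare the returned figure against the box's actual RSS to find the real swap point):

```python
def load_test(client, lists=10000, items=50, payload=100):
    """Fill Redis with list structures shaped like production data, then
    return its self-reported memory usage.

    Hypothetical harness: compare the returned used_memory against your
    configured vm-max-memory and against RSS on the box itself."""
    value = 'x' * payload
    for i in range(lists):
        key = 'loadtest:%d' % i
        for _ in range(items):
            client.rpush(key, value)
    return int(client.info()['used_memory'])
```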

For those curious, our Redis-based solution is helping us scale some write-heavy activities quite nicely, and has been running stably.


So frustrating to see all these people doing cool things with Redis and not having people free to do that stuff here. :) Any Redis hackers looking for a job or even some contract time? Big advantage is all the work will be open source and able to be shared and blogged.


I'm the author. What about a junior developer with an interest in Redis? This stuff isn't hard; I have recommendations. E-mail in profile. :)


I started experimenting with Redis in the last couple of weeks and I'm really loving its power. I mostly rely on posts like these to find new ways to use it.

And your way of sharding definitely gave me more insight into distributing Redis across many nodes.





