Caching the uncacheable: CloudFlare's Railgun

mmaunder · on July 2, 2012

Exec summary: This is not caching. It is reduced latency for your visitors by removing connection setup over a large geographical distance. It also reduces the latency your web server has to deal with to effectively zero, but you can do that with Nginx. It also will reduce your bandwidth consumption from your providers perspective to a fraction of what it was.

I think describing this as caching hides the real benefits. My startup pushes 150 Mbps on average so I care about this stuff.

Rather than being caching, this introduces a proxy server that is geographically close to your visitors that then communicates with your server using an efficient compression protocol. So, much of the network path that would transfer a full payload is replaced by a compressed connection that does not have to build up and tear down a TCP connection with a three way handshake. The most important benefit I see here for site visitors is reducing latency by removing the connection setup. They will spend a few ms setting up the connection with a server a few hundred miles from them instead of on the other side of the planet. That server then serves a cached page or uses an established connection to only get the changes, which could mean as little as a single send and single receive packet.

Another benefit of this is that your local web server will be talking to a local cloudflare client which means there is practically zero latency from your perspective for each request. This means that each of your app server instances spends less time waiting for it's client to send or receive data and more time serving app requests. It's why people put Nginx in front of Apache.

I think the most important cost benefit here is reducing your bandwidth consumption. We're constantly negotiating our colo deal based on 95th percentile and getting your throughput from 1 Gbps down to 50 Mbps which I think this may do will drastically reduce your real hosting costs. Of course Cloudflare need to maintain their servers and will be serving 1Gbps to your customers but those cloudflare servers will be geographically closer to your customers. However because data centers bill based on your throughput at the switch and not how far your customers are away from you, I don't see that there are any cost savings they (cloudflare) can pass on to you. They're going to be billed what you were being billed for bandwidth, but they'll mark it up. I suppose you could argue there are economies of scale they benefit from, but that doesn't seem like a compelling argument for reduced costs.

jgrahamc · on July 2, 2012

CloudFlare does not bill customers for bandwidth consumed. It's a flat fee: https://www.cloudflare.com/plans

mmaunder · on July 2, 2012

Holy cupcakes. I might use this. So I can use your free package which includes the CDN to serve 100 Mbps of static JS and images?

jgrahamc · on July 2, 2012

Yes. And we'll also send you some cupcakes.

mmaunder · on July 2, 2012

I'm curious if I'm going to run into a catch. It seems unsustainable. Just to be clear, that's 30.8 Terrabytes of data I'll be transferring per month on your network for $0. Can someone verify this?

jgrahamc · on July 2, 2012

I'm not on the business side, but looking at our global traffic 30TB per month is a tiny percentage of what we are doing in terms of traffic.

I doubt it would go unnoticed, though, so I expect someone will be interested in persuading you to get a Business account with us which is $200/month because at that point we give you an SLA.

rdl · on July 2, 2012

If they go down, you go down. But in my experience they have been quite reliable; if you aren't already in 3 sites on independent networks for HA, they are better.

I'd spring for the $200/mo service if you are pushing real traffic just to get to try the railgun. $200 is less than epsilon for a large site.

Also, the cupcakes are probably a lie, or at least are vegan.

Ralith · on July 2, 2012

What do you have to say to http://news.ycombinator.com/item?id=4188882 ?

jgrahamc · on July 2, 2012

I don't know who that is, or what site they had on CloudFlare and so I can't comment on it.

druiid · on July 2, 2012

A client I built the infrastructure for pushes about 50Mbps. Not up to your level, but I can at least give some suggestions and input for Cloudflare. We basically switched all of our sites to use Cloudflare after running a few of the largest through them for a year plus.

During that time there was a total of one Cloudflare related outage and that was resolved within about 15 minutes by their changing the data-center the site(s) were routed through. I can tell you that one of the greatest benefits you will see with Cloudflare as it stands currently is that your bandwidth utilization is going to go down substantially. Before switching to Cloudflare we were pushing a good deal more than 50Mbps. Essentially if you were to switch I have to imagine your side of the bandwidth utilization is going to drop to somewhere around 75-90Mbps if not more.

That said, understand what you're getting into. This is a 'cloud' service and they require you to switch your DNS records to their service. All things considered, running a multi-million dollar business through them has been much smoother than anticipated... this new feature we will be looking at very carefully as well because about half or more of our content cannot be cached.

0xbadcafebee · on July 2, 2012

It's still caching because your servers are only serving a small portion of the data that's changed between requests. But it's very nice that they're now taking on all of the customer requests, reducing your exposure to DDoS's dramatically.

The whole 'freeing you up to serve more requests' thing is not accurate: your app servers run as fast as they can and your frontend proxies deal with handing the data to the client, so your app servers are (or should be) always doing as many requests as they can. If anything, the reduced latency and caching will allow more connections than usual to come in, putting more potential load on your app servers. Catch-22 =)

"This means that each of your app server instances spends less time waiting for it's client to send or receive data and more time serving app requests. It's why people put Nginx in front of Apache."

Sounds silly to me. Putting a proxy in front of a proxy doesn't change the tcp/ip stack. If you tuned your network stack and Apache properly it should be able to handle anything you throw at it. I don't remember what the setting was, but modern versions of Apache should be able to only send a request to the app server once the client has finished its request to the frontend.

pbreit · on July 2, 2012

I'm not sure why you don't think it's caching. They are doing more than just sending the whole page over a preferenced connection.

Igal_Zeifman · on July 4, 2012

Nothing new, this is all just repackaging. This is all just basic CDN proxy tech all providers (including CF) had for years. Giving it a cool name, dose not make it new or exiting.

Here are some helpful links - CF customer testimonies:

http://x-pose.org/2012/02/speed-up-your-site-disable-cloudfl...

http://www.husdal.com/2011/07/01/incapsula-versus-cloudflare...

alt_ · on July 2, 2012

Delta encoding for HTTP has been proposed since 2002[0], but seems to have been lukewarmly received.

A fully server-side solution will probably bypass most problem cases, but might solve some, so I hope CloudFlare looks into contributing to a distributed standard solution.

[0] http://tools.ietf.org/html/rfc3229

obtu · on July 2, 2012

This deals with the protocol and leaves the diff tool pluggable (RFC 3284's vcdiff is suggested), so it looks like something railgun could be using internally.

Google's SDCH (mentioned in CloudFlare's post) is an alternative that isn't tied to a single URL, but involves prefetching some data that may or may not be needed, so it's a bit hacky.

An interesting approach would be to hook into a template engine to generate a page skeleton with dictionary references. That would allow reordering the small, newly generated varying snippets before the large common ones; both could be preemptively pushed using SPDY to avoid round-trips.

jgrahamc · on July 2, 2012

The really big problem with SDCH is that is places a burden on the web site owner to generate dictionaries and to know when to generate them and how. The protocol is one thing, the reality is another and that's why it has not been widely adopted.

justincormack · on July 2, 2012

Which more or less describes ESI though, a page template with bits to fill in...

jgrahamc · on July 2, 2012

And for those that care about this sort of thing: Railgun is about 4,000 lines of Go.

wickedchicken · on July 2, 2012

> Railgun is about 4,000 lines of Go.

Probably going to get downvoted, but http://www.reactiongifs.com/wp-content/gallery/yes/shaq_YES....

reitzensteinm · on July 2, 2012

I'm having a hard time wrapping my head around the benefit of this - maybe John can elaborate some more.

Taking the CNN front page for example. If you set a TTL of 60 seconds, and you have 14 edge locations (taken from the CloudFlare about page), you've got to satisfy 1440 * 14 = 20160 requests a day. The CNN page is currently 97527 bytes, which gzips down to 20,346 bytes. That's 391 megabytes per day. Serving the edge locations even with this relatively short TTL is trivial.

Now, the TCP connection means the content can be pushed and not pulled, so latency is better. It also means that caching a lot of pages will become cheaper (though still expensive).

But it doesn't seem like a lot of benefit for essentially replacing HTTP (between the content servers and the edge nodes), for all the proprietary software and vendor lockin that entails. For each byte you're sending to edge nodes, it's going to be served up orders of magnitude more from the edge nodes to end users, which seems like where almost all the cost would lay.

I'm sure they know what they're doing, and that their customers have asked for this, but think some real world case studies would help make it click. That the blog post is going on about 'caching the uncacheable' really doesn't help.

jgrahamc · on July 2, 2012

Firstly, we're not 'replacing HTTP'. HTTP runs over Railgun in the same way that it does over SPDY and the content servers are still running HTTP.

The big benefit is not in terms of bandwidth saved (for us) it's in terms of total time to get the page. That's partly driven by latency and partly by bandwidth. Because we have worldwide data centers we can see high latency from say the data center in Miami and a web server located in Sydney. Railgun helps with that problem.

Also, CNN has a TTL of 60s but many, many web sites have a TTL of 0 because they want no caching at all (see New York Times web site) or because the page is totally personalized.

marcusf · on July 2, 2012

I'm still not sure how this solves the personalization use case. Usually (I think) the main problem is rendering the page, not pushing it down the wire. So if you still need to render for each request for a personalized page to make sure it's not been updated, where's the gain?

I think I might be slow, but I'm not sure where the wins are here and how you overcome the personalized rendering problem? I'd love for you to talk a bit about it.

stavros · on July 2, 2012

It seems that all the gains are at the point between the upstream server and the CDN node. It has to send fewer packets for the entire page. However, I'm not sure this isn't pretty much as efficient as a large TCP window. Of course, I'm not the one whose job it is to worry about these things, so take this with a grain of salt.

kokey · on July 2, 2012

I've noticed that using bzip, cranked up to it's max level, is really effective for compressing dynamic HTML. I've managed to achieve what I suspect is comparable to Railgun by simply having this compression applied on a multiplexed TCP tunnel in rather large blocks, so that packets from multiple sessions are compressed together. It worked really well, it enabled me to offer reasonable fast content delivery to clients with servers on small leased lines in Africa. I found it a hard service to sell, it takes a lot of explaining ;-)

jacques_chester · on July 2, 2012

If understand this correctly:

1. You install "Railgun" on your publishing web server

2. It pushes deltas of your webpage to CloudFlare

3. Who then update their cache of your webpage across their CDN.

It seems as though folk are starting to see that in most environments caching ought to be driven by POSTs, not GETs + timeouts.

nkurz · on July 2, 2012

I don't think there's any push involved:

Each end of the Railgun link keeps track of the last version of a web page that's been requested. When a new request comes in for a page that Railgun has already seen, only the changes are sent across the link. The listener component make an HTTP request to the real, origin web server for the uncacheable page, makes a comparison with the stored version and sends across the differences. The sender then reconstructs the page from its cache and the difference sent by the other side.

I take this to mean that there are two chained proxies, each proxying pages on a per-user basis. Since the upstream server-side proxy knows what the downstream client-side proxy has cached, it can send a very efficient shorthand describing how the page has changed without having to resend the information that's already been sent.

I think it's a smart good approach. So long as the origin-proxy is inside the datacenter, it clearly would save a lot on data charges. But I'm surprised the speedup is as much as JGC reports since you still have to pass the full page over the last-mile to user. I would have thought that was the slowest link. Is the core internet so congested that this is not the case? I'm presuming the data center and the origin server have very good throughput, and that even a very short message would have the same latency.

jgrahamc · on July 2, 2012

The numbers I'm reporting here are for the speed of download across Railgun and across HTTP between the origin server and a CloudFlare data center. Not the final download time to the end user.

The issue with the end user number is deciding on what to report. We are currently rolling out a very large system for monitoring timing throughout our network and will be able (later this year) to report on actual end user timings to see how much Railgun makes a difference. Our goal (as usual) is to improve the end user experience because that makes our customers (publishers) happy. Railgun is one small part of improving overall web performance.

nkurz · on July 2, 2012

Not the final download time to the end user.

OK. Still a useful measure, but a less dramatic one.

The issue with the end user number is deciding on what to report.

I would think that time to show a slightly changed page after a "refresh" or "reload" would be appropriate. What are the other choices?

For pages that are frequently accessed the deltas are often so small that they fit inside a single TCP packet, and because the connection between the two parts of Railgun is kept active problems with TCP connection time and slow start are eliminated.

Just reread this part, and not sure I understand it. Yes, there are no extra packets between the proxies, but in the base case there is only a single proxy and hence no extra connection time to consider. I'd think even a very fast proxy would introduce more latency than a hop on a backbone router. Or are you indeed pushing the per-user delta to the data center in anticipation of the request?

jgrahamc · on July 2, 2012

"I would think that time to show a slightly changed page after a "refresh" or "reload" would be appropriate. What are the other choices?"

Well, what we really want to measure is the overall effect so that we can see how Railgun improves things in general. The 'refresh' time is interesting, but we're also interested in the network scale (how does user X downloading Y improve the speed for user Z downloading the same (but slightly different) page Y).

"but in the base case there is only a single proxy and hence no extra connection time to consider"

That isn't really the base case. The base case is that we need to go get the resource with a normal HTTP connection direct to the server.

judofyr · on July 2, 2012

> It seems as though folk are starting to see that in most environments caching ought to be driven by POSTs, not GETs + timeouts.

HTTP supports that too (through ETag and 204 Not Modified).

jacques_chester · on July 2, 2012

I meant in terms of server-side caching.

jddil · on July 3, 2012

204 is no content. Think you meant 304.

robryan · on July 2, 2012

Would love to see some real world figures on the difference in page loads. Intuitively it feels like a lot of overhead outside of the packets being saved between the host server and cloud flare.

jgrahamc · on July 2, 2012

I do have some numbers on load times based on testing we did with a hosting partner where we took a small number of their sites and tested using Railgun. We saw a speedup of between 2.77x and 4.78x on the page download time.

With a different hosting partner (and a different set of sites) we saw a page download time speedup of between 2.94x and 8.12x.

beaker52 · on July 2, 2012

Pushing 0.5kb opposed to 93kb to a downstream caching proxy isn't going to set the world on fire if you consider the data transfer speeds we typically achieve these days, especially if you also consider the processing overhead on both your network and the cache provider.

They're not really caching the uncacheable, either.

rdl · on July 2, 2012

If you're on the wrong side of a really long high latency link, it should make a big difference. Slow start, etc.

Leaving a persistent connection between the cache and the server obviously does most of it (especially with latency mitigation tricks/tcp acceleration). It would be interesting to calculate the benefits of nothing vs. a cache with an accelerated persistent tcp connection vs. deltas. I suspect it's something like 500 vs. 50 vs. 45, but every bit helps.

onetwothreefour · on July 3, 2012

Slow start is irrelevant in the context of a CDN.

Cloudflare is like a bunch of people suddenly realized what every CDN in the world with a dynamic acceleration product does, but then they blog about as if it's all magic and unicorns.

Every DSA provider maintains persistent connections to origin nodes. Every DSA provider runs a custom multiplexing protocol between the first and last mile POPs on their network. Nothing here is new.

The only moderately interesting thing about this is that they're sending X bytes instead of Y bytes once every 60 seconds. Meh.

beaker52 · on July 2, 2012

The time to first byte latency due to long distance is going to be the same whether the payload is 2kb or 100kb.

If you have servers with a very low throughput outbound connection to the cache, then reducing the data transfered in this manner could be worth it. But 100kb over the wire at todays throughput is not going to add all that much latency to the whole transaction. As you suggested, some figures would be nice though.

rdl · on July 2, 2012

First byte, yes, but won't you get the first 100 KB a lot faster due to being effectively 20ms away from it, vs. 320ms, thanks to slow start? Maybe little impact at 1KB, but for a 4MB abortion like some of the web pages I've seen, ...

(This all from TCP acceleration, not deltas, though. Deltas might give you 1ms on a 1G link if it saves you sending 100KB. BFD. Deltas to the edge, where you might be constrained by bandwidth on a mediocre 3G connection, is where deltas would rock - coupled with SPDY and tcp acceleration and caching and we'd be living in 2015.)

beaker52 · on July 2, 2012

Remember that we're talking about content provider to content cache here. Server infrastructure to server infrastructure, usually connected by huge pipes.

Passing deltas to a client over a mobile data network would be an awesome development, I agree. With some additional specification to HTTP and vendor implementation on clients, it'd definitely be possible.

rdl · on July 2, 2012

For some values of "huge". I was using 1Gbps as the link size, since that's almost certainly the uplink on the server, and thus an upper bound on the smallest link.

It would possibly be fair to use something closer to 155Mbps in a lot of places (the most constrained part of the link; we're not even talking about congestion/packet loss, which would exponentially favor this technique, and which does happen on SP transit links sometimes). At that point, 4MB could actually matter:

155Mbps = 20 MB/sec. 4MB takes 200ms to transfer; 100KB takes 5ms. If you assume a single packet for the delta instead, I'd be happy to save 5-200ms.

rdl · on July 2, 2012

Assume for the sake of argument that processing is free and instantaneous, and that the cache is in-line with the normal network path (and much nearer to the end user than the content server). This is arbitrarily close to true as Cloudflare expands.

So you're actually saving these 5-200ms vs. non-cloudflare delivery. It might still be 305-505ms total page load time, or easily greater (I used GEO sat for a while, feeding cell; it was hell), but if it were 305 instead of 505 I'd be quite happy actually.

(I'm assuming a 4MB page with 100KB which actually changes between loads. A 4MB all dynamic page, frequently reloaded, which can't be cached but where a tiny delta is possible would be pretty pathological.)

With CDNs you can also get benefits from prefetch/prefill, either opportunistically (due to multiple users hitting the same thing, or through actual scheduled fill). Works for many media assets but not for dynamically generated pages, which is what Railgun is supposed to address.

beaker52 · on July 2, 2012

That's 5-200ms from the upstream server to the downstream cache, in the event the content needs up to be fetched. This would only be affect requests for new content.

If it'd be from upstream server (or cache) to end clients, that'd be an awesome saving.

omh · on July 2, 2012

As a small company, using CloudFlare as a cloud-based CDN/firewall/DDoS protection looks very attractive.

The thing that has stopped me from trying it is that you have to move your whole DNS to them though, and they have had downtime for DNS as well as their hosting. I'm comfortable with serving www.example.com through them and dealing with the occasional downtime. But I'm not so comfortable with downtime for MX records - I really want mail@example.com to be super reliable.

luigi · on July 2, 2012

I launched a startup in late March, using Cloudflare, and had many similar thoughts. But it was an awful experience.

Because Cloudflare acts as a proxy, it gets in your way in subtle but devastating ways. First, the SSL support wasn't stable and I turned it off a few days into launch. That probably killed some traffic.

Then, I realized that caching was an all-or-nothing proposition. You can have Cloudflare cache your assets, but it doesn't seem to respect expires headers. You can set a cache time in the web interface, but the minimum is 2 hours! So you end up creating a rule to have it cache no assets, at which point you might as well use AWS Cloudfront for a CDN.

I advise you not to use Cloudflare. It's a great idea, but the execution just isn't there.

luigi · on July 2, 2012

Here's a thread with the bad experiences other HNers have had:

http://news.ycombinator.com/item?id=3856004

atombender · on July 2, 2012

> You can have Cloudflare cache your assets, but it doesn't seem to respect expires headers.

Why do you expire assets at all? If it's because you are deploying new versions of the app, would it not be better to solve this by using revision-stamped URLs? When you deploy a new version of your app, just change the stamp and let the old assets expire.

This has the secondary benefit of ensuring that all application-served pages use the assets the page was designed for, ie. avoiding the race condition when a user gets served a page from app version 1, but gets the asset for app version 2. (If app version 2 has changed an internal API used by the JavaScript, for example, this will result in actual app errors, not just a page that looks a bit weird due to mismatched CSS rules.)

Invalidation is a challenge with any kind of caching system. Invalidating passively by changing the cache key means that the cache has to keep more old cruft around until it expires naturally, but it's easier to implement and works around the race condition problem.

rdl · on July 2, 2012

I've had pretty good luck with them, although mainly for sites which were getting DoSed otherwise.

I actually haven't experimented with the SSL support yet. (we've been working on some better ways to handle SSL ourselves, which might be interesting for something like Cloudflare). I probably would pick SSL over Cloudflare if forced to decide between the two, but the $200+/mo plans support real SSL certs you provide, including EV, so I'd look at that.

I generally go with longer expires headers, and have dev vs. production sites. I don't use cloudflare for any dev sites. I believe you can manually expire everything if some catastrophe happens; you could also do that as part of the update process.

ksec · on July 2, 2012

I have similar concern as well. Is there any way to use Cloudflare with other DNS like DNS Made Easy?

sirwitti · on July 2, 2012

This looks like an interesting option for static content. But what still is not cachable of course is every request that has to hit a database (or at least clear parts of a cache).

In my experience (e.g. a site with 50.000-100.000 pageviews/day) it's mostly the requests hitting the db and not server throughput that are the bottleneck.

So for some people this could solve the wrong problem...

davedx · on July 2, 2012

Pretty cool tech. I guess this can work in parallel with SPDY too. The Internet is evolving a lot lately.

kierank · on July 2, 2012

So basically this is Edge Side Includes reinvented?

eastdakota · on July 2, 2012

Kinda, but without the pain of having to rewrite your pages, and with a much higher efficiency. Actually have a blog post about this queued up to go out tomorrow.

piotrSikora · on July 2, 2012

No, because it's transparent for the content-provider and it doesn't require any changes to the website and/or it's code.

mycodebreaks · on July 2, 2012

Do any other CDN providers have similar technology as Railgun?

cbsmith · on July 2, 2012

Good question. The answer should be obvious, as CloudFlare presents this for the innovative and ground breaking technology it is...

[end sarcasm]

So yeah, basically almost all of them have it. Most have had it for years. CDN and various other business models (POP's, etc.) have been converging for a while now, so this has become pretty standard for CDN's to provide.

ksec · on July 3, 2012

Really? Well may be they are for the big boys, none of the commonly used CDN advertise it, All Hosting providers reselling Level 3, Akamai, EdgeCast, NetDNA or even Amazon Cloudfront dont show it either.

I wonder why these aren't available to normal users.

cbsmith · on July 4, 2012

Not sure about Amazon and NetDNA, but the rest certainly offer it.

Igal_Zeifman · on July 4, 2012

Exactly right. This all just marketing spoof a way to justify recent re-distribution of paid plans. For those who consider CF, know that they are a good company, but they should focus of product quality and not just of Marketing and PR. For a better alternative I advise to check Incapsula. Here is a head-2-head user review of both platforms. http://www.husdal.com/2011/07/01/incapsula-versus-cloudflare...

bigiain · on July 2, 2012

I wonder if there's a way to replicate/simulate this with a longer ttl on the pages served out of your cdn, and a bit of javascript that pings a lighter weight server only asking for deltas, and updates the page in place (or reloads the entire page from the cdn when the deltas get too big).

bluelu · on July 2, 2012

So how does this work if you want to count the number of visitors, delivering customer specific versions of the site etc...? Does the bbc example even work because I can imagine that there multiple versions of the site depending on which IP is accessing BBC

stavros · on July 2, 2012

I'm guessing the visitor IP still gets passed. This looks to be entirely synchronous, so everything would still work. The CDN just only has to fetch a delta of the page rather than the whole.

TazeTSchnitzel · on July 2, 2012

Well, one obvious thing with BBC News is that ads aren't shown for UK visitors but are for international visitors.

zobzu · on July 2, 2012

so railgun is internally sending diffs. I would have thought caching companies were doing that already, but it seems I'm wrong.

That being said, I'd have used another name ;à

hucker · on July 2, 2012

This looks rather interesting... Does any fully server-side solution such as varnish have any similar functionality? Seems like a no-brainer.

gizzlon · on July 2, 2012

Well, since those caches run on the same LAN as the webserver, the cost would probably outweigh the benefits.

You can still break up your page into pieces and cache all but the once that change. Believe this is one of the reasons you would use ESI: http://en.wikipedia.org/wiki/Edge_Side_Includes

_vafj · on July 2, 2012

hope they make it available for CloudFlare Pro accounts too.. wait to go cloudflare ..

ta12121 · on July 2, 2012

Differential compression? Isn't this what Opera Turbo and other web accelerators do?

ldawoodjee · on July 2, 2012