S3 and High Availability

curun1r · on Aug 12, 2015

This post seems to perpetuate a common misconception among AWS users about regions: namely, that the purpose of regions is fault tolerance.

But the guidance we've been given by Amazon is that this is the purpose of availability zones, not regions. Regions are more appropriate for fighting the speed of light (i.e. locating your site closer to your users). As an illustration of this, Amazon told us that the US version of amazon.com runs in a single region.

Incidentally, the other interesting takeaway from that meeting was to avoid using autoscaling to respond to failures. Provisioning instances can fail when there's heavy demand, and that's frequently the case when Amazon is experiencing outages in other regions and availability zones. Instead, we've been urged to provision 150% of what we need across three AZs (50% in each) so that if any one AZ goes down, we can still handle all our traffic. Where autoscaling works well is in responding to spikes in our own need, rather than situations where many Amazon customers need capacity at the same time.
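To make the arithmetic behind the 150% figure concrete, here is a minimal sketch. It assumes traffic is spread evenly across AZs and that you want to survive the loss of exactly one AZ; the function names are made up for illustration and are not part of any AWS tooling.

```python
from fractions import Fraction


def per_az_share(num_azs: int) -> Fraction:
    """Fraction of peak capacity to run in each AZ so that the
    remaining AZs can carry 100% of traffic after losing one."""
    if num_azs < 2:
        raise ValueError("need at least two AZs to survive losing one")
    return Fraction(1, num_azs - 1)


def total_provisioned(num_azs: int) -> Fraction:
    """Total capacity provisioned, as a fraction of peak need."""
    return num_azs * per_az_share(num_azs)


# Three AZs: 1/2 of peak in each AZ, 3/2 of peak overall.
print(per_az_share(3), total_provisioned(3))
```

With only two AZs the same rule means running 100% in each (200% total), which is why spreading over more AZs lowers the overhead.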

Sorry for the digression, but I found that consultation interesting and it's clear that others have the same misconceptions that I had before learning the thought process behind the building blocks that AWS gives us.

mnutt · on Aug 12, 2015

If you're worried about a datacenter losing power, a multi-AZ strategy is fine. Hardware issues happen so often and with so little impact that we rarely hear about them. The issues that I'm primarily concerned about at this point are in software.

The S3 issue affected the entire region. A month back a BGP issue affected every one of our AZs in us-east-1. It wasn't even Amazon's fault, but a multi-AZ strategy would have done nothing for us.

It's kind of like Maslow's hierarchy of availability needs: at the bottom you deal with hardware failures, then networking failures, then regional configuration failures, and finally homogeneity issues where all of your machines fail at once because of a bug in hardware RAID controllers or a leap second smear issue. It all depends on how much resilience you need and are willing to pay for.

pyre · on Aug 12, 2015

But Amazon has had regional failure before. How do availability zones within a region help, if Amazon has a catastrophic failure that is local to a region?

namecast · on Aug 12, 2015

At that point, you'll need to have multiple regions in play, and some sort of mechanism to direct traffic to regions when one becomes unreachable (for most people on AWS, this will be Route53).

Devil's advocate, though: if you're concerned about what to do in the event of an AWS regional failure, given how rare an event that is, then you've probably outgrown AWS.

(For most small-to-medium sized startups, I'd advocate setting up statuspage.io and keeping your users informed if you're single-homed to an AWS region and that region experiences a catastrophic failure. The math on "how much money you'll lose from the outage" vs. "how much you need to spend implementing proper DR, better than what AWS has in place to keep a region up and running" isn't even close, assuming, say, one 8-hour regional outage every ~2 years.)
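A rough sketch of that comparison, using the assumption above of one 8-hour regional outage every ~2 years. The hourly loss and annual DR cost are hypothetical inputs you would have to estimate for your own business, and the model ignores second-order effects such as customer churn.

```python
def expected_annual_outage_hours(outage_hours: float = 8.0,
                                 years_between_outages: float = 2.0) -> float:
    """Average hours of regional outage per year."""
    return outage_hours / years_between_outages


def dr_pays_off(hourly_loss: float, annual_dr_cost: float,
                outage_hours: float = 8.0,
                years_between_outages: float = 2.0) -> bool:
    """True if the expected yearly outage loss exceeds the yearly DR spend."""
    expected_loss = hourly_loss * expected_annual_outage_hours(
        outage_hours, years_between_outages)
    return expected_loss > annual_dr_cost


# Hypothetical numbers: losing $1,000/hour against $50,000/year of DR work.
print(expected_annual_outage_hours())   # 4.0 hours per year
print(dr_pays_off(1_000, 50_000))       # False
```

The answer flips once the per-hour loss is large enough, for example when an outage drives customers away permanently.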

jewel · on Aug 12, 2015

A B2B startup of any size will have a very hard time if those 8 hours are during business hours. Depending on the nature of the service, of course.

Where I work now (a small web-based software-as-a-service company) an 8-hour outage would be catastrophic, and could easily kill the business. The switching cost for our niche is small, so one bad day could cost us 20% of our clients. We're not running with a 20% profit yet, so at best it'd mean an immediate layoff or an across-the-board temporary pay cut.

Luckily, because we're small we can run on a single LAMP server. We're working on making it so that we can migrate that to any region in EC2 with a single command, as well as making sure we can switch to a different dedicated hosting provider with minimal time.

mnutt · on Aug 12, 2015

By the way, turning on S3 bucket versioning is safe in that your objects will get served exactly the same over HTTP. However, with many of the AWS SDKs you will start receiving an S3::VersionedObject rather than an S3::Object. From there you can get the S3::Object but the VersionedObject does not have all of the object's properties.

helper · on Aug 12, 2015

I really hope we get a postmortem for the outages on Monday. S3 has historically been one of the most reliable AWS offerings so it will be interesting to hear what happened.

badmadrad · on Aug 12, 2015

Me too. We have noticed a general increase in error rates to S3 over the last month. I wouldn't be surprised if they were battling some ongoing issue that reached a tipping point.

ak217 · on Aug 12, 2015

It seems more likely that this fell out of the recently announced updates that offer read-after-write consistency in US Standard.

anh79 · on Aug 12, 2015

I'm thinking of putting S3 behind a Cloudflare setup and using Cloudflare's "Always Online" feature.

Sounds good? (Well, as long as Cloudflare doesn't have any SSL issues :D)

gphil · on Aug 12, 2015

I think this would be a pretty good approach, especially when combined with the author's strategy.

Gigablah · on Aug 12, 2015

I imagine you'd have to implement some sort of cache priming as well?

jedberg · on Aug 12, 2015

FYI you can target the datacenter you want for S3's "standard" region and force it to always use Virginia by targeting s3-external-1.amazonaws.com
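For illustration, a small sketch of building a path-style object URL against that endpoint instead of the global one. The bucket and key names are made up, and the helper is hypothetical; it only formats a URL and does not sign or send any request.

```python
from urllib.parse import quote

GLOBAL_ENDPOINT = "s3.amazonaws.com"               # may resolve to either coast
VIRGINIA_ENDPOINT = "s3-external-1.amazonaws.com"  # pinned to Virginia


def object_url(bucket: str, key: str, endpoint: str = VIRGINIA_ENDPOINT) -> str:
    """Path-style URL for an object, percent-encoding the key but keeping '/'."""
    return f"https://{endpoint}/{bucket}/{quote(key, safe='/')}"


print(object_url("example-bucket", "reports/2015 summary.txt"))
```

Most SDKs accept a custom endpoint setting, so the same hostname can usually be passed there rather than building URLs by hand.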

mmaedler · on Aug 12, 2015

As a bloody AWS newbie, I wanted to clarify one thing: you're talking about a source S3 bucket and a destination bucket. So if the source bucket fails, do you also do your writes against the destination bucket, and do they get replicated once the former source bucket comes back online (two-way sync)? Or am I mistaken here? Thanks for the clarification!

mnutt · on Aug 12, 2015

It actually looks like it may be possible to cross-sync back and forth between two buckets, but I haven't tried it. In our case we're ok with going into read-only mode for a bit.
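A minimal sketch of what "read-only mode" could look like in application code, assuming one-way replication from a source bucket to a destination bucket. The class and the callables passed into it are hypothetical stand-ins for real S3 calls, not our actual implementation.

```python
from typing import Callable


class ReadOnlyError(RuntimeError):
    """Raised when a write is attempted while the source bucket is down."""


class FailoverStore:
    def __init__(self,
                 read_source: Callable[[str], bytes],
                 read_replica: Callable[[str], bytes],
                 write_source: Callable[[str, bytes], None]) -> None:
        self._read_source = read_source
        self._read_replica = read_replica
        self._write_source = write_source
        self.read_only = False  # in this sketch, an operator resets this by hand

    def get(self, key: str) -> bytes:
        try:
            return self._read_source(key)
        except Exception:  # deliberately broad for the sketch
            self.read_only = True
            return self._read_replica(key)

    def put(self, key: str, data: bytes) -> None:
        # Refuse writes rather than write to the replica, since nothing
        # would sync them back to the source.
        if self.read_only:
            raise ReadOnlyError("source bucket unavailable; writes disabled")
        self._write_source(key, data)
```

Refusing writes avoids the two-way sync question entirely, at the cost of degraded functionality while the source region is down.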

lexicalscope · on Aug 12, 2015

It's amusing that they relied on a write followed by an immediate read to see whether the updates were immediately consistent. S3 is only eventually consistent even if you're using just one region (with the exception of certain utilities like the Import/Export tool), unless I'm missing something?

revertts · on Aug 12, 2015

There's read after write consistency for new objects, eventual consistency for overwriting existing objects.

Historically this applied to all regions except US Standard, but now that supports it too if you go through the VA endpoint instead of the global endpoint.

lexicalscope · on Aug 12, 2015

That makes sense - in this article it looked like they were using a modification to test the consistency, which should always be eventually consistent. Maybe I'm misunderstanding, though? Regardless, interesting.

mnutt · on Aug 12, 2015

Yeah, I was operating under the assumption that it was eventually consistent and just found it curious that it converged way faster than I expected. (until I read the explanation)

lexicalscope · on Aug 13, 2015

That makes sense - thanks for clarifying.

ilkkao · on Aug 12, 2015

Is there a rough estimate of how much more (%) you need to pay if you duplicate all data but almost never access the copy?

mnutt · on Aug 12, 2015

It depends on how much your data gets accessed. Your data storage costs (~$0.03/GB) double, but your transfer out costs (~$0.05-$0.09/GB) stay the same. The replication cost is often pretty negligible compared to regular traffic.
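As a back-of-the-envelope sketch using the approximate rates above: duplicating the data adds one extra copy's worth of storage cost, while transfer out is unchanged. The default rates are the rough figures quoted here, storage is assumed to be billed per GB per month, and replication request and inter-region transfer charges are ignored as negligible, so treat the result as an estimate only.

```python
def added_cost_percent(stored_gb: float, transfer_out_gb: float,
                       storage_rate: float = 0.03,
                       transfer_rate: float = 0.09) -> float:
    """Percentage increase in the monthly bill from keeping a second copy."""
    base = stored_gb * storage_rate + transfer_out_gb * transfer_rate
    if base <= 0:
        raise ValueError("no baseline cost to compare against")
    extra = stored_gb * storage_rate  # cost of the duplicate copy
    return 100.0 * extra / base


# 1 TB stored and 1 TB served per month: roughly a 25% increase.
print(round(added_cost_percent(1000, 1000), 1))
```

At the extreme where nothing is ever served, the bill simply doubles; the more traffic you serve relative to what you store, the smaller the percentage gets.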

aftbit · on Aug 13, 2015

Wait, doesn't putting this all behind CloudFront make that a SPOF for your system?