The Sky is Not Falling
Amazon’s S3 service was down this morning, from around 4:30 AM PST through 7:17 AM PST.
There are lots of upset people on the developer forum. Services like Twitter, SmugMug and Tumblr that host resources on S3 suffered some pain. ZDNet posted a predictable "sky is falling" piece. AlleyInsider has a more balanced view. There’s more on Technorati if you want it.
Unfortunately, there has been very little official comment from Amazon on the forum so far. That’s a mistake I’m sure they won’t repeat. I sympathize. It’s tough maintaining the presence of mind to post public status reports when everything is going haywire around you. Nevertheless, when you run a service like this, you need to.
Some voices on the forum are saying this proves you need a fallback option for those days (hours) when S3 is down. Unless you’re running a service that’s essential (i.e. literally vital) to your customers, that’s crap.
The whole point of services like S3 is that they are reliable and cost-effective enough for you to build your business on top of them, using small increments of cash (buying 1GB at a time) instead of big increments (buying 1 server at a time) that result in overbuilding.
Will Amazon run its services at 100% uptime? No. But I bet their uptime will prove better than almost any smaller company’s.
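For a rough sense of scale, here is a back-of-the-envelope sketch of what this morning's outage would mean for a year's uptime. It rests on two assumptions that are mine, not Amazon's: that the outage started at exactly 4:30 AM (the real start time is approximate), and that it turns out to be the only outage in a 365-day year. The names in the code are purely illustrative.

```python
def minutes_since_midnight(hour, minute):
    """Convert a clock time to minutes since midnight."""
    return hour * 60 + minute

# Outage window reported above: roughly 4:30 AM to 7:17 AM PST.
outage_minutes = minutes_since_midnight(7, 17) - minutes_since_midnight(4, 30)

# Assumption: a 365-day year with no other downtime at all.
MINUTES_PER_YEAR = 365 * 24 * 60

availability = 1 - outage_minutes / MINUTES_PER_YEAR

print(f"{outage_minutes} minutes down -> {availability:.4%} uptime for the year")
# prints "167 minutes down -> 99.9682% uptime for the year"
```

Under those assumptions, a morning like this one still leaves the year above 99.9% uptime. More outages would obviously pull that number down.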
Should you back up your data and store it somewhere offsite, for disaster recovery purposes, auditing, and so on? Yes.
Should you build another whole operations center — hardware, electricity, bandwidth, security, staff — and leave it idling just in case your main ops goes down? No!
There’s a great analogy here to other utility services such as roads, transit, electricity, water, and medical care. Essential services try to build out a small amount of reserve capacity for dealing with emergencies. Non-essential services cannot afford it, and so when they fail, they fail completely. And this is the way it should be. Can you imagine what all this infrastructure would cost otherwise, and how many resources we would waste in building and maintaining it?
Perhaps there will come a day when we’re simply unable to survive for even a few desperate hours without Twitter serving up avatar images, or BlackBerry zapping us with new email messages. But until then… them’s the breaks.