On the morning of February 28, 2017, large portions of the internet lost a critical service they rely on. The US-East-1 region of Amazon’s Simple Storage Service (S3) went down for about five hours. It’s not necessary to know much about US-East-1, other than the fact that it’s heavily used by many sites and apps. Right at the end of February, this was the internet equivalent of a snow day.
Amazon, in addition to offering its huge inventory of consumer products, also offers a massive list of developer services under its Amazon Web Services (AWS) brand. We rely on these services to make our work easier, and our clients rely on them to keep things running smoothly with little overhead and low cost. These types of services are “the cloud” that you’ve often heard of from tech advertising and marketing. This list shows the wide range that AWS covers. S3 is just one such service that we integrate with on our projects.
For a simple explanation of what S3 is, imagine it as a hard drive for your app or website. It’s a storage place for things your app might use, separate from the code running the app. When a user uploads an image via the app, for example, the image would get stored on a server in S3. When the image is needed elsewhere in the app, S3 makes it easy and lightning fast for the app to retrieve the image from the server for use or display.
Amazon handles all the details, so neither clients nor developers have to worry about running out of storage space or bandwidth for the assets they store. Pricing is incredibly affordable. The idea is that although the bill grows as your user base grows and your needs scale, it should always stay small enough that your project makes much more than enough money to pay it.
Some estimates put the number of sites using S3 at just under 150,000. There’s no knowing just how many of those relied on the US-East-1 region, but when it went down, a large number of sites were affected. Other AWS services that depend on S3 had issues too, so even sites not using S3 directly felt the outage. A fair bit of the internet broke, mostly because sites were unable to upload or retrieve assets stored on S3.
Though similar services from competitors like Microsoft and Google exist, there simply isn’t a better solution to the general asset storage problem that S3 solves. Those other services aren’t even close in terms of simplicity and maturity. S3 is not a new tool, and Amazon has carefully refined and improved it over the years into something genuinely irreplaceable. Building this kind of infrastructure into the project itself is often time-consuming and expensive, and likely won’t perform at the level offered by S3. In fact, S3 is so stable that preparing for an outage this rare might not prove financially feasible, but each project and business has different needs.
When affordable, we might recommend that data duplication be used to allow for a backup region (or two), so that if one goes down, another can be used from somewhere else around the world. Slow data transfer speeds that succeed are better than uploads or downloads that completely fail, after all.
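To make the backup-region idea concrete, here is a minimal sketch of the pattern: write each asset to more than one region, and when reading, fall through to the next region if the preferred one is down. Everything in it is a stand-in for illustration. `RegionStore`, `replicate_put`, and `failover_get` are hypothetical names, the stores live in memory rather than in S3, and `us-west-2` is an arbitrary choice of backup region. A real project would swap in calls to its storage client, or rely on the provider's own replication features.

```python
from dataclasses import dataclass, field


class RegionUnavailable(Exception):
    """Raised when a region (or every region) cannot be reached."""


@dataclass
class RegionStore:
    """Hypothetical in-memory stand-in for one region's storage bucket."""
    name: str
    available: bool = True
    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise RegionUnavailable(self.name)
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        if not self.available:
            raise RegionUnavailable(self.name)
        return self.objects[key]


def replicate_put(stores, key, data):
    """Write to every reachable region; return the names that accepted the write."""
    written = []
    for store in stores:
        try:
            store.put(key, data)
            written.append(store.name)
        except RegionUnavailable:
            continue  # skip a region that is down, keep trying the others
    if not written:
        raise RegionUnavailable("all regions")
    return written


def failover_get(stores, key):
    """Try regions in preference order; return (region name, data) from the first that answers."""
    for store in stores:
        try:
            return store.name, store.get(key)
        except (RegionUnavailable, KeyError):
            continue  # region down, or it never received this object
    raise RegionUnavailable("all regions")


# Usage: store an upload in two regions, then simulate the primary going down.
primary = RegionStore("us-east-1")
backup = RegionStore("us-west-2")
replicate_put([primary, backup], "avatar.png", b"image-bytes")

primary.available = False
print(failover_get([primary, backup], "avatar.png"))
# prints ('us-west-2', b'image-bytes')
```

The read from the backup may be slower for users far from that region, but it succeeds. The cost is that every asset is stored, and paid for, more than once.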
S3 is a fantastic tool that we use to build amazing products for our clients. It will continue to be a fantastic tool, and Amazon has learned a lot about how to keep this from happening again. You can read their postmortem write-up here.
We will continue to recommend S3 to our clients, including those who may not have the budgets to build redundancy solutions in the event of another internet snow day. Despite the rare hiccup, AWS is the gold standard when it comes to web development.