Where were you when AWS went down?

2 minute read

Published:


It was 8.15 in the morning, and my stack of pancakes had just come off the skillet. I brought them to my little table by my bed and said, “Alexa, I’m eating.” It responded, “okay,” and did nothing. I said it again. “Okay,” Alexa said, and did nothing. My usual morning routine was suddenly broken, my smart lights were not responding to Alexa, and I was frustrated.

It took me two full hours after this experience until I found out what was happening, due to an unrelated incident which allowed me to connect the pieces together. When my class noticed that Canvas was down, that’s when I discovered that it was a MUCH larger issue: an outage of Amazon Web Services (AWS).

As I talked to more people and read up on some conversations on the Internet, I came to realize the scale of this outage and how many different services and groups of people it was affecting. Further reading and exploration on the web showed just how many services, such as Disney+, Netflix, Tinder, Venmo, CashApp, Xfinity, Verizon, and Amazon delivery systems, were affected. This led me to ask: how many users of these services were able to easily trace back the source of their issues to the AWS outage? More broadly, how much do we know about where our data goes and what processes it undergoes before we get results?

To me, this is a design flaw in most of these systems and services. A well-designed system should allow, according to Nielsen’s Heuristics, should allow users to adequately recognize, diagnose and recover from errors, across all types of errors. None of the systems that failed on me yesterday (my smart lights, Canvas and later my Amazon deliveries) informed me a) why the errors were occurring at a high level or b) offered me avenues to explore and diagnose the root causes.

Even if these errors are considered to be once-in-a-lifetime and definitely not part of the daily experience of interacting with these systems, it is worth building in information to address such situations. Not only would it bring less anxiety on a user having tried every possible solution available at their respective disposals, it would also be a more transparent effort in letting users know where their data streams flow and allow for informed consent when they choose to opt-in to such services.