An outage occurred in AWS in 2011, which affected many services but not Netflix.

Sukhad Anand
2 min read · Dec 13, 2021


Let’s see how they did it.

Photo by Thibault Penin on Unsplash

1. Netflix used a stateless service architecture, which means that any server can serve any request. Even if one of the nodes failed, a new node could easily be spun up to serve the requests.

2. Instead of depending on a single zone and storing data only there, they kept multiple copies of the data in different zones. If one zone failed, the same data could be read from another zone.

3. Netflix uses graceful degradation, which is based on three principles: 1) Fail fast: aggressive timeouts, so that dying systems are caught early. 2) Feature fallbacks: if a feature fails, its fallback is used (a fallback is hard-coded for every error scenario). 3) Feature removal: if a feature is slow and non-critical, it is removed from the page.

4. Netflix uses “n+1” redundancy, which means they run more nodes than are required to serve the traffic. This helps them keep serving requests during peak hours too.

5. To fully embrace the cloud, they rearchitected their system to use the new technologies. They made heavy use of S3 as their data source. AWS S3 is resilient to zone failures and is highly reliable.

6. Even after all this, they still faced some issues, such as having to move traffic to other zones manually and make sure that it was evenly distributed across those zones.

7. Netflix did not stop there. To make their system more resilient, they created a service called “Chaos Monkey”, which generates failures by killing other services. They then monitor that the killed services recover automatically, without manual intervention.

8. They automated load distribution in the case of a zone failure to prevent manual intervention.
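
To make points 2 and 3 more concrete, here is a minimal Python sketch of reading from another zone when one fails, plus the three graceful-degradation rules (fail fast, fall back, drop the feature). It is only an illustration, not Netflix's actual code: the zone names, the in-memory `REPLICAS` store, the feature names and the 0.2 second timeout are all assumptions made up for this example.

```python
import concurrent.futures
import time

# Hypothetical replicas of the same data in three zones (assumption for the demo).
# None simulates a zone that is currently down.
REPLICAS = {
    "zone-a": None,
    "zone-b": {"user-1": ["Movie X", "Movie Y"]},
    "zone-c": {"user-1": ["Movie X", "Movie Y"]},
}


def read_from_zone(zone, key):
    store = REPLICAS[zone]
    if store is None:
        raise ConnectionError(f"{zone} is unavailable")
    return store[key]


def read_with_failover(key, zones=("zone-a", "zone-b", "zone-c")):
    """Point 2: try each zone in turn until one returns the data."""
    for zone in zones:
        try:
            return read_from_zone(zone, key)
        except ConnectionError:
            continue
    raise ConnectionError("all zones failed")


def call_with_timeout(func, timeout_s):
    """Fail fast: raise if func has not answered within timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(func).result(timeout=timeout_s)
    finally:
        # Do not wait for a slow call; its worker thread finishes in the background.
        pool.shutdown(wait=False)


def slow_feature():
    time.sleep(0.6)  # simulates a dependency that is dying or overloaded
    return ["Personalised result"]


# (name, function, fallback). A fallback of None marks a non-critical
# feature that is simply left off the page when it fails or is slow.
FEATURES = [
    ("watchlist", lambda: read_with_failover("user-1"), []),
    ("recommendations", slow_feature, ["Popular title"]),
    ("trailer_autoplay", slow_feature, None),
]


def render_page(timeout_s=0.2):
    page = {}
    for name, func, fallback in FEATURES:
        try:
            page[name] = call_with_timeout(func, timeout_s)
        except Exception:
            if fallback is not None:
                page[name] = fallback  # feature fallback
            # no fallback: the feature is removed from the page
    return page


if __name__ == "__main__":
    print(render_page())
```

In this toy run the watchlist is served from a healthy zone, the slow recommendations are replaced by a hard-coded fallback, and the slow, non-critical autoplay feature is left out of the page entirely.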
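
Points 4, 7 and 8 can also be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Chaos Monkey or Netflix's traffic routing actually works: the `Cluster` class, the instance naming scheme and the `instances_needed` helper are hypothetical names invented for this example.

```python
import math
import random


def instances_needed(peak_rps, rps_per_instance, spare=1):
    """Point 4 ("n+1"): capacity for peak traffic plus spare instances."""
    return math.ceil(peak_rps / rps_per_instance) + spare


class Cluster:
    """Toy cluster with a fixed desired number of instances per zone."""

    def __init__(self, zones, instances_per_zone):
        self.desired = instances_per_zone
        self.zones = {
            z: [f"{z}-i{n}" for n in range(instances_per_zone)] for z in zones
        }
        self._next_id = instances_per_zone

    def kill_random_instance(self, rng):
        """Point 7: inject a failure by terminating one random instance."""
        zone = rng.choice([z for z, inst in self.zones.items() if inst])
        victim = rng.choice(self.zones[zone])
        self.zones[zone].remove(victim)
        return victim

    def heal(self):
        """Replace missing instances automatically, with no manual step."""
        for zone, instances in self.zones.items():
            while len(instances) < self.desired:
                instances.append(f"{zone}-i{self._next_id}")
                self._next_id += 1

    def traffic_weights(self, failed_zones=()):
        """Point 8: split traffic evenly across the zones that are still healthy."""
        healthy = [
            z for z in self.zones if z not in failed_zones and self.zones[z]
        ]
        if not healthy:
            raise RuntimeError("no healthy zone left")
        share = 1.0 / len(healthy)
        return {z: (share if z in healthy else 0.0) for z in self.zones}


if __name__ == "__main__":
    cluster = Cluster(["zone-a", "zone-b", "zone-c"], instances_per_zone=3)
    victim = cluster.kill_random_instance(random.Random())
    cluster.heal()
    print("killed:", victim)
    print("weights:", cluster.traffic_weights(failed_zones={"zone-a"}))
```

The point of the exercise is the check that follows the kill: after `heal()` every zone should be back at its desired size, and after a zone failure the remaining zones should share the traffic equally without anyone touching it by hand.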

Some of these things may seem obvious today, but this was 2011, and it must have been extremely challenging.

Their official blog: https://lnkd.in/e6drkRgx
