
Here's an AWS article explaining how root cause analysis is done at Amazon: https://aws.amazon.com/elasticsearch-service/resources/artic...

> The technique consists of asking the question “Why?” iteratively until you get to the root of the problem. Let’s see a quick example:

Problem: The website is showing error 500.

1. Why? Because the web framework’s routing component malfunctioned.

2. Why? Because it requires another component, which itself malfunctioned.

3. Why? Because this component of the web framework requires the intl extension, which isn’t working.

4. Why? Because it was accidentally deactivated after the server software got updated.
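As a rough illustration of the iteration in the quoted example, here is a minimal Python sketch. The `CAUSES` mapping and the `ask_why` helper are hypothetical names made up for this comment, not anything from the AWS article; the sketch assumes each problem has exactly one known cause.

```python
# Hypothetical cause map built from the quoted example:
# each symptom points to the single cause behind it.
CAUSES = {
    "website shows error 500": "routing component malfunctioned",
    "routing component malfunctioned": "a required component malfunctioned",
    "a required component malfunctioned": "intl extension isn't working",
    "intl extension isn't working": "extension deactivated after server software update",
}


def ask_why(problem, causes, max_whys=5):
    """Follow the chain of causes, asking "Why?" at most max_whys times."""
    chain = [problem]
    # Stop when no further cause is known or the question budget is spent.
    while chain[-1] in causes and len(chain) <= max_whys:
        chain.append(causes[chain[-1]])
    return chain


chain = ask_why("website shows error 500", CAUSES)
for depth, step in enumerate(chain):
    print(f"{depth}. {step}")
```

The last entry in `chain` is what the method would call the root cause; here the chain ends after four "Why?" questions because nothing further is recorded.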



Worth noting that there are reasonable criticisms of the 5 Whys method: complex system failures often do not have a single root cause.
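To make that criticism concrete, here is a small Python sketch where a failure has several contributing causes, so asking "Why?" branches instead of forming one chain. The incident data and the `root_causes` helper are entirely hypothetical and only meant as an illustration.

```python
# Hypothetical incident: each effect maps to a list of contributing causes.
CONTRIBUTING = {
    "outage": ["bad deploy", "alert did not fire"],
    "bad deploy": ["missing test"],
    "alert did not fire": ["threshold set too high"],
}


def root_causes(problem, causes):
    """Return every cause reachable from problem that has no recorded cause of its own."""
    roots, seen, stack = set(), set(), [problem]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        parents = causes.get(node, [])
        if not parents:
            roots.add(node)  # nothing further to ask "Why?" about
        stack.extend(parents)
    return roots


roots = root_causes("outage", CONTRIBUTING)
print(sorted(roots))
```

A single linear chain of whys would surface only one of these two endpoints, depending on which branch the person asking happened to follow.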



