After the dust has settled: learning from incidents

We’ve heard of post-mortem documents and incident review meetings, but are they successful? Did we figure out the root cause, or just treat the symptoms? Have we made things more brittle in the process? Might we have to repeat the same steps in a month or three?

This talk will go over some real-life examples of incidents and their remediations to try to answer some of those questions, and will also cover the hardest part: getting the ball rolling.



Ajuna Kyaruzi

Ajuna Kyaruzi is an SRE & DevOps Advocate at Datadog and cares about using software to help people sustainably run large-scale systems, focusing on Incident Management and SLOs. She loves ...