SLOwly Burning Out - Avoiding Common Pitfalls When Setting SLOs

Once a system reaches production, the engineering team holding the pager can focus on the value it provides to customers by setting relevant SLOs and responding to alerts. However, as systems change over time, they may gain more points of failure or grow in complexity, or customers may simply use them differently.

If SLOs aren’t kept up to date, teams can find themselves responding to more and more alerts about issues that are increasingly invisible to customers. Even the best teams can end up firefighting toilsome alerts, with no time left to improve the system as a whole.

In this talk, based on a true story, you will learn about pitfalls encountered when setting SLOs and how they directly affected the day-to-day developer experience of engineers at Red Hat and the systems they worked on. We’ll also discuss how avoiding and climbing out of these pitfalls can bring about a better understanding of the system and reduce burnout.

Speaker

Michael Shen

 
Michael Shen has spent almost his entire life in the Ann Arbor area and has happily been along for the ride on the DevOps movement since starting his career 7 years ago. He currently works as a ...