If you do not have an Error Budget, chances are you are wasting resources chasing a mythical SLA that has no defined business driver.
Site Reliability Engineering practices and tooling can reclaim many of these costs and resources, but only once the organization is aware that the waste exists.
“Five Nines” refers to 99.999% availability, a figure often treated as synonymous with high availability.
Does every highly available service require five nines? Not by a long shot.
Yet the general state of the practice is to chase this typically unrealistic goal almost blindly, often at unnecessarily high cost in both operational and development resources.
Even less aggressive availability goals are often over-specified compared to true business drivers.
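To make the gap between availability targets concrete, here is a minimal sketch of the arithmetic behind an error budget. It assumes a 365-day year and counts every minute of unavailability against the budget; the function name `error_budget_minutes` is a hypothetical helper for illustration, not part of any SRE tool.

```python
# Minutes in a non-leap year (assumption: 365 days).
MINUTES_PER_YEAR = 365 * 24 * 60


def error_budget_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Return the downtime, in minutes, that a given availability target permits."""
    return period_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare two through five nines over one year.
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = error_budget_minutes(target)
        print(f"{target}% allows about {budget:.1f} minutes of downtime per year")
```

Under these assumptions, five nines leaves roughly five minutes of downtime per year, while three nines leaves close to nine hours. Each additional nine shrinks the budget by a factor of ten, which is why the target should follow from a business driver rather than from habit.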
This talk will cover:
Applying these techniques should result in a more cost-effective service that keeps end users and management happy, and in fewer alerts for the on-call DevOps engineer.