One of my fears is that I mess up at my job and that the mistake has enough impact to end up on television. Nearly ten years ago, that fear materialized when a simple network change turned into one of the most stressful hours I have ever experienced. It left such an impression that I can still recall most of the details all these years later. The upside is that I can now do a full postmortem with the hindsight of all the lessons I have learned in the past decade.
In this talk I recount the events of a midnight maintenance session gone wrong. I will share the lessons I learned about resilient network design, about the importance of team members who have your back, and about all the things that went right and wrong that night. And I’ll explain why I am now an advocate for doing maintenance during office hours.