DevOps trends are clear on measuring systems Mean Time To Recovery rather than Mean Time Between Failures. I argue that worrying about time between failures actually causes more harm than worrying about recovery. But do we think of our human systems the same way as our digital? I’ll apply lessons learned in SysOps to HumanOps.
I’ll talk about how our complex social systems act like complex computer systems and how focusing on MTTR rather than MTBF is a good thing between people, not just machines. I'll cover the environmental requirements for focusing on MTTR and discuss potential conflict resolution steps for a jumping off point in your organization or community.
