Silent, but Deadly - Production End-to-End Testing

Microservices are hard. Distributed systems are harder. Knowing that customers are facing issues before they call your support team seems nearly impossible. What's worse, these failures can persist for long periods of time while all conventional testing patterns indicate things are O.K. Detecting these failures early is mission-critical.

The journey from no monitoring at all, to "that bash script engineer X wrote," to a system of containers routinely checking system health has spanned half a decade.

This talk will explore how we detect complex failures with end-to-end testing using container orchestration. It will also discuss why we do end-to-end testing and dive into the history of end-to-end testing at my company, including what went well, what didn't, and surprises along the way.

Attendees of this talk will rethink how to implement an effective end-to-end testing suite for their infrastructure. They will be presented with tooling and practices to minimize the cognitive load of end-to-end testing, as well as patterns for identifying failures before their customers do.



Peter Kennedy


Canadian software engineer at PagerDuty out of San Francisco, focusing on distributed systems, chaos engineering, and that wonderful intersection between them. Peter has been active in the DevOps community.