Resolving Outages Faster with Better Debugging Strategies




Engineers spend a lot of time building dashboards to improve monitoring, yet when they get paged they still spend a lot of time figuring out what’s going on and how to fix it. Building more dashboards isn’t the solution; using dynamic query evaluation and integrating tracing is.

Speakers

Liz Fong-Jones

  
Liz is a Staff Site Reliability Engineer at Google and works on the Google Cloud Customer Reliability Engineering team in New York. She lives with her wife, metamour, and two Samoyeds in Brooklyn. In ...

Adam Mckaig

  
Adam Mckaig is an SRE at Google in New York, where he looks after a monitoring system. Previously he built things at the New York Times, Bloomberg, and UNICEF. He enjoys C++, which probably says it all.