Automated Root Cause Analysis

At Swisscom, we monitor thousands of applications and microservices across all departments. With ML predictions and anomaly detection, we can detect problems quickly and reduce MTTD. However, in complex systems spread across a large organisation, finding the root cause and minimising MTTR can be like finding a needle in a haystack. This talk will demo how Automated Root Cause Analysis can leverage real-time dependency graphs, RAG and LLMs to help investigators pinpoint which manual action triggered a cascading chain of degradation with customer impact.

Speaker

Melchior Thambipillai

 
Melchior Thambipillai holds a Master's degree in computer science from EPFL. He has worked in monitoring and observability at Swisscom for over five years. He brings experience in ML and big data systems ...