At Swisscom, we monitor thousands of applications and microservices across all departments. With ML predictions and anomaly detection, we can detect problems quickly and reduce MTTD. However, in such complex systems across a vast organisation, finding the root cause and minimising MTTR can be like finding a needle in a haystack. This talk will demo how Automated Root Cause Analysis can leverage real-time dependency graphs, retrieval-augmented generation (RAG) and LLMs to help investigators pinpoint which manual action triggered a cascading chain of degradation with customer impact.
