“See Problems as They Occur” is the first Principle of Feedback in the DevOps Handbook. Reducing operational risk, seeing problems as they occur, and continuously improving IT systems and applications all require making them Observable. But achieving Observability means generating more data than people can reasonably handle. How can we enhance visibility when there is just too much we need to see? Enter the DevOps mantra, “Automate Everything You Can”!
In this session, we will delve into AIOps as the answer to the Observability dilemma. We will start by surveying the variety of data that is required to enable Observability and the automation that is necessary to generate and capture it.
We will then look at how automation can help us use that data, not just to see problems as they occur, but also to get out ahead of them. We will discuss using Machine Learning to make sense of vast amounts of IT operational data and separate what is actionable from what is merely interesting.
And finally, we will look at driving toward artificial intelligence: automation that can correct and even prevent problems without human intervention. Self-healing systems (an aspiration we’ve had for decades) are finally coming within reach!