Operating within Normal Parameters: Monitoring Kubernetes

After Kubernetes takes over your data centers, how can you be sure that it’s operating within normal parameters? What does “normal” even mean? By formalizing your expected quality of service, you can measure it and compare it against known targets with open source tools like Prometheus. In this talk, we’ll use Kubernetes as a case study for introducing service level objectives (SLOs) to guide monitoring efforts. Come learn how and why to select metrics for monitoring Kubernetes quality of service, what gaps exist in the open source Kubernetes monitoring ecosystem, how to use Prometheus and its exporters to establish predictability and “normal” baselines, and how to use this telemetry to debug service degradations in a Kubernetes cluster.
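
As a rough illustration of what formalizing an expected quality of service can look like, the sketch below expresses an SLO as a target success ratio and computes how much of the resulting error budget remains. It is not taken from the talk: the names `SLO`, `compliance`, and `error_budget_remaining` are hypothetical, and the request counts are assumed to come from a monitoring system such as Prometheus.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """Hypothetical SLO definition: a name and a target success ratio."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests should succeed


def compliance(good: int, total: int) -> float:
    """Fraction of good events; an empty window counts as fully compliant."""
    return 1.0 if total == 0 else good / total


def error_budget_remaining(slo: SLO, good: int, total: int) -> float:
    """Share of the error budget left: 1.0 is untouched, 0.0 or less is exhausted."""
    allowed_bad = (1.0 - slo.target) * total
    if allowed_bad == 0:
        # No failures are permitted (or no traffic was seen).
        return 1.0 if good == total else 0.0
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Assumed example numbers, not measurements from a real cluster.
    api_slo = SLO(name="apiserver-availability", target=0.999)
    print(compliance(99_950, 100_000))
    print(error_budget_remaining(api_slo, 99_950, 100_000))
```

Under these assumptions, "normal" becomes a number to compare against: a window that stays at or above the target is within parameters, and the remaining budget shows how close the service is to breaching it.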

Speaker

Elana Hashman

Red Hat

Elana Hashman currently works for Red Hat as a Principal Site Reliability Engineer, serving as a technical lead on the Azure Red Hat OpenShift managed service. She is a member of the

...