Designing Pragmatic Observability that Works: Avoiding Pitfalls

Running production workloads that fulfil business requirements is a non-trivial task. Whether you are a SysAdmin, DevOps engineer, SRE, Platform Engineer or Developer, you need robust observability to understand how the workloads you are responsible for behave in their target environment.

The industry has developed many solutions and methodologies for observing and monitoring software. As a result, we have a variety of open-source projects and standards, and an even larger number of vendors that can help you with observability tooling. Unfortunately, more choices are not always better. Today the observability space can be very confusing, especially when combined with ever-changing requirements and the different kinds of workloads you might want to observe. For example, think about functions in a serverless world, containers on managed Kubernetes, virtual machines in the cloud or just bare-metal servers.

In this talk, you will learn:

What matters when designing an observability system for cloud-native workloads as well as monoliths.
How to avoid buzzwords, enormous bills, disruption and other pitfalls.
Trends and solutions.

All explained by Bartek, Principal Software Engineer @ Red Hat in the Observability Group, CNCF TAG Observability Tech Lead, core maintainer of CNCF open-source projects like Prometheus and Thanos, and O’Reilly author. Learn which observability patterns typically work (and which work less well), in a straightforward way, from a fellow engineer.

Speaker

Bartłomiej Płotka

Bartek Plotka is a Principal Software Engineer at Red Hat with a background in SRE, currently working on OpenShift monitoring. He is the co-author and core maintainer of the Thanos project, ...