Driving Reliability at Scale with Platform Engineering and SRE

Site reliability engineering (SRE) is a discipline that originated at Google and is now widely practiced across the tech industry. SRE is a set of principles and practices that apply aspects of software engineering to IT infrastructure and operations.

Codifying SRE principles and practices into a production platform helps organizations standardize and scale with less overhead and toil. The key challenge remains: how to shift an organization’s culture to embrace the platform and reap its benefits.

In this talk, we will discuss the key principles and practices of SRE and how they can be used to build high-performance software and teams. We’ll explore insights from the State of DevOps Report and how SRE can help foster the type of generative organizational culture that is a hallmark of high-performing organizations. We’ll also discuss how culture shift is really about building confidence, and how learning programs can be a strategic vehicle for achieving it.

Takeaways:

* Learning drives culture: Learning drives confidence, confidence drives behavior, and behavior repeated over time drives culture.
* Quantifiable Results: Case studies demonstrate the tangible impact of SRE-based training on confidence and culture, leading to improved outcomes.
* Practical Applications: Learn how to apply SRE principles to your training program, including setting clear objectives, creating relevant and engaging content, and leveraging monitoring and feedback to optimize the learning experience.

Speaker

Jennifer Petoff

Jennifer Petoff is Director of Google Cloud Platform (GCP) & Technical Infrastructure (TI) Education and is based in Lisbon, Portugal. She leads training programs for Google’s GCP and TI

...