Fitting Site Reliability & Developer Experience Into Team Priorities

This talk will focus on how you can use principles & metrics from Site Reliability teams and Developer Experience teams to create a culture of technical excellence, high velocity, and psychological safety.

When thinking about metrics that guide technical & team strategy, there are two underlying priorities that should ideally align towards one goal: system health and engineering team health. System health metrics provide insight into how stable the user experience is, whereas engineering team health metrics provide insight into the experience of the engineers building these systems.

Site Reliability Engineering (SRE) defines the guiding principles and processes for ensuring system health, whereas Developer Experience (DUX) is less about the actual system and more about the tools, processes, and productivity levels related to the development cycle of that system.

Some notes I’ll touch upon in the talk include:

1.1.2. Event Logs & Traces: Logs and traces across different levels of our system let us thoroughly understand the specifics of what’s happening in it, which helps us improve system performance. Granularity is also important here, since debugging issues requires thorough analysis of event logs and log metrics.
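As a rough illustration of the granularity point, here is a minimal sketch of structured event logging in which every event from one request carries the same trace identifier, so events from different levels of the system can be correlated later. The logger name, the `trace_id` and `span` fields, and the `handle_request` function are hypothetical and chosen only for this example; it uses only Python's standard `logging` module, not any particular tracing product.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        event = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Hypothetical fields supplied through the `extra` argument.
            "trace_id": getattr(record, "trace_id", None),
            "span": getattr(record, "span", None),
        }
        return json.dumps(event)


def build_logger(stream):
    """Create a logger that writes JSON events to the given stream."""
    logger = logging.getLogger("checkout-example")  # hypothetical name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger):
    """Emit events from two levels of the system under one trace id."""
    trace_id = uuid.uuid4().hex
    logger.info("request received", extra={"trace_id": trace_id, "span": "api"})
    logger.debug("query executed", extra={"trace_id": trace_id, "span": "db"})
    return trace_id


stream = io.StringIO()
trace_id = handle_request(build_logger(stream))
events = [json.loads(line) for line in stream.getvalue().splitlines()]
```

Filtering `events` by `trace_id` then reconstructs what happened to a single request across the API and database levels.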

1.2.1 Better Insights: These different data pipelines should help us determine where bottlenecks or hotspots exist within our application, so that we can define metrics and prioritization frameworks efficiently.

1.3 Processes for Continuous Improvement

1.3.2 Stability vs Feature Velocity: Release engineering will raise questions about how we should balance feature development against keeping our systems stable. This is a trade-off we’ll need to decide on and be aligned on as a team. The balance can always be re-evaluated, but we should make it transparent and have a process for revisiting it.
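One common SRE way to make this balance transparent is an error budget, which is offered here only as an illustrative assumption and not necessarily the process this talk proposes. The sketch below computes how much of the budget implied by an availability target is left and turns it into simple guidance. The function names, the 99.9% target, and the 25% threshold are all hypothetical.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # Failures the target tolerates over this window of requests.
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


def release_guidance(remaining, threshold=0.25):
    """Map remaining budget to guidance; the threshold is a team decision."""
    return "ship features" if remaining > threshold else "prioritize stability"


# Hypothetical window: 1,000,000 requests against a 99.9% target.
healthy = error_budget_remaining(0.999, 1_000_000, 250)
strained = error_budget_remaining(0.999, 1_000_000, 900)
```

Because the threshold is an explicit parameter, revisiting the trade-off amounts to changing one agreed number rather than renegotiating from scratch.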

2.5.1. While there will likely be owners responsible for ensuring we have the processes and metrics iteration needed to meet SRE metrics, shifting towards an SRE culture provides both alignment and an opportunity for us to collaborate as a team.

Speaker

Lesley Cordero

Lesley Cordero is currently a tech lead at an edtech company, Teachers Pay Teachers. She has spent the majority of her career as an engineer on edtech teams, including at Google for Education and at edtech startups.