We’ve all been there: you’ve made sprint commitments, projects are in flight and waiting on you, and you are still managing the consequences of last week’s high-criticality outage. Then a new onboarding cohort starts, and your team is overwhelmed by operational work, turning on new accounts and resolving access issues.
Balancing operational work is a struggle for any team, but it is a particular struggle for SRE, where operational workload is expected to take up a large, ongoing share of work time. Left unchecked, it will overwhelm project work and eclipse the team’s ability to write automation or solve problems creatively, leading to more toil and eventually regressing the team from a DevOps approach back towards an operations/infrastructure role.
In this session I’ll discuss lessons from several prior teams about how to manage and balance that load, along with some key underlying principles around team structure and work intake. I’ll discuss how I’ve applied Team Topologies structures to ensure work streams don’t get accidentally crossed, and what I’ve learned about the role of an SRE manager in ensuring that your team can account for both operational and project work.