Data Science for Effective Operations

Abstract

Gathering telemetry data is key to operating reliable distributed systems at scale. Data Science is the art of extracting information from large amounts of data. In this workshop, we will cover a range of analysis methods that are known to work well for IT operations data. You will deepen your understanding of the inner workings of these methods, and get to apply them in a modern data science environment based on the Python/Jupyter toolchain.

Format

The course will be split 50/50 into lectures and labs. In the lectures, I'll explain mathematical aspects on the whiteboard, walk through some Jupyter notebooks and demo some live examples from our own monitoring. In the labs, you will get your hands dirty analysing provided datasets with Python/Jupyter yourself.

Topics include

Visualising Data
Mean Values
Deviation Measures
Outliers
Percentiles
Histograms
Regressions

Learning Goals

Get a good overview of how to extract value from operations data
Gain a deeper understanding of the aggregation and analysis methods commonly found in monitoring tools (averages, percentiles, regressions), and of their pitfalls
Get started with the Python/Jupyter data science toolchain
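
To give a flavour of the pitfalls mentioned in the learning goals, here is a minimal sketch of how a single outlier distorts an average while the median barely moves. The latency values are made up for illustration and are not taken from the course datasets. The sketch uses only the Python standard library.

```python
import statistics

# Hypothetical request latencies in milliseconds (made-up data).
# Nine requests are fast; one slow request is an outlier.
latencies_ms = [12, 14, 15, 13, 16, 14, 15, 13, 14, 2000]

# The mean is pulled far away from the typical request by the outlier.
mean_ms = statistics.mean(latencies_ms)

# The median (50th percentile) stays close to the typical request.
median_ms = statistics.median(latencies_ms)

# quantiles(n=100) returns the 99 cut points between percentiles;
# index 89 is the 90th percentile. The "inclusive" method treats the
# data as the full population and interpolates between data points.
cut_points = statistics.quantiles(latencies_ms, n=100, method="inclusive")
p90_ms = cut_points[89]

print(f"mean   = {mean_ms:.1f} ms")
print(f"median = {median_ms:.1f} ms")
print(f"p90    = {p90_ms:.1f} ms")
```

In this made-up sample, the mean suggests that a typical request takes over 200 ms, although nine out of ten requests finish in 16 ms or less. Note also that percentile values depend on the interpolation method, so different tools may report slightly different numbers for the same data.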

Intended Audience

Developers with mathematical interest
Operations engineers with mathematical interest
People interested in learning data science by playing with operations data

Expected Prior Knowledge

Read and write Python code
Interest in mathematics. You will need only basic arithmetic for the exercises, but you will have to endure more complex calculations presented on the whiteboard.

Warning

This course contains mathematics