If you have a data science team working with you and haven’t yet been asked to provide them with a production GPU (Graphics Processing Unit) cluster, it will surely happen soon, and you’ll want to know how to respond and what it will take from you. In this talk I’ll give an introduction to GPUs based on my experience building a production auto-scaling GPU cluster for deep learning model inference. We’ll cover what a GPU is, why it matters, and what it means to scale, monitor, and develop on this exciting infrastructure.