Trino: The Data Synthesizer

TALK ABSTRACT

Have you ever heard of the synthesizer? It’s an electronic musical instrument that generates audio signals and produces a wide variety of sounds by creating, modifying, or combining different tones. Sounds cool, right? When it brings together sounds from multiple sources, the result is something beautiful.

In today’s day and age, data is each organization’s most precious commodity, and it is usually sprinkled across various storage solutions with no rhyme or reason as to why it landed there in the first place. As the cloud migration movement continues to gain traction, there are more ways than ever for data to end up scattered. Bringing this data together is not trivial and usually requires hours of development time. But what if you had a synthesizer for your data?

I have excellent news: I’ve found you one. Trino is a massively parallel processing (MPP) federated query engine that lets you combine your data where it currently lives and serves as a single point of access for querying all of it. Just as a synthesizer does with sound, Trino can bring together and combine multiple data sources to create a beautiful result. Trino uses ANSI SQL to perform this query federation, so there is no proprietary language to learn before you start synthesizing.
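To make the idea of query federation concrete, here is a minimal sketch of a single ANSI SQL statement that joins tables living in two different systems. The catalog, schema, table, and column names (`postgresql.public.customers`, `hive.sales.orders`, and so on) are hypothetical, as are the host, port, and user defaults; the sketch assumes the open source `trino` Python client is installed and that both catalogs are configured on the cluster.

```python
# A federated query: each table is addressed as catalog.schema.table,
# so one statement can span two different storage systems.
# All catalog, schema, table, and column names here are hypothetical.
FEDERATED_QUERY = """
SELECT c.customer_id, c.name, SUM(o.total) AS lifetime_value
FROM postgresql.public.customers AS c
JOIN hive.sales.orders AS o
  ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY lifetime_value DESC
LIMIT 10
"""


def run_federated_query(host="localhost", port=8080, user="analyst"):
    """Send the query to a Trino coordinator and return all result rows.

    The host, port, and user defaults are placeholders for illustration.
    """
    # Imported lazily so this module loads even without the client installed.
    from trino.dbapi import connect  # assumed: pip install trino

    conn = connect(host=host, port=port, user=user)
    cur = conn.cursor()
    cur.execute(FEDERATED_QUERY)
    return cur.fetchall()
```

Calling `run_federated_query()` only works against a running Trino coordinator that has both catalogs configured. The point of the sketch is that the join across systems is ordinary SQL, with no data copied between databases beforehand.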

Here are some of the other Trino use cases:

- Interactive data analytics, where users can use ANSI SQL to query multiple data sources and decrease time to insight
- High-performance analytics (data lake analytics)
- Batch ETL processing across disparate systems

Why should you care? Well, developer efficiency is the name of the game. Instead of spending their time migrating data from one database to another just to fulfill a business requirement for a report, data engineers and infrastructure engineers can spend it on new development or on answering pertinent questions.

Trino runs wherever Kubernetes runs, and Kubernetes helps Trino scale up and down and supports worker failover. With these two working together, your data can be synthesized in no time.

Speaker

Monica Miller

Former Data Engineer turned Developer Advocate now trying to make the lives of other data engineers easier. Dog mom. Reality TV aficionado. Passionate about Couch Driven Development.