37. Sean Knapp - The brave new world of data engineering

Published: June 10, 2020, 2:57 p.m.

There’s been a lot of talk in data science circles about techniques like AutoML, which are dramatically reducing the time it takes for data scientists to train and tune models, and create reliable experiments. But that trend towards increased automation, greater robustness and reliability doesn’t end with machine learning: increasingly, companies are focusing their attention on automating earlier parts of the data lifecycle, including the critical task of data engineering.

Today, many data engineers are unicorns: they not only have to understand the needs of their customers, but also how to work with data, and what software engineering tools and best practices to use to set up and monitor their pipelines. Pipeline monitoring in particular is time-consuming, and just as important, isn’t a particularly fun thing to do. Luckily, people like Sean Knapp, a former Googler turned founder of data engineering startup Ascend.io, are leading the charge to make automated data pipeline monitoring a reality.

We had Sean on this latest episode of the Towards Data Science podcast to talk about data engineering: where it’s at, where it’s going, and what data scientists should really know about it to be prepared for the future.