#22 Why testing data pipelines can be so challenging - and how to tackle it

Published: Sept. 6, 2024, 2 p.m.

In this episode of the Plumbers of Data Science podcast, I’m diving into why testing can be so challenging for data engineers. The inspiration for this topic actually came from one of my recent Coaching sessions, where the question of test-driven development (TDD) came up during a Q&A. It stuck with me, so I thought it would be a great topic to dive deeper into.

I’ll explain the key benefits of TDD, like improved code quality and easier refactoring, and why, despite its advantages, it’s not widely adopted, especially in fast-paced environments where time constraints dominate. We’ll also talk about the specific challenges data engineers face with TDD, such as handling large, unpredictable datasets, integrating with external systems, and adapting to ever-changing data.