635: The Perils of Manually Labeling Data for Machine Learning Models

Published: Dec. 13, 2022, noon

Hand labeling data and information bias: Jon Krohn speaks with Watchful CEO Shayan Mohanty about the pitfalls of data analysis when bias comes into the equation (spoiler alert: it always does), the importance of the Chomsky hierarchy in data management, and the importance of simulation engines for returning real-time results to users.

This episode is brought to you by Iterative (iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

In this episode you will learn:
• Why bias in general is good [04:06]
• The arguments against hand labeling [09:47]
• How Shayan solves the problem of labeling at his company [24:26]
• Misconceptions concerning hand-labeled data [43:25]
• What the Chomsky hierarchy is [52:38]
• Watchful’s high-performance simulation engine [1:04:51]
• What Shayan looks for in his new hires [1:08:15]

Additional materials: www.superdatascience.com/635