What is data curation and why does it matter?
Episode Synopsis
This week we’re going back down the AI rabbit hole, but we’re venturing down a new tunnel to talk about something called data curation. AI is still a developing technology, but it’s well known by now that models are only as good as the data they’re trained on. For enterprises looking to fine-tune publicly available models, though, it can be a challenge to make sure they’re making the right data available. Why? Well, the vast majority of enterprise data is what is known as unstructured data. That includes anything that doesn’t fit neatly into rows and columns: photos, videos, emails, PDFs, you name it. Enter data curation, which is basically just the process of sorting through all this data to decide what is relevant for training the model and what’s not. Today this is mostly a tedious, manual process. But is it even worth the hassle?

We spoke to Vincent Chen, Director of Product and Founding Engineer at Snorkel AI, to get the lowdown on how data curation works, why it matters and whether the effort pays off.

To learn more about the topics in this episode:
Snorkel AI dives into hot market of data curation: https://www.fierce-network.com/cloud/snorkel-ai-dives-hot-market-data-curation
Data storage gets spicy with help from AI: https://www.fierce-network.com/ai/data-storage-gets-spicy-help-ai
GenAI could illuminate decades worth of dark data: https://www.fierce-network.com/cloud/unstructured-data-pandoras-box-genai-its-key

See omnystudio.com/listener for privacy information.