Listen "659: Open-Source Tools for Natural Language Processing"
Episode Synopsis
NLP practitioners: this episode is for you. From an awareness of linguistic elements and annotation to getting the necessary people in the room, Vincent Warmerdam gives Jon Krohn a recipe for a successful project and the open-source NLP tools to get there.
This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• How Vincent came to work with De Speld [08:57]
• Vincent’s role at Explosion [18:59]
• How users can apply spaCy [21:46]
• Prodigy: Annotate training data more efficiently with scripts [26:28]
• How to manage “skill anxiety” with Calmcode [32:32]
• How Vincent fixed bad labels [42:47]
• The value of understanding linguistics for NLP [54:42]
• How to constrain artificial stupidity [1:02:38]
Additional materials: www.superdatascience.com/659