"When AI Cannibalizes Its Data"
Episode Synopsis
Asked ChatGPT anything lately? Talked with a customer service chatbot? Read the results of Google's "AI Overviews" summary feature? If you've used the Internet lately, chances are you've consumed content created by a large language model. These models, like DeepSeek-R1 or OpenAI's ChatGPT, are kind of like the predictive text feature in your phone on steroids. In order for them to "learn" how to write, the models are trained on millions of examples of human-written text. Thanks in part to these same large language models, a lot of content on the Internet today is written by generative AI. That means that AI models trained nowadays may be consuming their own synthetic content ... and suffering the consequences.

View the AI-generated images mentioned in this episode.

Have another topic in artificial intelligence you want us to cover? Let us know by emailing [email protected]!

Listen to every episode of Short Wave sponsor-free and support our work at NPR by signing up for Short Wave+ at plus.npr.org/shortwave.