Listen "AI Positive"
Episode Synopsis
On the premiere episode of the AI Inside podcast, hosts Jeff Jarvis and Jason Howell talk with Rich Skrenta of the Common Crawl Foundation about AI copyright issues, in particular news outlets limiting access to content they publish publicly, which affects the integrity of Common Crawl's internet archive. In recent years, the archive has been used as training data for LLMs, and restricting access to information has a dramatic impact on the quality of the data that survives.
INTERVIEW
Introduction and background on AI Inside podcast
Discussion of the recent AI oversight Senate hearing Jeff testified at
Introduction of guest Rich Skrenta from Common Crawl Foundation
Overview of Common Crawl and its goals to archive the open web
Discussion of how Common Crawl data is used to train AI models
News publishers wanting content removed from Common Crawl
Debate around copyright, fair use, and AI’s “right to read”
Mechanics of how Common Crawl works and what it archives
Concerns about restricting AI access to data for training
Risk of regulatory capture and only big companies being able to use AI
Discussion of recent court ruling related to web scraping
Hopes for Common Crawl's growth and evolution
NEWS BITES
Interesting device announcement from CES - Rabbit R1 with Perplexity AI integration
Study on actual risk of AI automating jobs away in the near future
Learn more about your ad choices. Visit megaphone.fm/adchoices