Web Scraping Woes For Gen AI Platforms

17/10/2023 9 min

Episode Synopsis

In this episode, hosts Andrew Davis and Chris Branch discuss web scraping, an essential process for training generative AI models. They look at the recent trend of websites locking down their backends to prevent data scraping, notably the BBC's move to block third-party platforms from accessing its content. The conversation then turns to the risk that information becomes siloed and what that would mean for AI development, drawing a parallel with the echo chambers seen on social media platforms. The hosts consider the broader implications, comparing the situation to the subscription model dilemma faced by streaming services, and stress that diverse data is needed for a more impartial AI. They close by noting that the outcome is uncertain and that data accessibility for AI is still evolving.

More episodes of the podcast In A(i) Nutshell

Fantasy Football To Police Officers Turning Into Frogs (AI News) 09/01/2026

My 2026 New Year Resolutions 08/01/2026

My Big Picture AI 2026 Predictions 07/01/2026

Predicting Cool Tool Features For 2026 06/01/2026

What I'm Looking Forward To In AI This January 05/01/2026

Winners, Losers and Thank You 30/12/2025

The Top 3 AI Tools I Have Been Using In December 29/12/2025

What I Would Like Santa To Get Me For Christmas 24/12/2025

My Top 5 Tools Of 2025 23/12/2025

My Top 5 Highlights Of 2025 22/12/2025
