How a Moonshot Led to Google DeepMind's Veo 3

16/10/2025 48 min Episode 16
Episode Synopsis

Dumi Erhan, co-lead of the Veo project at Google DeepMind, joins host Logan Kilpatrick for a deep dive into the evolution of generative video models. They discuss the journey from early research in 2018 to the launch of the state-of-the-art Veo 3 model with native audio generation. Learn about the technical hurdles in evaluating and scaling video models, the challenges of long-duration video coherence, and how user feedback is shaping the future of AI-powered video creation.

Chapters:
0:00 - Intro
0:47 - Veo project's beginnings
3:02 - Veo's origins in Google Brain
5:07 - Video prediction and robotics applications
7:45 - Early progress and evaluation challenges
10:30 - Physics-based evaluations and their limitations
12:18 - The launch of the original Veo model
14:06 - Scaling challenges for video models
16:02 - The leap from Veo 1 to Veo 2
19:40 - Veo 3’s viral audio moment
21:17 - User trends shaping Veo's roadmap
23:49 - Image-to-video vs. text-to-video complexity
26:00 - New prompting methods and user control
27:55 - Coherence in long video generation
31:03 - Genie 3 and world models
35:54 - The steerability challenge
41:59 - Capability transfer and image data's role
47:25 - Closing