How DeepSeek Is Beating OpenAI at Their Own Game—On a Budget
Episode Synopsis
In this episode of AI Odyssey, we unpack how DeepSeek's open-source models are shaking up the AI world by matching GPT-level performance at a fraction of the cost. Drawing on insights from the research paper by Chengen Wang (University of Texas at Dallas) and Murat Kantarcioglu (Virginia Tech), we explore DeepSeek's secret sauce: memory-efficient Multi-Head Latent Attention, an evolved Mixture of Experts architecture, and reinforcement learning without supervised data. Oh, and did we mention they trained this monster on a $ave-the-GPU budget?

From hardware-aware model design to the surprisingly powerful GRPO algorithm, this episode decodes the magic that's making DeepSeek-V3 and R1 the open-source giants to watch. Whether you're an AI enthusiast or just want to know who's giving OpenAI and Anthropic sleepless nights, you don't want to miss this.

Crafted with help from Google's NotebookLM.

Read the full paper here: https://arxiv.org/abs/2503.11486
More episodes of the podcast AI Odyssey
Will Your Next Prompt Engineer Be an AI?
01/11/2025
Beyond the AI Agent Builders Hype
11/10/2025
AI That Quietly Helps: Overhearing Agents
04/10/2025
AI's Guessing Game
20/09/2025
From Search Buddy to Personal Agent
13/09/2025