Listen "Building real-time voice applications with Live API"
Episode Synopsis
Shrestha Basu Mallick, one of the product leads for the Gemini API, joins host Logan Kilpatrick for a deep dive into the Gemini Live API, Google’s real-time, multimodal interface for developers. Learn how native audio, alongside new capabilities like proactive audio and async function calling, unlocks the unique power of audio as an interface.

Watch on YouTube: https://www.youtube.com/watch?v=4xlwlU6h-wM

0:00 - Intro
1:18 - Live API Overview
3:36 - Why audio is a special modality
5:07 - Speed vs. precision in audio
6:17 - Controllable and promptable TTS
8:31 - What developers are building with the Live API
11:14 - URL context and async calling features
15:02 - Proactive audio and affective dialog
16:55 - Addressing developer feedback
21:54 - Live API roadmap
23:49 - The role of long context
24:57 - What’s next for the Live API
26:41 - State of the AI audio market
30:10 - Advice for developers getting started with the Live API
31:16 - Live API demo
38:10 - Demo wrap up and closing
More episodes of the podcast Google AI: Release Notes
Gemini 3 and Gen UI in Google Search
18/12/2025
Gemini 3: Launch day reactions
25/11/2025