Building real-time voice applications with Live API

06/08/2025 40 min Episode 12

Episode Synopsis

Shrestha Basu Mallick, one of the product leads for the Gemini API, joins host Logan Kilpatrick for a deep dive into the Gemini Live API, Google’s real-time, multimodal interface for developers. Learn how native audio, alongside new capabilities like proactive audio and async function calling, unlocks the unique power of audio as an interface.

Watch on YouTube: https://www.youtube.com/watch?v=4xlwlU6h-wM

0:00 - Intro
1:18 - Live API Overview
3:36 - Why audio is a special modality
5:07 - Speed vs. precision in audio
6:17 - Controllable and promptable TTS
8:31 - What developers are building with the Live API
11:14 - URL context and async calling features
15:02 - Proactive audio and affective dialog
16:55 - Addressing developer feedback
21:54 - Live API roadmap
23:49 - The role of long context
24:57 - What’s next for the Live API
26:41 - State of the AI audio market
30:10 - Advice for developers getting started with the Live API
31:16 - Live API demo
38:10 - Demo wrap up and closing