Building Enterprise RAG: Lessons from 2+ Years of Production Deployments

01/07/2025 37 min Episode 4

Episode Synopsis

Building production AI systems is hard, especially when you're pioneering entirely new categories. In this episode, Yuval speaks with Guy Becker, Group Product Manager at AI21, to trace the evolution from task-specific models to agent planning and orchestration systems. Guy shares hard-won lessons from building some of the first RAG-as-a-service offerings at a time when there were no handbooks to follow.
Key Topics:
Task-specific models vs. general LLMs: Why focused, smaller models with pre- and post-processing beat general-purpose LLMs for business use cases.
Building RAG before it was cool: Creating one of the first RAG-as-a-service platforms in early 2023 without any established patterns.
The one-size-fits-all problem: Why chunking strategies, embedding models, and retrieval parameters need customization per use case.
From SaaS to on-prem: Scaling deployment models for enterprise customers with sensitive data.
When RAG breaks down: Multi-hop queries, metadata filtering, and why semantic search isn't always enough.
Multi-agent orchestration: How AI21 Maestro uses automated planning to break complex queries into parallelizable subtasks.
Production lessons: Evaluation strategies, quality guarantees, and building explainable AI systems for enterprise.

More episodes of the podcast YAAP (Yet Another AI Podcast)

The House That Builds Builders – The Origin Story of AGI House 11/11/2025

Scraping Without Getting Sued (Or Falling Asleep) 28/10/2025

The Judge Model Diaries: Judging the Judges 26/08/2025

RLVR Lets Models Fail Their Way to the Top 12/08/2025

RAG Is Not Solved – Your Evaluation Just Sucks 29/07/2025

The Call Is Coming From Inside the Agent (And It Has Your Credentials) 15/07/2025

Trailer 19/06/2025

You Can’t Have an Agent Without a Plan: What 90% of ’Agents’ Are Missing 17/06/2025

The Hard Truths About AI Agents: Why Benchmarks Lie and Frameworks Fail 10/06/2025

Tool Calling 2.0: How MCP Is Standardizing AI Connections 29/05/2025
