Benchmarking Domain Intelligence | Data Brew | Episode 45
Episode Synopsis
In this episode, Pallavi Koppol, Research Scientist at Databricks, explores the importance of domain-specific intelligence in large language models (LLMs). She discusses how enterprises need models tailored to their unique jargon, data, and tasks rather than relying solely on general benchmarks.

Highlights include:
- Why benchmarking LLMs for domain-specific tasks is critical for enterprise AI.
- An introduction to the Databricks Intelligence Benchmarking Suite (DIBS).
- Evaluating models on real-world applications like RAG, text-to-JSON, and function calling.
- The evolving landscape of open-source vs. closed-source LLMs.
- How industry and academia can collaborate to improve AI benchmarking.
More episodes of the podcast Data Brew by Databricks
Multimodal AI | Data Brew | Episode 42
07/04/2025
Age of Agents | Data Brew | Episode 41
27/03/2025
Reward Models | Data Brew | Episode 40
20/03/2025