Evaluate LLM-based chatbots performance [Microsoft]

21/04/2025 8 min Episode 82



Episode Synopsis

In this episode, we explore why evaluating LLM-based chatbots is critical for businesses, where traditional evaluation methods fall short, and what a robust evaluation framework covering both search performance and LLM-specific metrics could look like. For more details, see the Microsoft team's published tech blog post: https://medium.com/data-science-at-microsoft/evaluating-llm-based-chatbots-a-comprehensive-guide-to-performance-metrics-9c2388556d3e

More episodes of the podcast Snacks Weekly on Data Science

Product Recommendations with LLMs and Word2Vec [CVS Health] 12/01/2026

Building AI Agents at Airtable [Airtable] 05/01/2026

Quick Thoughts and Reflections at the End of 2025 29/12/2025

Real-time Spatial and Temporal Forecasting [Lyft] 22/12/2025

GenAI Solution for Invoice Document Processing [Uber] 15/12/2025

Optimize Web Performance [Walmart] 08/12/2025

Understanding Metric Movement with Root Cause Analysis [Pinterest] 01/12/2025

Improving Search Ranking for Maps [Airbnb] 24/11/2025

Out-of-Stock Product Recommendations with Machine Learning [Instacart] 17/11/2025

Covariate Selection in Causal Inference [Booking.com] 10/11/2025



