Listen "Reward Models | Data Brew | Episode 40"
Episode Synopsis
In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on reward models and Reinforcement Learning from Human Feedback (RLHF).

Highlights include:
- How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes.
- Techniques like Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO) for enhancing response quality (see the sketch after this synopsis).
- The role of reward models in improving coding, math, reasoning, and other NLP tasks.

Connect with Brandon Cui: https://www.linkedin.com/in/bcui19/
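As a rough illustration of the DPO objective mentioned in the highlights above, the sketch below computes the DPO loss from sequence-level log-probabilities of a "chosen" and a "rejected" response under the policy and a frozen reference model. The function name, tensor shapes, and the beta value are illustrative assumptions, not details from the episode.

```python
# Minimal DPO-loss sketch (assumed setup, not from the episode).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss: nudge the policy to prefer the chosen response over the
    rejected one, measured relative to a frozen reference model."""
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), averaged over the batch of preference pairs
    return -F.logsigmoid(logits).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs
if __name__ == "__main__":
    rand = lambda: torch.randn(4)
    print(dpo_loss(rand(), rand(), rand(), rand()).item())
```

In practice the log-probabilities would come from summing token log-probs of full responses; the sketch only shows the loss itself.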
More episodes of the podcast Data Brew by Databricks
Multimodal AI | Data Brew | Episode 42
07/04/2025
Age of Agents | Data Brew | Episode 41
27/03/2025