MLGym: A New Framework and Benchmark for Advancing AI Research Agents

31/10/2025 1h 28min


Episode Synopsis

AutoML is dead and LLMs have killed it? MLGym is a benchmark and framework that tests this theory. Roberta Raileanu and Deepak Nathani discuss how well current LLMs solve ML tasks, what the biggest roadblocks are, and what that means for AutoML generally.

Check out the paper: https://arxiv.org/pdf/2502.14499

More on Roberta: https://rraileanu.github.io/

More on Deepak: https://dnathani.net/

More episodes of the podcast The AutoML Podcast

Leverage Foundational Models for Black-Box Optimization 22/09/2025

Nyckel - Building an AutoML Startup 07/03/2025

Neural Architecture Search: Insights from 1000 Papers 03/12/2024

Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How 08/08/2024

Discovering Temporally-Aware Reinforcement Learning Algorithms 24/06/2024

X Hacking: The Threat of Misguided AutoML 27/05/2024

Introduction To New Co-Host, Theresa Eimer 26/05/2024

AutoGluon: The Story 05/09/2023

How to Integrate Logic and Argumentation into Human-Centric AutoML 26/06/2023

How to Design an AutoML System using Error Decomposition 03/06/2023

