Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

09/05/2025 15 min

Listen "Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control"

Episode Synopsis

This paper presents the Learn then Test (LTT) framework, a novel approach for calibrating machine learning models so that their predictions come with explicit statistical guarantees. The method works with any underlying model and any data distribution, and it requires no retraining, only a held-out calibration set. LTT reframes the problem of controlling statistical notions of error, such as the false discovery rate, losses based on intersection-over-union, and type-1 error, as a multiple hypothesis testing problem. For each candidate prediction setting (indexed by a parameter λ), the framework computes a p-value for the null hypothesis that the risk at that setting exceeds the target level; applying a family-wise error rate (FWER)-controlling procedure, such as Bonferroni correction or sequential graphical testing, then yields a set of settings whose risk is guaranteed, with high probability, to stay below the desired level. The authors demonstrate the framework's utility across a range of machine learning tasks, including multi-label classification, selective classification, selective regression, outlier detection, and instance segmentation, providing novel, distribution-free guarantees.
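To make the calibration step concrete, here is a minimal sketch (not the authors' implementation) of the selection loop described above. The function name ltt_calibrate and the arguments lambdas and losses are hypothetical; the sketch uses a plain Hoeffding p-value and a Bonferroni correction, whereas the paper also discusses sharper p-values (e.g., Hoeffding-Bentkus) and more powerful FWER procedures such as sequential graphical testing.

```python
import numpy as np

def ltt_calibrate(lambdas, losses, alpha=0.1, delta=0.1):
    """Select prediction settings lambda whose risk is at most alpha,
    jointly with probability at least 1 - delta.

    lambdas : shape (K,) array of candidate settings (e.g., thresholds).
    losses  : shape (K, n) array of losses in [0, 1] on a held-out
              calibration set; losses[j, i] is the loss of setting
              lambdas[j] on calibration example i.
    """
    K, n = losses.shape
    risk_hat = losses.mean(axis=1)  # empirical risk of each candidate lambda

    # Hoeffding p-value for the null H_j: R(lambda_j) > alpha.
    # When risk_hat >= alpha the exponent is 0 and the p-value is 1.
    p_vals = np.exp(-2.0 * n * np.clip(alpha - risk_hat, 0.0, None) ** 2)

    # Bonferroni correction: reject H_j when p_j <= delta / K.
    # Every rejected setting controls the risk at level alpha,
    # simultaneously with probability >= 1 - delta.
    return lambdas[p_vals <= delta / K]
```

Any element of the returned set can then be deployed; in practice one typically picks the least conservative valid setting (for instance, the λ giving the largest prediction sets or the fewest abstentions) since every selected λ carries the same guarantee.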