When can in-context learning generalize out of task distribution?

16/10/2025 · 19 min

Listen "When can in-context learning generalize out of task distribution?"

Episode Synopsis

The paper empirically investigates how the pretraining distribution, and in particular a notion of task diversity, shapes the emergence of in-context learning (ICL), using transformers trained on linear functions. The findings indicate that as task diversity increases, the models undergo a transition from a specialized solution, tuned to the pretraining tasks, to one that generalizes across the entire task space; the same transition also appears in nonlinear regression problems. The authors construct a phase diagram characterizing how task diversity interacts with the number of pretraining tasks, and they also examine the influence of factors such as model depth and problem dimensionality.
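To make the setup concrete, here is a minimal sketch (not the authors' code) of how in-context linear-regression pretraining prompts are typically generated: a fixed pool of task weight vectors, with separate knobs for the number of tasks and for how widely they spread. The helper name `make_pretraining_batch`, the way `diversity` scales the task pool, and all hyperparameters are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pretraining_batch(n_tasks, diversity, dim=8, ctx_len=16, batch=32):
    """Sample in-context linear-regression prompts.

    Each prompt uses one weight vector from a fixed pool of `n_tasks`
    tasks. `diversity` controls how widely the pool spreads around a
    shared base direction (a stand-in for the paper's diversity knob).
    """
    base = rng.standard_normal(dim)
    tasks = base + diversity * rng.standard_normal((n_tasks, dim))  # fixed task pool
    idx = rng.integers(n_tasks, size=batch)             # one task per prompt
    w = tasks[idx]                                      # (batch, dim)
    x = rng.standard_normal((batch, ctx_len, dim))      # context inputs
    y = np.einsum("bld,bd->bl", x, w)                   # noiseless linear targets
    return x, y, w

x, y, w = make_pretraining_batch(n_tasks=64, diversity=0.5)
print(x.shape, y.shape)  # (32, 16, 8) (32, 16)
```

Under this kind of setup, the two axes of the paper's phase diagram correspond to the `n_tasks` and `diversity` arguments: a transformer is pretrained on prompts drawn from the pool and then probed with weight vectors sampled from outside it to test out-of-task-distribution generalization.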
