Where does In-context Learning Happen in Large Language Models?

23/05/2025 · 12 min


Episode Synopsis

This research investigates where task recognition occurs within Large Language Models (LLMs) during in-context learning. By applying layer-wise context masking to several LLMs across Machine Translation and Code Generation tasks, the study identifies a "task recognition" point: a layer beyond which the model no longer needs to attend to the input context. The findings point to potential computational savings from skipping this redundant context processing, and reveal a correspondence between the task recognition point and the layers where parameter-efficient fine-tuning is most effective. The paper characterizes in-context learning as a three-phase process, examines the respective roles of instructions and examples, and suggests that task recognition happens primarily in the middle layers of the network.
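The layer-wise context masking probe described above can be sketched in miniature. The toy stack below (hypothetical, not the paper's actual implementation) runs a few self-attention layers over a sequence of "context" tokens followed by "query" tokens; from a chosen layer onward, attention to the context positions is blocked. Comparing outputs as the masking layer varies is the essence of locating the task recognition point: once masking past layer k no longer changes the output, the context has served its purpose by layer k.

```python
import numpy as np

def attention(h, mask_context=False, n_ctx=0):
    """Single-head self-attention; optionally block attention to the
    first n_ctx (context) positions by setting their scores to -inf."""
    scores = h @ h.T / np.sqrt(h.shape[-1])
    if mask_context:
        scores[:, :n_ctx] = -np.inf  # context keys become unreachable
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ h

def run_stack(x, n_layers, n_ctx, mask_from):
    """Run n_layers of residual self-attention, masking attention to the
    context tokens from layer `mask_from` onward (the probe variable)."""
    h = x
    for layer in range(n_layers):
        h = h + attention(h, mask_context=(layer >= mask_from), n_ctx=n_ctx)
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))  # 4 context tokens + 2 query tokens

full = run_stack(x, n_layers=4, n_ctx=4, mask_from=4)   # context never masked
early = run_stack(x, n_layers=4, n_ctx=4, mask_from=1)  # masked after layer 0
```

In the paper's setting, the real model's outputs are compared against the unmasked baseline at each candidate `mask_from`; the earliest layer at which performance is preserved marks where the task has been recognized.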
