Listen "Transformers Represent Belief State Geometry in their Residual Stream"
Episode Synopsis
Produced while an affiliate at PIBBSS[1]. The work was initially funded by a Lightspeed Grant and then continued at PIBBSS. Work done in collaboration with @Paul Riechers, @Lucas Teixeira, @Alexander Gietelink Oldenziel, and Sarah Marzen. Paul was a MATS scholar during part of this work. Thanks to Paul, Lucas, Alexander, and @Guillaume Corlouer for suggestions on this writeup.

Introduction. What computational structure are we building into LLMs when we train them on next-token prediction? In this post we present evidence that this structure is given by the meta-dynamics of belief updating over hidden states of the data-generating process. We'll explain exactly what this means in the post. We are excited by these results because:

- We have a formalism that relates training data to internal structures in LLMs.
- Conceptually, our results mean that LLMs synchronize to their internal world model as they move [...]

The original text contained 10 footnotes which were omitted from this narration.

---

First published: April 16th, 2024

Source: https://www.lesswrong.com/posts/gTZ2SxesbHckJ3CkF/transformers-represent-belief-state-geometry-in-their

---

Narrated by TYPE III AUDIO.
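To make "belief updating over hidden states of the data-generating process" more concrete, here is a minimal sketch, assuming a toy two-state hidden Markov model with a binary token alphabet. The matrices, names, and token sequence below are illustrative assumptions and are not taken from the post: each observed token Bayes-updates a distribution over hidden states, and the trajectory of those distributions through the probability simplex is the kind of belief state geometry the title refers to.

```python
import numpy as np

# Illustrative sketch only: a toy HMM-like data-generating process and
# Bayesian belief updating over its hidden states. The process, matrices,
# and names here are assumptions for illustration, not the post's setup.

# T[x][i, j] = P(next hidden state = j, emit token x | current hidden state = i)
T = {
    0: np.array([[0.5, 0.0],
                 [0.0, 0.4]]),
    1: np.array([[0.0, 0.5],
                 [0.6, 0.0]]),
}

def update_belief(belief, token):
    """Bayes-update the distribution over hidden states after observing `token`."""
    unnormalized = belief @ T[token]
    return unnormalized / unnormalized.sum()

# Reading a token sequence traces a path through belief space; the post argues
# that the geometry of these belief states is represented in the residual stream.
belief = np.array([0.5, 0.5])  # prior over the two hidden states
for token in [0, 1, 1, 0]:
    belief = update_belief(belief, token)
    print(token, belief)
```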