How Transformers Learn Causal Structure with Gradient Descent

28/05/2025 · 14 min

Listen "How Transformers Learn Causal Structure with Gradient Descent"

Episode Synopsis

This research investigates how transformers learn causal structure through gradient descent, with a focus on in-context learning. The authors introduce a novel task in which random sequences are generated from a latent causal graph, and they analyze a simplified two-layer transformer trained on it. They prove that gradient descent on the first attention layer recovers the hidden causal graph: the learned attention pattern encodes a measure of mutual information between tokens. The recovered causal structure then enables in-context estimation of transition probabilities, and the trained model is shown to generalize to out-of-distribution sequences. Experiments on a range of causal graphs support the theoretical findings.
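To make the mechanism concrete, here is a minimal sketch, not the authors' code, of the kind of task and signal the episode describes: each sequence is drawn from a fresh random Markov transition matrix over a fixed latent causal graph, and the true parent of each position is the earlier position with the highest estimated mutual information. The vocabulary size, sequence length, sample count, and tree-shaped graph are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 5       # token alphabet size (illustrative)
SEQ_LEN = 12    # sequence length (illustrative)
N_SEQS = 3000   # number of sampled sequences (illustrative)

# Latent causal graph: each position i > 0 gets one random earlier parent.
parents = [None] + [int(rng.integers(0, i)) for i in range(1, SEQ_LEN)]

def sample_sequence():
    """Draw a fresh random transition matrix, then generate tokens so that
    each position depends only on the token at its parent position."""
    P = rng.dirichlet(np.ones(VOCAB), size=VOCAB)  # row-stochastic transitions
    seq = np.empty(SEQ_LEN, dtype=int)
    seq[0] = rng.integers(VOCAB)
    for i in range(1, SEQ_LEN):
        seq[i] = rng.choice(VOCAB, p=P[seq[parents[i]]])
    return seq

data = np.stack([sample_sequence() for _ in range(N_SEQS)])

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) from paired samples of two discrete variables."""
    joint = np.zeros((VOCAB, VOCAB))
    for a, b in zip(x, y):
        joint[a, b] += 1
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

# For each position, the earlier position with the highest estimated mutual
# information should coincide with its parent in the latent causal graph,
# mirroring the attention pattern the first layer is claimed to learn.
for i in range(1, SEQ_LEN):
    scores = [mutual_information(data[:, j], data[:, i]) for j in range(i)]
    print(f"position {i}: true parent {parents[i]}, recovered {int(np.argmax(scores))}")
```

With enough sampled sequences, the data-processing inequality makes the true parent the mutual-information maximizer among earlier positions, which is the intuition behind the paper's result that the first attention layer attends to causal parents; the paper itself analyzes gradient descent on the transformer rather than this direct estimator.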