Listen "Retrieval Transformer"
Episode Synopsis
The sources describe RETRO (Retrieval-Enhanced Transformer), a language model that improves its performance by retrieving information from a large external database containing trillions of tokens. The database is a key-value store in which keys are BERT embeddings of text chunks and values are the raw text chunks themselves. When processing input, RETRO retrieves the chunks most similar to the input and incorporates them through a chunked cross-attention mechanism. Offloading factual knowledge to the database reduces how much the model must memorize in its parameters, allowing it to perform comparably to much larger models on knowledge-intensive tasks.
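To make the retrieval step concrete, here is a minimal Python sketch of the key-value lookup described above. The embed() function is a hypothetical stand-in for the frozen BERT encoder (it hashes text to a random unit vector, so its neighbours are not semantically meaningful), and RetrievalStore, its example chunks, and the brute-force dot-product search are illustrative assumptions; RETRO itself searches a trillion-token database with an approximate nearest-neighbour index and feeds the retrieved chunks into the decoder via chunked cross-attention.

import hashlib
import numpy as np

EMBED_DIM = 128  # toy dimension; RETRO uses frozen BERT embeddings as keys

def embed(chunk: str) -> np.ndarray:
    # Stand-in for the BERT encoder: hash the chunk to a deterministic
    # unit vector. NOT semantic -- a real encoder would place similar
    # text near each other in embedding space.
    seed = int.from_bytes(hashlib.md5(chunk.encode()).digest()[:4], "little")
    v = np.random.default_rng(seed).standard_normal(EMBED_DIM)
    return v / np.linalg.norm(v)

class RetrievalStore:
    # Key-value store as in the synopsis: keys are chunk embeddings,
    # values are the raw text chunks themselves.
    def __init__(self, chunks):
        self.values = list(chunks)
        self.keys = np.stack([embed(c) for c in self.values])  # (N, EMBED_DIM)

    def retrieve(self, query: str, k: int = 2):
        # Return the k chunks nearest to the query. All vectors are
        # unit-normalised, so a dot product gives cosine similarity.
        sims = self.keys @ embed(query)
        return [self.values[i] for i in np.argsort(-sims)[:k]]

# Toy usage. In RETRO the neighbours are not concatenated to the prompt;
# the decoder attends to them through chunked cross-attention layers.
store = RetrievalStore([
    "Paris is the capital of France.",
    "The mitochondrion is the powerhouse of the cell.",
    "France is a country in Western Europe.",
])
print(store.retrieve("What is the capital of France?"))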
More episodes of the podcast Large Language Model (LLM) Talk
Kimi K2 (22/07/2025)
Mixture-of-Recursions (MoR) (18/07/2025)
MeanFlow (10/07/2025)
Mamba (10/07/2025)
LLM Alignment (14/06/2025)
Why We Think (20/05/2025)
Deep Research (12/05/2025)
vLLM (04/05/2025)
Qwen3: Thinking Deeper, Acting Faster (04/05/2025)