Engram Paper

12/01/2026 17 min

Episode Synopsis

In this episode:

• The Memory Bottleneck: Professor Norris and Linda introduce the paper 'Conditional Memory via Scalable Lookup' and debate the inefficiency of using expensive neural computation to simulate simple knowledge retrieval.
• Engram: N-grams Strike Back: Linda breaks down the 'Engram' module, explaining how it uses hashed N-grams and context-aware gating to inject static embeddings directly into the Transformer backbone (a code sketch of this idea follows the list).
• The U-Shaped Curve of Sparsity: The hosts discuss the 'Sparsity Allocation' problem, analyzing the trade-off between MoE experts and memory capacity, and the discovery that a hybrid approach yields superior results.
• Deepening the Network Without Layers: A discussion of mechanistic analysis, focusing on how Engram handles static patterns such as named entities in early layers, freeing up the model's attention for complex reasoning.
• Prefetching the Future: Linda and Norris explore the system-level advantages of deterministic lookups, including offloading massive embedding tables to CPU memory (a prefetching sketch also follows the list), and conclude the episode.
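
The sketch below illustrates the Engram idea as the synopsis describes it: hash each position's trailing N-gram of token ids into a large static embedding table, then blend the retrieved embedding into the Transformer's hidden state through a context-aware gate. This is a minimal illustration, not the paper's implementation; the class name, table size, rolling hash, and sigmoid gate are all assumptions.

import torch
import torch.nn as nn


class EngramLookup(nn.Module):
    """Hypothetical N-gram memory module: hashed lookup + context-aware gating."""

    def __init__(self, d_model: int, table_size: int = 2**20, ngram: int = 3):
        super().__init__()
        self.ngram = ngram
        self.table_size = table_size
        # Large static embedding table addressed by hashed N-grams.
        self.table = nn.Embedding(table_size, d_model)
        # Context-aware gate: decides per position how much of the lookup to inject.
        self.gate = nn.Linear(d_model, 1)

    def _hash_ngrams(self, token_ids: torch.Tensor) -> torch.Tensor:
        """Hash each position's trailing N-gram into a table index (toy rolling hash)."""
        batch, seq = token_ids.shape
        idx = torch.zeros(batch, seq, dtype=torch.long, device=token_ids.device)
        for offset in range(self.ngram):
            shifted = torch.roll(token_ids, shifts=offset, dims=1)
            shifted[:, :offset] = 0  # mask wrap-around before the sequence start
            idx = (idx * 1000003 + shifted) % self.table_size
        return idx

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # Deterministic lookup: the indices depend only on the input tokens.
        indices = self._hash_ngrams(token_ids)        # (batch, seq)
        retrieved = self.table(indices)               # (batch, seq, d_model)
        # Gate the static memory by the current hidden state, add to the residual stream.
        g = torch.sigmoid(self.gate(hidden))          # (batch, seq, 1)
        return hidden + g * retrieved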
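Because those indices are a pure function of the raw tokens, they can be computed before the forward pass and the needed rows fetched from a table that never leaves host memory. The snippet below is a hypothetical sketch of that prefetch step under those assumptions; the actual system design is only discussed in the episode, not reproduced here.

import torch


def prefetch_engram_rows(table_cpu: torch.Tensor,
                         indices: torch.Tensor,
                         device: str = "cuda") -> torch.Tensor:
    """Gather only the rows the upcoming batch will touch and ship them to the GPU.

    table_cpu: (table_size, d_model) embedding table kept in host memory.
    indices:   (batch, seq) hashed N-gram indices, computable from tokens alone.
    """
    unique_idx, inverse = torch.unique(indices, return_inverse=True)
    rows = table_cpu[unique_idx]                             # small gather on the CPU
    rows = rows.pin_memory().to(device, non_blocking=True)   # async host-to-device copy
    return rows[inverse]                                      # (batch, seq, d_model)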