Engram Paper

12/01/2026 17 min

Episode Synopsis

In this episode:

• The Memory Bottleneck: Professor Norris and Linda introduce the paper 'Conditional Memory via Scalable Lookup' and debate the inefficiency of using expensive neural computation to simulate simple knowledge retrieval.
• Engram: N-grams Strike Back: Linda breaks down the 'Engram' module, explaining how it uses hashed N-grams and context-aware gating to inject static embeddings directly into the Transformer backbone (a code sketch of this idea follows the list).
• The U-Shaped Curve of Sparsity: The hosts discuss the 'Sparsity Allocation' problem, analyzing the trade-off between MoE experts and memory capacity, and the discovery that a hybrid approach yields superior results.
• Deepening the Network Without Layers: A discussion of mechanistic analysis, focusing on how Engram handles static patterns such as named entities in early layers, freeing up the model's attention for complex reasoning.
• Prefetching the Future: Linda and Norris explore the system-level advantages of deterministic lookups, including offloading massive embedding tables to CPU memory (a prefetching sketch also follows the list), and conclude the episode.
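
The sketch below illustrates the Engram idea as the synopsis describes it: hash each position's trailing N-gram of token ids into a large static embedding table, then blend the retrieved embedding into the Transformer's hidden state through a context-aware gate. This is a minimal illustration, not the paper's implementation; the class name, table size, rolling hash, and sigmoid gate are all assumptions.

import torch
import torch.nn as nn


class EngramLookup(nn.Module):
    """Hypothetical N-gram memory module: hashed lookup + context-aware gating."""

    def __init__(self, d_model: int, table_size: int = 2**20, ngram: int = 3):
        super().__init__()
        self.ngram = ngram
        self.table_size = table_size
        # Large static embedding table addressed by hashed N-grams.
        self.table = nn.Embedding(table_size, d_model)
        # Context-aware gate: decides per position how much of the lookup to inject.
        self.gate = nn.Linear(d_model, 1)

    def _hash_ngrams(self, token_ids: torch.Tensor) -> torch.Tensor:
        """Hash each position's trailing N-gram into a table index (toy rolling hash)."""
        batch, seq = token_ids.shape
        idx = torch.zeros(batch, seq, dtype=torch.long, device=token_ids.device)
        for offset in range(self.ngram):
            shifted = torch.roll(token_ids, shifts=offset, dims=1)
            shifted[:, :offset] = 0  # mask wrap-around before the sequence start
            idx = (idx * 1000003 + shifted) % self.table_size
        return idx

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # Deterministic lookup: the indices depend only on the input tokens.
        indices = self._hash_ngrams(token_ids)        # (batch, seq)
        retrieved = self.table(indices)               # (batch, seq, d_model)
        # Gate the static memory by the current hidden state, add to the residual stream.
        g = torch.sigmoid(self.gate(hidden))          # (batch, seq, 1)
        return hidden + g * retrieved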
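Because those indices are a pure function of the raw tokens, they can be computed before the forward pass and the needed rows fetched from a table that never leaves host memory. The snippet below is a hypothetical sketch of that prefetch step under those assumptions; the actual system design is only discussed in the episode, not reproduced here.

import torch


def prefetch_engram_rows(table_cpu: torch.Tensor,
                         indices: torch.Tensor,
                         device: str = "cuda") -> torch.Tensor:
    """Gather only the rows the upcoming batch will touch and ship them to the GPU.

    table_cpu: (table_size, d_model) embedding table kept in host memory.
    indices:   (batch, seq) hashed N-gram indices, computable from tokens alone.
    """
    unique_idx, inverse = torch.unique(indices, return_inverse=True)
    rows = table_cpu[unique_idx]                             # small gather on the CPU
    rows = rows.pin_memory().to(device, non_blocking=True)   # async host-to-device copy
    return rows[inverse]                                      # (batch, seq, d_model)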