Continual Learning via Sparse Memory Finetuning

26/10/2025 · 14 min


Episode Synopsis

The paper, from Meta and Berkeley, proposes sparse memory finetuning, a novel approach to catastrophic forgetting in large language models (LLMs) during continual learning. The method builds on memory layer models, architectures designed for sparse parameter access, and updates only the memory slots that are highly activated by the new data relative to pretraining data, ranked with a TF-IDF-style score. The authors evaluate the technique against full finetuning and parameter-efficient finetuning (LoRA) on question answering tasks, showing that sparse memory finetuning learns new knowledge comparably well while forgetting substantially less of the model's existing capabilities. The findings suggest that sparsity in parameter updates, particularly within memory layers, offers a promising path toward continual knowledge accumulation in LLMs.
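To make the ranking step concrete, here is a minimal PyTorch sketch of TF-IDF-style slot selection as described in the synopsis. This is not the authors' code: the function names (`tfidf_slot_scores`, `sparse_update_mask`), the choice of `t`, and the exact counting and scoring scheme are illustrative assumptions. The idea is to score each memory slot by how often the new data accesses it, discounted by how often pretraining data accesses it, and then build a gradient mask so only the top-t slots are updated.

```python
import torch

def tfidf_slot_scores(new_counts: torch.Tensor,
                      pretrain_counts: torch.Tensor,
                      num_pretrain_batches: float) -> torch.Tensor:
    """Score memory slots: high when the new data accesses a slot often
    and pretraining data rarely does (a TF-IDF analogue)."""
    # Term frequency: the share of accesses each slot receives
    # on the new-data batch.
    tf = new_counts / new_counts.sum().clamp(min=1.0)
    # Inverse "document" frequency: slots that pretraining batches also
    # access often get a low weight, protecting general knowledge.
    idf = torch.log(num_pretrain_batches / (1.0 + pretrain_counts))
    return tf * idf

def sparse_update_mask(new_counts, pretrain_counts,
                       num_pretrain_batches, t=32):
    """Return a 0/1 mask over memory slots; only the top-t scoring
    slots receive gradient updates."""
    scores = tfidf_slot_scores(new_counts, pretrain_counts,
                               num_pretrain_batches)
    mask = torch.zeros_like(scores)
    mask[scores.topk(t).indices] = 1.0
    return mask

# Dummy usage: count slot accesses for the new batch and a background
# sample of pretraining data, then derive the update mask.
num_slots = 1024
new_counts = torch.bincount(
    torch.randint(0, num_slots, (256,)), minlength=num_slots).float()
pretrain_counts = torch.bincount(
    torch.randint(0, num_slots, (4096,)), minlength=num_slots).float()
mask = sparse_update_mask(new_counts, pretrain_counts,
                          num_pretrain_batches=100.0, t=32)
# During finetuning: memory.weight.grad *= mask.unsqueeze(-1)
```

In this sketch, masking the gradient of the memory embedding table before the optimizer step is one plausible way to realize the sparse update; the paper's actual mechanics may differ.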
