The Mapped Memory Mistake: Why DBMSs Should Avoid MMAP

13/08/2025 22 min

Episode Synopsis

This 2022 paper is a reminder of the problems with using mmap() in databases, yet many vector databases today still rely on it. The paper critically evaluates memory-mapped file I/O (mmap) in database management systems (DBMSs) and argues against its perceived benefits over traditional buffer pool implementations. The authors explain that mmap appears to simplify file I/O by letting the operating system (OS) move data between storage and memory, but that it introduces significant correctness and performance issues. They detail problems with transactional safety, I/O stalls, error handling, and performance bottlenecks such as TLB shootdowns, and support these points with experimental analysis. The paper concludes by advising against mmap for most DBMS applications, especially those that require high throughput or transactional safety.

Source: https://db.cs.cmu.edu/papers/2022/cidr2022-p13-crotty.pdf
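To make the contrast in the synopsis concrete, here is a minimal Python sketch of the two approaches. It is not code from the paper: PAGE_SIZE, write_pages, read_page_mmap, and TinyBufferPool are hypothetical names invented for illustration, and the LRU policy is just one simple choice. With mmap, the OS decides when a page is fetched or evicted; with a buffer pool, the DBMS issues the read itself and controls eviction.

```python
import mmap
from collections import OrderedDict

PAGE_SIZE = 4096  # hypothetical page size chosen for this sketch


def write_pages(path, num_pages):
    """Create a file of num_pages pages; page i is filled with byte i % 256."""
    with open(path, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * PAGE_SIZE)


def read_page_mmap(path, page_no):
    """mmap approach: map the file and slice it.

    The OS faults the page in on access, so the caller cannot tell
    in advance whether this access will stall on I/O.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = page_no * PAGE_SIZE
            return bytes(m[start:start + PAGE_SIZE])


class TinyBufferPool:
    """Buffer pool approach: explicit reads plus an LRU cache of pages."""

    def __init__(self, path, capacity):
        self.f = open(path, "rb")
        self.capacity = capacity
        self.frames = OrderedDict()  # page_no -> page bytes, in LRU order

    def get_page(self, page_no):
        if page_no in self.frames:
            self.frames.move_to_end(page_no)  # mark as most recently used
            return self.frames[page_no]
        # Cache miss: the application performs the I/O and sees any error here.
        self.f.seek(page_no * PAGE_SIZE)
        data = self.f.read(PAGE_SIZE)
        if len(data) != PAGE_SIZE:
            raise IOError(f"short read for page {page_no}")
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)  # evict the least recently used page
        self.frames[page_no] = data
        return data

    def close(self):
        self.f.close()
```

Both paths return the same bytes for a given page. The difference is who is in charge: in the buffer pool version the read, the error check, and the eviction decision are all visible in application code, which is the kind of control the synopsis says mmap hands over to the OS.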
