FusionANNS: Billion-Scale ANNS with SSD and GPU

03/09/2025 26 min

Episode Synopsis

This September 2024 paper introduces FusionANNS, a system designed to improve Approximate Nearest Neighbor Search (ANNS) on extremely large datasets. It targets the weaknesses of existing ANNS systems (performance bottlenecks, high operational costs, and limited accuracy), which become most apparent with the billion-scale vector data found in modern AI workloads such as Large Language Models (LLMs). FusionANNS addresses these problems with a cooperative CPU/GPU architecture that combines multi-tiered indexing, heuristic re-ranking, and redundancy-aware I/O deduplication. The paper reports that the system significantly outperforms state-of-the-art SSD-based and GPU-accelerated in-memory ANNS solutions in throughput (queries per second, QPS), cost efficiency, and memory efficiency, while maintaining low latency and high accuracy.

Source: https://arxiv.org/pdf/2409.16576

More episodes of the podcast AI: post transformers