LinkedIn: Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph

08/01/2025 12 min


Episode Synopsis



This research paper describes how LinkedIn optimized low-latency graph computations in its large-scale distributed graph system. To minimize the number of machines needed to process second-degree connection queries, the authors implemented a modified greedy set cover algorithm. The optimization significantly reduced latency in both network cache construction and graph distance calculations, which improved the user experience. The paper also covers the distributed graph architecture, including its partitioning and caching mechanisms, and compares the approach to related work in distributed graph processing. The results show that the modified set cover algorithm is effective for large-scale graph queries in a real-world online environment.
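To make the core idea concrete, here is a minimal sketch of the textbook greedy set cover heuristic applied to machine selection. It is not LinkedIn's modified variant, whose details are in the paper. The function name `greedy_set_cover`, the machine names, and the partition numbers are all hypothetical, chosen only for illustration: each machine hosts a set of graph partitions, and the goal is to pick few machines that together cover every partition a query needs.

```python
from typing import Dict, List, Set


def greedy_set_cover(needed: Set[int], machines: Dict[str, Set[int]]) -> List[str]:
    """Pick machines until every needed partition is covered.

    Textbook greedy heuristic: at each step, choose the machine that
    hosts the most still-uncovered partitions. Ties are broken by
    machine name so the result is deterministic.
    """
    uncovered = set(needed)
    chosen: List[str] = []
    while uncovered:
        # max() returns the first maximal element, so iterating over
        # sorted names breaks ties alphabetically.
        best = max(sorted(machines), key=lambda m: len(machines[m] & uncovered))
        gain = machines[best] & uncovered
        if not gain:
            raise ValueError("some partitions are not hosted on any machine")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical layout: machine name -> partitions it hosts (assumed data).
example_machines = {
    "m1": {1, 2, 3},
    "m2": {3, 4},
    "m3": {4, 5, 6},
    "m4": {1, 6},
}
selected = greedy_set_cover({1, 2, 3, 4, 5, 6}, example_machines)
print(selected)
```

In this toy layout two machines suffice to cover all six partitions, whereas contacting every machine would mean four requests. The greedy heuristic does not guarantee a minimum cover in general, but it is simple and fast, which matters when the selection itself sits on the query path.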

https://www.usenix.org/system/files/conference/hotcloud13/hotcloud13-wang.pdf