Meta Minesweeper: Scalable Statistical Root Cause Analysis on App Telemetry
Episode Synopsis
This research paper introduces Minesweeper, a novel technique for automated root cause analysis (RCA) of software bugs at scale. Leveraging telemetry data, Minesweeper efficiently identifies statistically significant patterns in user app traces that correlate with bugs, even in the absence of detailed debugging information. The method uses sequential pattern mining, specifically the PrefixSpan algorithm, for pattern extraction and incorporates statistical measures of precision and recall to rank patterns by distinctiveness. Practical challenges like handling numeric data and mitigating redundant patterns are addressed, and the system's scalability and accuracy are demonstrated through real-world evaluations on Facebook's app data. The results show Minesweeper significantly improves the speed and accuracy of RCA, aiding engineers in quickly identifying and resolving bugs.
https://arxiv.org/pdf/2010.09974
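As a rough illustration of the approach summarized in the synopsis, the sketch below mines sequential patterns from buggy traces with a simplified PrefixSpan and then scores each pattern by precision and recall against non-buggy traces. It is a minimal sketch under stated assumptions, not the paper's implementation: traces are assumed to be plain lists of event names, the function names (`prefixspan`, `contains`, `rank_patterns`) are hypothetical, and ordering by F1 score is an assumed way of combining precision and recall that may differ from the paper's ranking.

```python
from collections import defaultdict


def prefixspan(traces, min_support):
    """Simplified PrefixSpan over single-event sequences.

    Returns {pattern_tuple: number of traces containing it as a subsequence}
    for every pattern found in at least `min_support` traces.
    """
    results = {}

    def mine(prefix, projected):
        # `projected` holds (trace_index, start_offset) pairs: the suffix of
        # each trace that remains after matching `prefix`.
        counts = defaultdict(set)
        for tid, start in projected:
            for event in set(traces[tid][start:]):
                counts[event].add(tid)
        for event, tids in counts.items():
            if len(tids) < min_support:
                continue
            pattern = prefix + (event,)
            results[pattern] = len(tids)
            # Project each trace past the first occurrence of `event`.
            next_projected = []
            for tid, start in projected:
                try:
                    pos = traces[tid].index(event, start)
                except ValueError:
                    continue
                next_projected.append((tid, pos + 1))
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(traces))])
    return results


def contains(trace, pattern):
    """True if `pattern` occurs in `trace` as an ordered subsequence."""
    remaining = iter(trace)
    return all(event in remaining for event in pattern)


def rank_patterns(bug_traces, ok_traces, min_support=2):
    """Rank patterns mined from buggy traces by how distinctive they are.

    precision = share of traces containing the pattern that are buggy
    recall    = share of buggy traces that contain the pattern
    Sorting by F1 is an assumption made for this sketch.
    """
    ranked = []
    for pattern, bug_hits in prefixspan(bug_traces, min_support).items():
        ok_hits = sum(contains(t, pattern) for t in ok_traces)
        precision = bug_hits / (bug_hits + ok_hits)
        recall = bug_hits / len(bug_traces)
        f1 = 2 * precision * recall / (precision + recall)
        ranked.append((pattern, precision, recall, f1))
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked
```

Calling `rank_patterns(bug_traces, ok_traces)` with two lists of event-name lists returns tuples of `(pattern, precision, recall, f1)`, most distinctive first. A pattern that appears in every buggy trace and in no healthy trace scores 1.0 on both measures.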