Tom and Jamie talk through the postmortems of outages that have affected high-profile sites.
Latest episodes of the podcast The Downtime Project
- 7 Lessons From 10 Outages
- Salesforce Publishes a Controversial Postmortem (and breaks their DNS)
- Kinesis Hits the Thread Limit
- How Coinbase Unleashed a Thundering Herd
- Auth0’s Seriously Congested Database
- Talkin’ Testing with Sujay Jayakar
- GitHub’s 43 Second Network Partition
- Auth0 Silently Loses Some Indexes
- One Subtle Regex Takes Down Cloudflare
- Monzo’s 2019 Cassandra Outage
- GitLab’s 2017 Postgres Outage
- Slack vs TGWs
- Introduction