"Reliability Engineering: History, Practice, and Future"
Episode Synopsis
This episode explores the field of reliability engineering, tracing its origins to Google's development of Site Reliability Engineering (SRE). It distinguishes reliability engineering from SRE, highlighting its broader applicability across different organisational structures. The episode outlines four key promises of a successful reliability team: defining service levels (service level agreements, objectives, and indicators, or SLAs, SLOs, and SLIs), managing the service infrastructure, participating in technical design, and providing tactical support during incidents. Finally, it discusses the evolving landscape of reliability engineering, emphasising pragmatic approaches to balancing cost against reliability needs and advocating a more nuanced understanding of when to build and when to buy solutions.
More episodes of the podcast Code Impact
Trello's Kafka Migration
19/01/2025
Wartime vs. Peacetime in Tech Companies
19/01/2025
The First-Time Manager: A Practical Guide
06/01/2025