Ep. 22 - What is Cloud Resiliency, Really?

02/06/2025 56 min

Episode Synopsis

Carl and Brandon break down the core concepts behind cloud resiliency, availability, reliability, and redundancy: how they relate, where they differ, and why understanding those distinctions is critical. Just because a service is “always on” doesn’t mean it’s resilient. They explore the difference between planned and unplanned outages, how graceful degradation works in practice, and why resiliency is measured by recovery, not just uptime. It’s about what breaks, how you recover, and what keeps going when everything else doesn’t.

They also cover the architectural side: distributed systems, zone-aware deployments, chaos testing, and recovery strategies that go beyond documentation. With real-world failure scenarios and practical planning advice, this episode helps cloud teams build for failure before it happens.

Links:
AWS | Failover with AWS
AWS | Well-Architected Framework: Reliability Pillar
Azure | Reliability design principles
Azure | Resiliency Overview
Azure | Well-Architected Framework: Reliability Pillar
Google Cloud | Architecture Framework: Reliability Pillar
Google Cloud | Patterns for scalable and resilient apps
Google Cloud | Site Reliability Engineering (SRE) Book
principlesofchaos.org | Principles of Chaos Engineering

Visit us at:
twitter.com/CloudChatTech
discord.cloudchat.tech
[email protected]
linkedin.com/company/cloudchat