DevOps and Incident Response Evolution
Episode Synopsis
Chris Riley (@hoardinginfo, DevOps Advocate, @Splunk) talks about the state of DevOps, the evolution of Incident Response with Machine Learning, Service vs. Site Reliability, and using Incident Response to increase the quality of development.

SHOW: 439

SHOW SPONSOR LINKS:
Datadog Homepage - Modern Monitoring and Analytics
Try Datadog yourself by starting a free, 14-day trial today. Listeners of this podcast will also receive a free Datadog T-shirt.
MongoDB Homepage - The most popular database for modern applications
MongoDB Atlas - MongoDB-as-a-Service on AWS, Azure and GCP

CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

SHOW NOTES:
VictorOps (now Splunk) Blog
Chris Riley at DevOps.com
Developers Eating the World Podcast

Topic 1 - Welcome to the show. Tell everyone a little about yourself; you’ve been active in the DevOps space for quite some time.

Topic 2 - About a year ago we had your peer and good friend of the show, Josh Atwell, on to talk about the State of DevOps in 2019. What are your thoughts on changes over the last 12 months, and where are we headed in 2020?

Topic 3 - One item in particular that has drawn my attention is your discussions on Incident Response and Machine Learning. Can you tell everyone a little bit about that and why you believe it will be valuable going forward?

Topic 4 - This, in a way, feels almost like a transition into the next evolution of our model. First we had separate dev and ops and no one talked, then we put them together, then we had every device and app start spitting out logs and alerts, and next thing you knew, we were drowning in data… The complexity of the systems has grown exponentially. Fair?

Topic 5 - You recently did a post on the VictorOps blog about SRE and the meaning of the “S”. You propose that, more and more, it should stand for Service Reliability Engineer rather than the more traditional Site Reliability Engineer, especially as we move into a world of subscription-based models. Can you explain to everyone your thoughts there?

Topic 6 - When I think Incident Response, I think production environments. As part of VictorOps, I’m sure you see a lot of use cases and have solved some pretty unique customer problems. How can this be applied outside of production, say for application testing or quality before hitting production? Is that a valid approach?

FEEDBACK?
Email: show at thecloudcast dot net
Twitter: @thecloudcastnet