How Policies Saved us a Thousand Headaches, with Alessandro Pomponio
Episode Synopsis
Alessandro Pomponio from IBM Research explains how his team transformed their chaotic bare-metal clusters into a well-governed, self-service platform for AI and scientific workloads. He walks through their journey from manual cluster interventions to a fully automated GitOps-first architecture using ArgoCD, Kyverno, and Kueue to handle everything from policy enforcement to GPU scheduling.

You will learn:
- How to implement GitOps workflows that reduce administrative burden while maintaining governance and visibility across multi-tenant research environments
- Practical policy enforcement strategies using Kyverno to prevent GPU monopolization, block interactive pod usage, and automatically inject scheduling constraints
- Fair resource sharing techniques with Kueue to manage scarce GPU resources across different hardware types while supporting both specific and flexible allocation requests
- Organizational change management approaches for gaining stakeholder buy-in, upskilling admin teams, and communicating policy changes to research users

Sponsor
This episode is brought to you by Testkube, the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io

More info
Find all the links and info for this episode here: https://ku.bz/5sK7BFZ-8

Interested in sponsoring an episode? Learn more.