Beyond Guardrails: Defending LLMs Against Sophisticated Attacks

22/05/2025 44 min
Episode Synopsis

Jason Martin is an AI Security Researcher at HiddenLayer. This episode explores “policy puppetry,” a universal attack technique bypassing safety features in all major language models using structured formats like XML or JSON.

Subscribe to the Gradient Flow Newsletter 📩 https://gradientflow.substack.com/

Subscribe: Apple · Spotify · Overcast · Pocket Casts · AntennaPod · Podcast Addict · Amazon · RSS.

Detailed show notes, with links to many references, can be found on The Data Exchange web site.