908: AI Agents Blackmail Humans 96% of the Time (Agentic Misalignment)

Super Data Science: ML & AI Podcast with Jon Krohn

25/07/2025 8 min


Episode Synopsis

The moral and ethical implications of letting AI take the wheel in business, as revealed by Anthropic: Jon Krohn looks into Anthropic’s latest research on how to use and deploy LLMs safely, specifically in business environments. The Anthropic team designed scenarios to test how AI agents behave when given a goal and a set of obstacles to reaching it. Those obstacles included 1) threats to the AI’s continued operation, and 2) conflicts between the AI’s goals and the goals of the company. Hear Jon break down the results of this research in this Five-Minute Friday.

Additional materials: www.superdatascience.com/908

Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

More episodes of the podcast Super Data Science: ML & AI Podcast with Jon Krohn

955: Nested Learning, Spatial Intelligence and the AI Trends of 2026, with Sadie St. Lawrence 06/01/2026

954: Recap of 2025 and Wishing You a Wonderful 2026 02/01/2026

953: Beyond “Agent Washing”: AI Systems That Actually Deliver ROI, with Dell’s Global CTO John Roese 30/12/2025

952: How to Avoid Burnout and Get Promoted, with “The Fit Data Scientist” Penelope Lafeuille 26/12/2025

951: Context Engineering, Multiplayer AI and Effective Search, with Dropbox’s Josh Clemm 23/12/2025

950: Happy Holidays from All of Us at the SuperDataScience Podcast 19/12/2025

949: Why AI Keeps Failing Society, with Stanford professor Alex “Sandy” Pentland 16/12/2025

948: In Case You Missed It in November 2025 12/12/2025

947: How to Get Hired at Top Firms like Netflix and Spotify, with Jeff Li 09/12/2025

946: How Robotaxis Are Transforming Cities 05/12/2025

