Listen ""Deep Deceptiveness" by Nate Soares"
Episode Synopsis
https://www.lesswrong.com/posts/XWwvwytieLtEWaFJX/deep-deceptiveness

This post is an attempt to gesture at a class of AI notkilleveryoneism (alignment) problem that seems to me to go largely unrecognized. E.g., it isn’t discussed (or at least I don't recognize it) in the recent plans written up by OpenAI (1, 2), by DeepMind’s alignment team, or by Anthropic, and I know of no other acknowledgment of this issue by major labs.

You could think of this as a fragment of my answer to “Where do plans like OpenAI’s ‘Our Approach to Alignment Research’ fail?”, as discussed in Rob and Eliezer’s challenge for AGI organizations and readers. Note that it would only be a fragment of the reply; there's a lot more to say about why AI alignment is a particularly tricky task to task an AI with. (Some of which Eliezer gestures at in a follow-up to his interview on Bankless.)
More episodes of the podcast LessWrong (Curated & Popular)
“Human Values ≠ Goodness” by johnswentworth
12/11/2025
“Condensation” by abramdemski
12/11/2025