Listen "“Racing For AI Safety™ was always a bad idea, right?” by Wei Dai"
Episode Synopsis
Recently I've been relitigating some of my old debates with Eliezer, to right the historical wrongs. Err, I mean to improve the AI x-risk community's strategic stance. (Relevant to my recent theme of humans being bad at strategy—why didn't I do this sooner?)
Of course the most central old debate was over whether MIRI's plan, to build a Friendly AI to take over the world in service of reducing x-risks, was a good one. If someone were to defend it today, I imagine their main argument would be that back then, there was no way to know how hard solving Friendliness/alignment would be, so it was worth a try in case it turned out to be easy. This may seem plausible because new evidence about the technical difficulty of alignment was the main reason MIRI pivoted away from their plan, but I want to argue that actually even without this information, there were good enough arguments back then to conclude that the plan was bad:
MIRI was rolling their own metaethics (deploying novel or controversial philosophy) which is not a good idea even if alignment turned out to be not that hard in a technical sense. [...]
---
First published:
November 16th, 2025
Source:
https://www.lesswrong.com/posts/dGotimttzHAs9rcxH/racing-for-ai-safety-tm-was-always-a-bad-idea-right
---
Narrated by TYPE III AUDIO.
More episodes of the podcast LessWrong (30+ Karma)
“Announcing RoastMyPost” by ozziegooen
17/12/2025
“The Bleeding Mind” by Adele Lopez
17/12/2025
“Still Too Soon” by Gordon Seidoh Worley
17/12/2025
“Mistakes in the Moonshot Alignment Program and What we’ll improve for next time” by Kabir Kumar
17/12/2025
“Dancing in a World of Horseradish” by lsusr
17/12/2025