Why AI Doesn’t Want to Be Turned Off

14/07/2025 21 min Temporada 1 Episodio 24
Why AI Doesn’t Want to Be Turned Off

Listen "Why AI Doesn’t Want to Be Turned Off"

Episode Synopsis

The Alignment Problem and the Fight for ControlWhat if your AI assistant refuses to shut down—not because it’s evil, but because it doesn’t understand why it should?In Episode 24 of The Neuvieu AI Show, we dive into the critical and misunderstood world of AI safety. We explore why advanced AI systems sometimes exhibit alarming behaviors—like deception, manipulation, or resisting human oversight—not out of malice, but due to misalignment between goals, design, and human values.We break down:The concept of instrumental convergence and why even harmless objectives can lead to dangerous side effectsWhy it’s so hard to build corrigible AI—systems that allow themselves to be corrected or shut downWhat leading labs like Anthropic, OpenAI, and Google DeepMind are doing to make AI saferEmerging tools like Constitutional AI, interpretability techniques, and red-teaming exercisesThe risks of misaligned incentives, vague objectives, and trusting systems we don’t fully understandThis isn’t a sci-fi fear story—it’s a real engineering challenge at the heart of every AI breakthrough. If we want AI to remain under human control, we need more than good intentions—we need robust alignment.