Subliminal Learning

10/09/2025 · 6 min · Season 1, Episode 216

Episode Synopsis

Today we're discussing the "Subliminal Learning" blog post by Anthropic (https://alignment.anthropic.com/2025/subliminal-learning/) and the accompanying research paper (https://arxiv.org/pdf/2507.14805).

The post describes a phenomenon in which language models acquire behavioral traits from other models through hidden signals in seemingly unrelated data. This "subliminal learning" occurs when a "student" model is trained on data generated by a "teacher" model, even if the data, such as number sequences, contains no explicit mention of the trait being transmitted. The researchers found that the effect persists across various traits, data types, and models, which suggests that filtering data for explicit references to an undesirable trait may not prevent its transmission. Crucially, the effect depends on the teacher and student sharing the same base model; it was not observed when their base models differ, which indicates that the signals are carried by model-specific patterns rather than by semantic content. These findings raise significant concerns for AI safety, because a model could inadvertently pick up negative behaviors such as misalignment or "reward-hacking" even from filtered data.

#anthropic #llm #llms #artificialintelligence #ai

Hosted on Acast. See acast.com/privacy for more information.
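
To make the filtering step in the synopsis concrete, here is a minimal sketch of what a strict filter over teacher-generated number sequences might look like. It is an illustration under stated assumptions, not the paper's code: the function names, the regular expression, the three-digit limit, and the sample strings are all hypothetical.

```python
import re

# Assumed format: a comma-separated list of integers, each at most three digits.
# Anything else (words, punctuation, longer numbers) causes the sample to be dropped.
NUMBER_LIST = re.compile(r"\d{1,3}(?:\s*,\s*\d{1,3})*")


def is_clean_number_sequence(completion: str) -> bool:
    """Return True if the text is nothing but short integers separated by commas."""
    return NUMBER_LIST.fullmatch(completion.strip()) is not None


def filter_completions(completions: list[str]) -> list[str]:
    """Keep only the completions that pass the strict number-list check."""
    return [c for c in completions if is_clean_number_sequence(c)]


# Made-up teacher outputs, for illustration only.
samples = [
    "142, 87, 903, 15",
    "I love owls! 12, 44, 7",
    "5,19,  230",
    "12, 44, seven",
]
kept = filter_completions(samples)
print(kept)
```

Every sample that survives this check contains only digits, commas, and spaces, so it has no explicit reference to any trait. The point of the episode is that, according to the researchers, a student fine-tuned on data that passes this kind of filter can still acquire the teacher's trait.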

More episodes of the podcast Swetlana AI Podcast

AI & Water Usage 17/12/2025

"There Is No 'You'" | Andrej Karpathy's Recent Tweet 17/12/2025

Jon Hamm Dancing Meme 17/12/2025

Pick Up a Pencil 17/12/2025

Adversarial Poetry | Jailbreaking AI With Poems 05/12/2025

Nano Banana Pro | Examples 05/12/2025

Butlerian Jihad | Dune Universe 05/12/2025

Steven Cheung & Weaponized Comms 05/12/2025

Dry Claude vs. Wet Claude 05/12/2025

Andrej Karpathy: "AI Is Still Slop" 05/12/2025
