“Language Models Model Us” by eggsyntax

21/05/2024 29 min


Episode Synopsis

Produced as part of the MATS Winter 2023-4 program, under the mentorship of @Jessica Rumbelow.

One-sentence summary: On a dataset of human-written essays, we find that gpt-3.5-turbo can accurately infer demographic information about the authors from just the essay text, and suspect it's inferring much more.

Introduction

Every time we sit down in front of an LLM like GPT-4, it starts with a blank slate. It knows nothing[1] about who we are, other than what it knows about users in general. But with every word we type, we reveal more about ourselves -- our beliefs, our personality, our education level, even our gender. Just how clearly does the model see us by the end of the conversation, and why should that worry us?

Like many, we were rather startled when @janus showed that gpt-4-base could identify @gwern by name, with 92% confidence, from a 300-word comment. If [...]

The original text contained 12 footnotes which were omitted from this narration.

---

First published: May 17th, 2024
Source: https://www.lesswrong.com/posts/dLg7CyeTE4pqbbcnp/language-models-model-us

---

Narrated by TYPE III AUDIO.

More episodes of the podcast LessWrong (Curated & Popular)

"How AI Is Learning to Think in Secret" by Nicholas Andresen 08/01/2026

"On Owning Galaxies" by Simon Lermen 08/01/2026

"AI Futures Timelines and Takeoff Model: Dec 2025 Update" by elifland, bhalstead, Alex Kastner, Daniel Kokotajlo 06/01/2026

"In My Misanthropy Era" by jenn 05/01/2026

"2025 in AI predictions" by jessicata 02/01/2026

"Good if make prior after data instead of before" by dynomight 27/12/2025

"Measuring no CoT math time horizon (single forward pass)" by ryan_greenblatt 27/12/2025

"Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance" by ryan_greenblatt 23/12/2025

"Turning 20 in the probable pre-apocalypse" by Parv Mahajan 23/12/2025

"Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment" by Cam, Puria Radmard, Kyle O’Brien, David Africa, Samuel Ratnam, andyk 23/12/2025

