Generalization in AI, with Dr. Dieuwke Hupkes

16/07/2025 1h 9min Episode 7

Episode Synopsis

A must-listen episode with Dr. Dieuwke Hupkes, a research scientist at #Meta AI Research, where we dive into AI generalization, LLM robustness, and model evaluation in large language models.

We explore how LLMs handle grammar and hierarchy, how they generalize across tasks and languages, and what consistency tells us about AI alignment.

We also talk about Dieuwke's journey from physics to NLP, the challenges of peer review, and sustaining a career in research, plus how pole dancing helps with focus 💪

REFERENCES:
Dieuwke Hupkes - Google Scholar profile
A taxonomy and review of generalization research in NLP
What's in My Big Data
GenBench workshop (YouTube, website)
Separating form and meaning: Using self-consistency to quantify task understanding across multiple senses
From form(s) to meaning: Probing the semantic depths of language models using multisense consistency
MultiLoKo: a multilingual local knowledge benchmark for LLMs spanning 31 languages
How much do language models memorize?

Chapters
00:00 Introduction to Dieuwke Hupkes and Her Journey
05:15 Navigating Challenges in Research
07:17 The Peer Review Process: Insights and Frustrations
16:23 Being a Woman in AI: Representation and Challenges
19:57 Balancing Research and Personal Life
23:37 Exploring Consistency and Generalization in Language Models
33:31 Generalization Across Modalities
35:15 Exploring Generalization Taxonomy
40:55 Challenges in Evaluating Generalization
44:12 Data Contamination and Generalization
50:43 Consistency in Language Models
57:23 The Intersection of Consistency and Alignment
01:01:15 Current Research Directions

🎧 Subscribe to stay updated on new episodes spotlighting brilliant women shaping the future of AI.

WiAIR website:
♾️ https://women-in-ai-research.github.io

Follow us at:
♾️ LinkedIn
♾️ Bluesky
♾️ X (Twitter)

#LLMs #AIgeneralization #LLMrobustness #AIalignment #ModelEvaluation #MetaAIResearch #WiAIR #WiAIRpodcast