Listen "Small Fixed Samples Poison Large LLMs"
Episode Synopsis
This episode dive deep on an Anthropic report and a related research paper, detail a joint study on the vulnerability of large language models (LLMs) to data poisoning attacks. The research surprisingly demonstrates that injecting a near-constant, small number of malicious documents—as few as 250—is sufficient to successfully introduce a backdoor vulnerability, regardless of the LLM's size (up to 13 billion parameters) or the total volume of its clean training data.
More episodes of the podcast Intelligence Unbound
AI Boost Productivity by 80%, is it real?
02/12/2025
PAN: A General Interactable World Model
26/11/2025
GPT-5 Acceleration of Scientific Discovery
22/11/2025
ZARZA We are Zarza, the prestigious firm behind major projects in information technology.