"Hypothesis vs. Hallucinations: Property Testing AI-Generated Code"
Episode Synopsis
Large language models can generate code in a flash, but that code is notoriously unreliable. Traditional unit tests often can’t put enough guardrails in place to ensure correctness… even if they’re written by the LLM itself.
This is where property-based testing (PBT) becomes essential.
Today, we're joined by David R. MacIver, creator of the PBT library Hypothesis, and now an Antithesis employee! We discuss how to build the robust feedback loops needed to make AI-generated code trustworthy.
We'll cover why standard AI coding benchmarks are flawed, how Hypothesis makes PBT approachable, and the challenge of getting developers to think in "invariants." David also shares his perspective on the future of AI in software engineering.
If you want to build a reliability backstop for your code, vibed or otherwise, stick around.
More episodes of The BugBash Podcast
Ergonomics, reliability, durability
12/11/2025
No actually, you can property test your UI
30/10/2025
Fixing five "two-year" bugs per day
01/10/2025
No really, some bugs aren’t real
18/09/2025
Every map is wrong, but we made one anyway
03/09/2025
Fail loudly, fail fast, fail in production
20/08/2025
FoundationDB: From Idea to Apple Acquisition
23/07/2025