153: Can GPT-4o Classify Tumors Better Than Us? AI-Powered Pathology Insights

18/08/2025 19 min Episode 153

Episode Synopsis

If we don’t learn to work with LLMs now, we might end up competing with them. 🧠

In this week’s DigiPath Digest, I return to our Journal Club to unpack the latest research on AI in tumor classification, focusing on GPT-4o, LLaMA, and other LLMs. Can these models really outperform traditional tools when analyzing pathology reports?

Surprisingly, yes. But don’t panic. This episode is about understanding what LLMs actually bring to the table, how they’re being evaluated, and what we need to consider as digital pathology continues to evolve.

It’s also a special week for me personally: I recorded this episode the morning of my U.S. citizenship ceremony, and I used AI to help write my speech! I’ll share more about that next time.

⏱️ Episode Highlights

[00:00] – Life update + AI-written speech for my citizenship
[04:00] – Journal Club: Austrian study on LLMs in pathology report analysis
[05:00] – Why cancer registries need better documentation tools
[06:00] – LLMs tested on synthetic pathology reports, a game-changing idea
[07:00] – GPT-4 and LLaMA outperform score-based models in accuracy
[08:00] – Use case: AI-enhanced text mining across whole archives
[09:00] – How my PhD could’ve been easier with these tools
[10:00] – Second paper: A public synthetic dataset for benchmarking LLMs
[11:00] – Tools used: ChatGPT, Perplexity, Copilot to generate report variations
[13:00] – Benefits of synthetic data for de-identification
[14:00] – Thoughts on bias, annotation workflows, and future-proofing
[16:00] – Polish research on hybrid annotation for follicular lymphoma
[19:00] – Foundation models, bootstrapping, weak supervision in action
[22:00] – Charles River: AI for thyroid hypertrophy scoring in tox path
[23:00] – Subjectivity of scoring thresholds and reproducibility
[24:00] – Morphology-driven scoring architecture improves accuracy

📚 Resources from this Episode

LLM Performance in Malignancy Detection from Pathology Reports
Synthetic Dataset for Evaluating LLMs in Medical Text Classification

🧰 Tools & Topics Mentioned

LLMs: GPT-4o, LLaMA, Copilot, Perplexity
Synthetic data for AI model testing
Annotation strategies: weak supervision, bootstrapping
Pathology AI applications: tumor detection, thyroid activity, lymphoma
Research teams: Austria, Poland, Charles River Labs

The big takeaway? AI tools are improving fast, and it’s up to us to decide how they’re used in our field. This episode breaks down the latest advancements and opens the door to practical, safe integration in pathology workflows.

🎧 Let’s keep pushing the boundaries, together.
