Listen ""Deep Dive into LLMs like ChatGPT" - Andrej Karpathy's Tech Talk Learning"
Episode Synopsis
Andrej Karpathy's tech talk (youtube), provides a comprehensive yet accessible overview of Large Language Models (LLMs) like ChatGPT. The talk details the process of building an LLM, including pre-training, data processing, and neural network training.Key stages include downloading and filtering internet text, tokenizing the text, and training neural networks to model token relationships. The discussion covers the distinction between base models and assistants, highlighting fine-tuning to create conversational AIs. It also addresses challenges like hallucinations and mitigation strategies, such as knowledge-based refusal and tool use. The talk further explores reinforcement learning and the emergence of "thinking" in models.
More episodes of the podcast Large Language Model (LLM) Talk
Kimi K2
22/07/2025
Mixture-of-Recursions (MoR)
18/07/2025
MeanFlow
10/07/2025
Mamba
10/07/2025
LLM Alignment
14/06/2025
Why We Think
20/05/2025
Deep Research
12/05/2025
vLLM
04/05/2025
Qwen3: Thinking Deeper, Acting Faster
04/05/2025
ZARZA We are Zarza, the prestigious firm behind major projects in information technology.