"Why We Think"
Episode Synopsis
"Why We Think," by Lilian Weng, examines how language models can be improved by allocating more computation at test time, drawing an analogy to human "slow thinking," or System 2. Treating computation as a resource, the aim is to design systems that use this test-time effort effectively for better performance. Key approaches include generating intermediate steps such as Chain-of-Thought, decoding methods such as parallel sampling and sequential revision, reinforcement learning to strengthen reasoning, external tool use, and adaptive computation time. Together, these let a model spend more computation on harder problems, much as a person deliberates longer over a difficult question, and reach better answers.
More episodes of the podcast Large Language Model (LLM) Talk
Kimi K2
22/07/2025
Mixture-of-Recursions (MoR)
18/07/2025
MeanFlow
10/07/2025
Mamba
10/07/2025
LLM Alignment
14/06/2025
Deep Research
12/05/2025
vLLM
04/05/2025
Qwen3: Thinking Deeper, Acting Faster
04/05/2025
DeepSeek-Prover-V2
01/05/2025