15: InstructGPT
Episode Synopsis
In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We cover the RLHF paradigm and how important RL is to tuning GPT.
More episodes of the podcast Argmax
Mixture of Experts
08/10/2024
LoRA
02/09/2023
14: Whisper
17/03/2023
13: AlphaTensor
10/03/2023
12: SIRENs
24/10/2022
9: Heads-Up Limit Hold'em Poker Is Solved
29/07/2022
8: GATO (A Generalist Agent)
29/07/2022