15: InstructGPT

28/03/2023 57 min Season 1 Episode 15


Episode Synopsis

In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We cover the reinforcement learning from human feedback (RLHF) paradigm and how important RL is to tuning GPT.

More episodes of the podcast Argmax

Mixture of Experts 08/10/2024

LoRA 02/09/2023

14: Whisper 17/03/2023

13: AlphaTensor 10/03/2023

12: SIRENs 24/10/2022

11: CVPR Workshop on Autonomous Driving Keynote by Ashok Elluswamy, a Tesla engineer 30/09/2022

10: Outracing champion Gran Turismo drivers with deep reinforcement learning 22/08/2022

9: Heads-Up Limit Hold'em Poker Is Solved 29/07/2022

8: GATO (A Generalist Agent) 29/07/2022

7: Deep Unsupervised Learning Using Nonequilibrium Thermodynamics (Diffusion Models) 13/06/2022

