Interpretability in the wild and other papers
Episode Synopsis
This episode covers 3 abstracts:
Active reward learning from multiple teachers - Peter Barnett et al.
Conditioning Predictive Models: Risks and Strategies - Hubinger et al.
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT2 small - Kevin Wang et al.