Listen "67 - GLUE: A Multi-Task Benchmark and Analysis Platform, with Sam Bowman"
Episode Synopsis
Paper by Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman.
Sam comes on to tell us about GLUE. We talk about the motivation behind setting up a benchmark framework for natural language understanding, how the authors defined "NLU" and chose the tasks for this benchmark, a very nice diagnostic dataset that was constructed for GLUE, and what insight they gained from the experiments they've run so far. We also have some musings about the utility of general-purpose sentence vectors, and about leaderboards.
https://www.semanticscholar.org/paper/GLUE%3A-A-Multi-Task-Benchmark-and-Analysis-Platform-Wang-Singh/a2054eff8b4efe0f1f53d88c08446f9492ae07c1
Sam comes on to tell us about GLUE. We talk about the motivation behind setting up a benchmark framework for natural language understanding, how the authors defined "NLU" and chose the tasks for this benchmark, a very nice diagnostic dataset that was constructed for GLUE, and what insight they gained from the experiments they've run so far. We also have some musings about the utility of general-purpose sentence vectors, and about leaderboards.
https://www.semanticscholar.org/paper/GLUE%3A-A-Multi-Task-Benchmark-and-Analysis-Platform-Wang-Singh/a2054eff8b4efe0f1f53d88c08446f9492ae07c1
More episodes of the podcast NLP Highlights
Are LLMs safe?
29/02/2024
"Imaginative AI" with Mohamed Elhoseiny
08/01/2024
142 - Science Of Science, with Kyle Lo
28/12/2023
ZARZA We are Zarza, the prestigious firm behind major projects in information technology.