BERT-Sort: How to use language models to semantically order categorical values

24/11/2022 40 min

Episode Synopsis

Today Ankush Garg is talking to Mehdi Bahrami about his recent project: BERT-Sort. BERT-Sort is an example of how large language models can add useful context to tabular datasets and to AutoML systems.

Mehdi is a Member of Research Staff at Fujitsu. As he describes, he began using AutoML systems for his research, yet he came across some crucial limitations of existing solutions. The modifications he made highlight a promising future for the relationship between language models and AutoML. This is a direction we're going to continue to explore on the show.

References:

BERT-Sort: A Zero-shot MLM Semantic Encoder on Ordinal Features for AutoML: https://proceedings.mlr.press/v188/bahrami22a.html

PyTorrent: A Python Library Corpus for Large-scale Language Models: https://arxiv.org/abs/2110.01710

AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models: https://arxiv.org/abs/2110.08512
