ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood

08/10/2024 11 min Season 1 Episode 1

Episode Synopsis

This paper proposes a new method for fine-tuning large language models (LLMs) called Aligned Supervised Fine-Tuning (ASFT). ASFT addresses limitations of existing Direct Preference Optimization (DPO) methods by optimizing the absolute likelihood of generating human-preferred responses, rather than relying on the likelihood of preferred responses relative to dispreferred ones. Unlike DPO, ASFT does not require a reference model and is less sensitive to the model's initial state, leading to more efficient and robust training. The authors demonstrate the effectiveness of ASFT through extensive experiments on several benchmark datasets, reporting significant performance improvements over existing methods.
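To make the contrast concrete, here is a minimal sketch in PyTorch. The `dpo_loss` function reflects the standard DPO objective, which compares policy and reference-model log-likelihoods for chosen and rejected responses; `absolute_likelihood_loss` is an illustrative reference-free objective in the spirit described above (reward preferred responses by their own likelihood, penalize dispreferred ones), not the paper's exact ASFT formulation. All function names and the `beta` hyperparameter are assumptions for illustration, and sequence-level log-probabilities are assumed to be precomputed.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Standard DPO: relative likelihoods, anchored to a frozen reference model.
    chosen_margin = logp_chosen - ref_logp_chosen
    rejected_margin = logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

def absolute_likelihood_loss(logp_chosen, logp_rejected, beta=0.1):
    # Illustrative reference-free objective: push up the absolute
    # log-likelihood of preferred responses and push down that of
    # dispreferred ones, with no reference-model term.
    return -(F.logsigmoid(beta * logp_chosen)
             + F.logsigmoid(-beta * logp_rejected)).mean()
```

Because the second objective depends only on the policy's own log-likelihoods, it needs no frozen reference model and its gradient does not hinge on how the initial model ranks the two responses, which is the intuition behind the efficiency and robustness claims above.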
