Ep 21: LLM Model Merging
Episode Synopsis
AI News:
1. Databricks announces the release of a new LLM (Large Language Model) named DBRX, designed to improve copilot features in DBX.
2. The research paper "LLM4Decompile: Decompiling Binary Code with Large Language Models" [2403.05286] is released, claiming high accuracy in decompiling binary machine code back to high-level source code.
3. GitHub introduces an AI tool for detecting security vulnerabilities within code.
4. Stability AI experiences a CEO resignation.
Main Topic: Merging LLMs
1. SLERP (Spherical Linear Interpolation)
2. TIES (TrIm, Elect Sign & Merge)
3. Frankenmerges (stitching layers from different models into a single, larger network)
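To make the first technique above concrete, here is a minimal sketch of SLERP for merging two models' weights. It treats each pair of corresponding weight tensors as flattened vectors and interpolates along the arc between them rather than along the straight line, which preserves the vectors' magnitudes better than plain averaging. The `slerp` helper and its parameters are illustrative, not taken from any particular merging tool.

```python
import math

def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors.

    t=0 returns a, t=1 returns b; intermediate t follows the great-circle
    arc between the two directions.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Cosine of the angle between the two vectors, clamped for safety.
    dot = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    w_a = math.sin((1 - t) * theta) / s
    w_b = math.sin(t * theta) / s
    return [w_a * x + w_b * y for x, y in zip(a, b)]
```

In a real merge this would be applied tensor by tensor across two checkpoints with the same architecture; TIES differs in that it first trims small deltas and resolves sign conflicts before averaging, and frankenmerging skips interpolation entirely in favor of stacking whole layers.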
References:
[2403.05286] LLM4Decompile: Decompiling Binary Code with Large Language Models
[2311.03099] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
[2306.11644] Textbooks Are All You Need
[1909.11299] Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
[1511.07543] Convergent Learning: Do different neural networks learn the same representations?
An empirical analysis of compute-optimal large language model training - Google DeepMind