Listen "Federated Learning with Soft Embeddings for Retrieval"
Episode Synopsis
This September 20, 2025 paper introduces a novel, efficient architecture for training **retrieval models** used in retrieval-augmented generation (RAG) systems. The architecture addresses the inefficiency of fine-tuning large models by combining **adapters for soft embeddings** with a **Classifier-as-Retriever (CaR)** approach. The soft embeddings, produced by lightweight trainable layers attached to a frozen small language model (SLM), efficiently adapt the model to new corpora, while CaR replaces static maximum inner product search (MIPS) with a trainable classifier that maps queries directly to documents, reaching significantly higher accuracy (up to 99%). The method also integrates naturally with **federated learning (FL)**, yielding distributed training speedups of up to 2.6x, and applies **differential privacy (DP)** techniques to safeguard client data during training on edge devices. Together, these pieces produce a lighter, faster, and privacy-preserving solution for domain-specific RAG.

Sources:
https://www.webai.com/blog/federated-learning-with-soft-embeddings-a-new-efficient-way-to-train-retrieval-models
https://arxiv.org/pdf/2509.16508
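To make the architecture concrete, here is a minimal PyTorch sketch of the idea, not the paper's code: a frozen encoder stands in for the SLM, a small residual adapter produces the soft embeddings, and a linear "CaR" head with one logit per corpus document turns retrieval into classification instead of MIPS. All class and parameter names (`SoftEmbeddingRetriever`, `adapter_dim`, `car_head`) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SoftEmbeddingRetriever(nn.Module):
    """Sketch: frozen SLM + trainable soft-embedding adapter + CaR head."""

    def __init__(self, frozen_encoder: nn.Module, hidden_dim: int,
                 adapter_dim: int, num_docs: int):
        super().__init__()
        self.encoder = frozen_encoder
        for p in self.encoder.parameters():      # the SLM stays frozen
            p.requires_grad = False
        # Lightweight adapter: the only encoder-side trainable layers.
        self.adapter = nn.Sequential(
            nn.Linear(hidden_dim, adapter_dim),
            nn.GELU(),
            nn.Linear(adapter_dim, hidden_dim),
        )
        # CaR head: one logit per corpus document, trained with cross-entropy.
        self.car_head = nn.Linear(hidden_dim, num_docs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)                      # frozen representation
        soft = h + self.adapter(h)               # residual soft embedding
        return self.car_head(soft)               # document logits, no MIPS

# Toy usage with a stand-in encoder (a real system would use an SLM).
enc = nn.Linear(32, 64)
model = SoftEmbeddingRetriever(enc, hidden_dim=64, adapter_dim=16, num_docs=100)
logits = model(torch.randn(4, 32))
top_doc = logits.argmax(dim=-1)                  # retrieval = classification
```

Because only the adapter and CaR head carry gradients, client updates in the federated setting stay small. The sketch below assumes plain federated averaging with Gaussian-noise DP on clipped client updates; the episode does not specify the exact mechanism, and the clip norm and noise multiplier here are purely illustrative.

```python
import torch

def dp_client_update(update: dict, clip: float = 1.0,
                     noise_mult: float = 0.5) -> dict:
    # Bound per-client sensitivity by clipping the whole update vector,
    # then add Gaussian noise calibrated to the clip norm (illustrative DP).
    flat = torch.cat([v.flatten() for v in update.values()])
    scale = torch.clamp(clip / (flat.norm() + 1e-12), max=1.0)
    return {k: v * scale + noise_mult * clip * torch.randn_like(v)
            for k, v in update.items()}

def fed_avg(updates: list) -> dict:
    # The server averages only the small trainable parts (adapter + CaR
    # head); the frozen SLM weights never leave the clients.
    return {k: torch.stack([u[k] for u in updates]).mean(dim=0)
            for k in updates[0]}

# Toy round with three simulated clients sharing adapter-weight updates.
clients = [{"adapter.weight": torch.randn(16, 64)} for _ in range(3)]
global_update = fed_avg([dp_client_update(u) for u in clients])
```

Shipping only the adapter and classifier-head tensors is what makes the reported FL speedups plausible: the communication cost scales with the adapter size, not the SLM size.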