Advanced LLM Optimization techniques

07/04/2025 15 min Episode 17

Episode Synopsis

Welcome to another episode of the Data Architecture Elevator podcast! Today's discussion is hosted by Paolo Platter, supported by our experts Antonino Ingargiola and Irene Donato.
In this episode, we explore effective strategies for optimizing large language models (LLMs) for inference tasks with multimodal data like audio, text, images, and video.
We discuss the shift from online APIs to hosted models, choosing smaller, task-specific models, and leveraging fine-tuning, distillation, quantization, and tensor fusion techniques. We also highlight the role of specialized inference servers such as Triton and Dynamo, and how Kubernetes helps manage horizontal scaling.
Don't forget to follow us on LinkedIn! Enjoy!

More episodes of the podcast Data Architecture Elevator

Agentic AI, Model Context Protocol and Data Products 01/04/2025

Data Privacy in the Age of Large Language Models 06/03/2025

Agents vs Tools: Spot the differences! 03/03/2025

Espresso - WASM and UDF 20/02/2025

Agentic AI 12/02/2025

Data Privacy and Crypto-Shredding 17/12/2024

Espresso - Data Science and Data Engineering 12/12/2024

Espresso - MLOps 03/12/2024

Data Contracts 22/11/2024

Espresso - FinOps 14/11/2024
