NVIDIA's Jet Nemotron - Post Neural Architecture Search & JetBlock

31/08/2025 47 min

Listen "NVIDIA's Jet Nemotron - Post Neural Architecture Search & JetBlock"

Episode Synopsis

This episode covers NVIDIA's new Jet-Nemotron model family, which takes a hybrid-architecture approach to Large Language Models (LLMs) to significantly improve efficiency without sacrificing accuracy. The innovation is driven by two key technologies: Post Neural Architecture Search (PostNAS), a method for "retrofitting" existing models by identifying less critical full-attention layers and replacing them with more efficient alternatives, and JetBlock, a novel linear attention module. The core insight is that not all attention layers are equally important, which allows a drastic reduction in Key-Value (KV) cache size, yielding up to a 53.6x increase in decoding throughput and a potential 98% reduction in inference cost. Jet-Nemotron aims to set a new standard for LLM evaluation, emphasizing real-world performance and hardware efficiency across a range of devices, from data centers to edge hardware, making high-performance AI more economically viable and accessible.
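
To make the KV-cache argument concrete, here is a minimal back-of-envelope sketch in Python. It is not NVIDIA's PostNAS or JetBlock code; the layer counts, head dimensions, sequence length, and importance scores are illustrative assumptions, and the top-k selection merely stands in for the actual architecture search. It shows why keeping only a few full-attention layers (the rest replaced by cache-free linear attention) shrinks the KV cache so dramatically.

```python
# Hypothetical sketch (not NVIDIA's actual implementation): keep only the
# highest-scoring layers as full attention and estimate the KV-cache savings
# from replacing the rest with linear attention, which needs no per-token cache.

def kv_cache_bytes(num_full_attn_layers, num_kv_heads=8, head_dim=128,
                   seq_len=64_000, bytes_per_value=2):
    """Approximate KV-cache size for one sequence (K and V tensors, FP16/BF16)."""
    per_layer = 2 * num_kv_heads * head_dim * seq_len * bytes_per_value
    return num_full_attn_layers * per_layer


def select_full_attention_layers(importance_scores, budget):
    """Keep the `budget` most important layers as full attention; this top-k
    pick is only a stand-in for a PostNAS-style search over layer placement."""
    ranked = sorted(range(len(importance_scores)),
                    key=lambda i: importance_scores[i], reverse=True)
    return sorted(ranked[:budget])


if __name__ == "__main__":
    # Assumed per-layer importance scores for a 36-layer model: three layers
    # matter much more than the rest (illustrative numbers only).
    scores = [0.9 if i in (2, 15, 27) else 0.1 for i in range(36)]
    keep = select_full_attention_layers(scores, budget=3)

    before = kv_cache_bytes(num_full_attn_layers=36)   # all layers full attention
    after = kv_cache_bytes(num_full_attn_layers=len(keep))
    print(f"full-attention layers kept: {keep}")
    print(f"KV cache: {before / 2**30:.1f} GiB -> {after / 2**30:.2f} GiB "
          f"({100 * (1 - after / before):.0f}% smaller)")
```

Under these assumed numbers the cache shrinks by roughly the ratio of retained full-attention layers (3/36, about a 92% reduction), which is the mechanism behind the throughput and cost gains the episode describes; the exact figures depend on the real model configuration.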