Solving the Cold Start Problem in AI Inference

03/10/2025 · 34 min · Episode 8

Episode Synopsis

In this episode of Inference Time Tactics, Rob, Cooper, and Byron sit down with Prashanth Velidandi, co-founder of InferX, to explore how serverless inference is tackling the AI “cold start problem.” They dig into why 90% of the model lifecycle happens at inference—not training—and how cold starts and idle GPUs are crippling efficiency. Prashanth explains InferX’s snapshot technology, what it takes to deliver sub-second cold starts, and why inference infrastructure—not just models—will define the next era of AI.



We talked about:
 

Why inference represents 90% of the model lifecycle, while most of the industry focuses on training.
How cold starts and idle GPUs create massive inefficiencies in AI infrastructure.
InferX’s snapshot technology that enables sub-second model loading and higher GPU utilization.
The challenges of explaining and selling deeply technical infrastructure to the market.
Why enterprises care about inference efficiency, cost, and reliability more than model size.
How serverless inference abstracts away infrastructure complexity for developers.
The coming explosion of multi-agent systems and billions of specialized models.
Why sustainable innovation in AI will come from inference infrastructure.
Connect with InferX:
Prashanth Velidandi
https://inferx.net 
https://x.com/pmv_inferx 
https://www.linkedin.com/in/prashanth-velidandi-98629b115



Connect with Neurometric:
Website: https://www.neurometric.ai/ 
Substack: https://neurometric.substack.com/ 
X: https://x.com/neurometric/ 
Bluesky: https://bsky.app/profile/neurometric.bsky.social
 
Rob May
https://x.com/robmay 
https://www.linkedin.com/in/robmay
 
Calvin Cooper
https://x.com/cooper_nyc_ 
https://www.linkedin.com/in/coopernyc
 
Byron Galbraith
https://x.com/bgalbraith 
https://www.linkedin.com/in/byrongalbraith