636: Red Hat's James Huang
Episode Synopsis
Links
James on LinkedIn (https://www.linkedin.com/in/jahuang/)
Mike on LinkedIn (https://www.linkedin.com/in/dominucco/)
Mike's Blog (https://dominickm.com)
Show on Discord (https://discord.com/invite/k8e7gKUpEp)
Alice Promo (https://go.alice.dev/data-migration-offer-hands-on)
AI on Red Hat Enterprise Linux (RHEL)
Trust and Stability: RHEL provides the mission-critical foundation needed for workloads where security and reliability cannot be compromised.
Predictive vs. Generative: Acknowledging the hype of GenAI while maintaining support for traditional machine learning algorithms.
Determinism: The challenge of bringing consistency and security to emerging AI technologies in production environments.
RamaLama & Containerization
Developer Simplicity: RamaLama lets developers run local LLMs easily without being "locked in" to a specific engine; it supports Podman and Docker as container runtimes, and inference engines such as llama.cpp and whisper.cpp.
Production Path: The tool is designed to "fade away" after helping package the model and stack into a container that can be deployed directly to Kubernetes.
Behind the Firewall: Addressing the needs of industries (like aircraft maintenance) that require AI to stay strictly on-premises.
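The local-to-production workflow described above can be sketched with the RamaLama CLI. This is a hedged sketch based on the upstream containers/ramalama project: the `ollama://` model reference and the `--generate` option come from its documentation and may differ by version.

```shell
# Pull and chat with a model locally; RamaLama selects a container
# runtime (Podman or Docker) and an inference engine on your behalf.
ramalama pull ollama://tinyllama
ramalama run ollama://tinyllama

# Serve the same model over an OpenAI-compatible HTTP endpoint.
ramalama serve ollama://tinyllama

# The "fade away" step: generate Kubernetes YAML for the packaged
# model and stack so it can be deployed straight to a cluster.
ramalama serve --generate kube ollama://tinyllama
```

Because everything runs in a container, the on-premises case James describes needs no outbound network access once the model image is pulled behind the firewall.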
Enterprise AI Infrastructure
Red Hat AI: A commercial product offering tools for model customization, including pre-training, fine-tuning, and RAG (Retrieval-Augmented Generation).
Inference Engines: James contrasts llama.cpp (suited to smaller and edge hardware) with vLLM, which has become the enterprise standard for multi-GPU data center inference.
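The contrast can be illustrated with how each engine is typically served: both expose an OpenAI-compatible HTTP API, but they target different hardware. The commands below are a sketch; the model names are placeholders, not anything endorsed on the show.

```shell
# llama.cpp: single-node, CPU/edge-friendly serving of a quantized
# GGUF model file via its bundled llama-server binary.
llama-server -m ./granite-8b-q4_k_m.gguf --port 8080

# vLLM: data-center serving of a Hugging Face model with tensor
# parallelism spread across 4 GPUs on one node.
vllm serve ibm-granite/granite-3.0-8b-instruct --tensor-parallel-size 4
```

The practical dividing line is memory and throughput: a quantized GGUF fits on a laptop or edge box, while vLLM's paged attention and tensor parallelism are aimed at batch-serving many concurrent requests across GPUs.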