Eating some mooncake

11/09/2025 33 min Episode 16
Episode Synopsis

Kimi's serving architecture, using Mooncake to offload GPU memory to other chipsets, the ubiquity of vLLM, and the emerging standard LLM stack.