BurstGPT: A Real-World LLM Serving Workload Dataset

01/10/2025 19 min

Episode Synopsis

The May 2025 academic paper introduces **BurstGPT**, a real-world workload dataset of more than ten million traces collected over 213 days from regional Azure OpenAI GPT services and intended to support the optimization of Large Language Model (LLM) serving systems. The authors argue that existing LLM serving optimizations are often evaluated on **unrealistic synthetic or non-LLM workloads**, which leads to performance degradation in real-world deployments. BurstGPT provides empirical data on **user concurrency patterns, conversation structures, model response lengths, and system failures**, enabling more accurate system evaluation and refinement of scheduling, caching, and resource provisioning strategies. The paper also presents **BurstGPT-Perf**, a benchmark suite that uses the dataset to show how realistic, bursty workloads reveal declines in efficiency, stability, and reliability in serving systems such as vLLM. Ultimately, the work advocates **data-driven methodologies** for optimizing LLM serving to achieve better efficiency and quality of service.

Source: https://arxiv.org/pdf/2401.17644

More episodes of the podcast AI: post transformers

Attention with a bias 17/01/2026

Squisher: Approximating the Fisher Information Matrix and use cases 17/01/2026

NVIDIA: TTT-E2E: Unlocking Long-Context Learning via End-to-End Test-Time Training 17/01/2026

Scaling laws: long context length and in context learning 17/01/2026

DeepSeek Engram: Scaling Large Language Models via Conditional Memory Lookup 14/01/2026

PageANN: Scalable Disk ANNS with Page-Aligned Graphs 07/12/2025

NeurIPS 2025: Homogeneous Keys, Heterogeneous Values 04/12/2025

NeurIPS 2025: Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free 29/11/2025

NeurIPS 2025: Large Language Diffusion Models 29/11/2025

NeurIPS 2025: Reinforcement Learning for Reasoning in Large Language Models with One Training Example 29/11/2025

