🤖 DeepSeek-V3: A 671B Parameter Mixture-of-Experts Language Model
Episode Synopsis
This episode covers DeepSeek-V3, a 671B-parameter Mixture-of-Experts language model. It highlights the model's architecture, including its auxiliary-loss-free load balancing and multi-token prediction strategies, and its efficient training process using FP8 mixed precision. Benchmark results show DeepSeek-V3 performing strongly against other open-source and some closed-source models, particularly on math and code tasks. The episode also walks through instructions for running DeepSeek-V3 locally with various frameworks and hardware, including NVIDIA and AMD GPUs and Huawei Ascend NPUs, and closes with licensing and contact information.
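The load-balancing point is the episode's main architectural hook. Below is a minimal, illustrative sketch (PyTorch, with made-up sizes and names such as BiasAdjustedTopKRouter and balance_bias) of how a per-expert bias can steer top-k expert selection without an auxiliary balancing loss, in the spirit of the approach discussed here; it is an assumption-laden toy, not DeepSeek-V3's actual routing code.

```python
import torch
import torch.nn as nn

class BiasAdjustedTopKRouter(nn.Module):
    """Toy top-k expert router with a per-expert balance bias.

    The bias only influences WHICH experts are selected; the combining
    weights still come from the unbiased affinity scores, so routing can
    be rebalanced without adding an auxiliary loss term.
    """

    def __init__(self, hidden_dim: int = 16, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)
        # Hypothetical bias buffer, updated elsewhere during training
        # (e.g. nudged up for under-used experts, down for over-used ones).
        self.register_buffer("balance_bias", torch.zeros(num_experts))
        self.top_k = top_k

    def forward(self, x: torch.Tensor):
        scores = torch.sigmoid(self.gate(x))                    # token-to-expert affinity
        biased = scores + self.balance_bias                     # bias used for selection only
        _, expert_idx = torch.topk(biased, self.top_k, dim=-1)  # pick top-k experts per token
        weights = torch.gather(scores, -1, expert_idx)          # unbiased gating values
        weights = weights / weights.sum(dim=-1, keepdim=True)   # normalize over chosen experts
        return expert_idx, weights

# Example: route a batch of 4 token vectors to 2 of 8 toy experts.
router = BiasAdjustedTopKRouter()
idx, w = router(torch.randn(4, 16))
print(idx.shape, w.shape)  # torch.Size([4, 2]) torch.Size([4, 2])
```

As described in the episode, adjusting such a bias according to each expert's recent load is what allows the balancing objective to be dropped from the training loss.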
More episodes of the podcast Programmers Quickie
Http 123 (04/10/2025)
🌐 Scrapingdog: Web Scraping (10/03/2025)
🧊 BigData - Apache Iceberg and Streaming (09/03/2025)
📊 RDS PostgreSQL vs Redshift (06/03/2025)
📚 DevOps - Terraform Providers (25/02/2025)
🐳 Startups - Docker Compose (24/02/2025)
💡 Client - Why Flux (23/02/2025)
🌐 Client - jsdom (22/02/2025)
⏱️ java.util.Clock: Mocking Time (18/02/2025)