Episode 61: DeepSeek Models Explained - Part II

04/02/2025 1h 8min Season 2 Episode 61



Episode Synopsis

What if AI could be 95% cheaper? Discover how DeepSeek's game-changing models are reshaping the AI landscape through breakthrough innovations. Journey through the evolution of AI optimization, from GPU efficiency to revolutionary attention mechanisms. Learn when to use (and when to avoid) these powerful new models, with practical insights for both individual users and businesses.
Key highlights:

How DeepSeek achieves dramatic cost reduction through technical innovation

Real-world implications for consumers and enterprises

Critical considerations around data privacy and model alignment

Practical guidance on responsible implementation

References:

Dario Amodei — On DeepSeek and Export Controls

Bite: How Deepseek R1 was trained

[2501.17161] SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

[2405.04434] DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

[2408.15664] Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts

[2412.19437] DeepSeek-V3 Technical Report

[2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

More episodes of the podcast Machine Learning Made Simple

Ep74: The AI Revolution Isn’t in Chatbots—It’s in Thermostats 13/05/2025

Ep73: Deception Emerged in AI: Why It’s Almost Impossible to Detect 06/05/2025

Ep72: Can We Trust AI to Regulate AI? 22/04/2025

Ep71: The AI Detection Crisis: Why Real Content Gets Flagged 15/04/2025

Ep70: Content Moderation at Scale: Why GPT-4 Isn’t Enough | Aegis vs. the Rest 08/04/2025

Ep69: MCP, GPT-4 Image Editing, and the Future of AI Tool Integration 01/04/2025

Ep68: Is GPT-4.5 Already Outdated? 25/03/2025

Ep67: Why RAG Fails LLMs – And How to Finally Fix It 19/03/2025

Ep66: Fastest LLM Ever? Diffusion AI is Changing Everything 11/03/2025

Episode 65: The AI Takeover Has Already Begun – Here’s What You Need to Know 04/03/2025

