Microsoft's Data Scandal 💻 // UK's AI Principles for Transparency 🇬🇧 // Efficient Language Models 🚀

20/09/2023 · 13 min

Listen "Microsoft's Data Scandal 💻 // UK's AI Principles for Transparency 🇬🇧 // Efficient Language Models 🚀"

Episode Synopsis

Microsoft's AI research team accidentally exposed 38 terabytes of private data while publishing open-source training data on GitHub, posing a significant security risk. The UK's new AI principles focus on accountability and transparency, with regulators seeking views from leading AI developers and governments to ensure that the development and use of foundation models evolve in a way that promotes competition and protects consumers. Two new papers explore ways to improve the efficiency and quality of large language models: a lossless inference scheme called self-speculative decoding, and evidence that pretraining data can be pruned at scale while retaining performance. A third paper introduces "Chain of Density" (CoD) prompting, which generates increasingly dense summaries without increasing their length, yielding more abstractive summaries that human readers prefer.
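The self-speculative decoding paper's core idea is to draft tokens cheaply by running the model with some of its own layers skipped, then verify the draft with a full forward pass so the output matches ordinary decoding exactly. A minimal sketch of the greedy variant, assuming hypothetical `draft_next`/`full_next` callables rather than the paper's actual API:

```python
from typing import Callable, List

def self_speculative_decode(
    draft_next: Callable[[List[int]], int],  # cheap pass that skips some layers
    full_next: Callable[[List[int]], int],   # full forward pass (exact model)
    prompt: List[int],
    max_new_tokens: int = 32,
    draft_len: int = 4,
) -> List[int]:
    """Greedy speculative decoding where the 'draft model' is the full model
    with intermediate layers skipped, so no second network is needed."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. Draft a short continuation with the cheap, layer-skipping pass.
        draft = [draft_next(tokens)]
        for _ in range(draft_len - 1):
            draft.append(draft_next(tokens + draft))
        # 2. Verify: accept the longest prefix the full model agrees with.
        #    (A real implementation scores all draft tokens in one batched
        #    forward pass; here we call full_next per position for clarity.)
        accepted: List[int] = []
        for tok in draft:
            expected = full_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # fall back to the exact token
                break
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new_tokens]
```

Because any rejected draft token is replaced by the full model's own choice, the result is identical to plain greedy decoding with the full model, which is what makes the scheme "lossless".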
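On data pruning: one of the quality metrics the paper studies is perplexity, i.e. score each pretraining document with a reference model and keep only a fraction of the corpus. A rough sketch, where `nll` is a hypothetical function returning a document's mean per-token negative log-likelihood:

```python
import math
from typing import Callable, List

def prune_by_perplexity(
    docs: List[str],
    nll: Callable[[str], float],  # mean per-token negative log-likelihood
    keep_fraction: float = 0.5,
) -> List[str]:
    # Perplexity = exp(mean NLL); rank documents and keep a slice of the
    # distribution (which slice works best is an empirical question).
    ranked = sorted(docs, key=lambda d: math.exp(nll(d)))
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]  # e.g. keep the lowest-perplexity half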
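And Chain of Density is, at heart, a single prompt that asks the model to rewrite its summary several times, each pass folding in missing entities without growing the length. A paraphrased sketch (the paper's exact wording differs; `call_llm` stands in for any text-in/text-out LLM client):

```python
# Paraphrased Chain-of-Density-style prompt; the paper's exact wording differs.
COD_PROMPT = """Article: {article}

You will generate increasingly concise, entity-dense summaries of the
article above. Repeat the following two steps 5 times:
Step 1: Identify 1-3 informative entities from the article that are
missing from the previously generated summary.
Step 2: Write a new, denser summary of identical length that covers every
entity from the previous summary plus the missing ones. Make space by
fusing phrases and removing filler, never by dropping entities.
Answer in JSON: a list of dicts with keys "Missing_Entities" and
"Denser_Summary"."""

def chain_of_density(article: str, call_llm) -> str:
    """call_llm is a placeholder for any chat-completion API."""
    return call_llm(COD_PROMPT.format(article=article))
```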
Contact: [email protected]
Timestamps:
00:34 Introduction
01:45 38TB of data accidentally exposed by Microsoft AI researchers
03:05 UK focuses on transparency and access with new AI principles
04:40 Jason Wei Tweet on the role of task-specific LLMs
06:11 Fake sponsor
07:43 Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding
09:20 When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
10:55 From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
12:37 Outro
