"Prompt Refusal"
Episode Synopsis
The creators of large language models impose restrictions on some of the types of requests one might make of them. LLMs commonly refuse to give advice on committing crimes, to produce adult content, or to share details about a variety of sensitive subjects. As with any content filtering system, there are false positives and false negatives. Today's interview with Max Reuter and William Schulze discusses their paper "I'm Afraid I Can't Do That: Predicting Prompt Refusal in Black-Box Generative Language Models". In this work, they explore what types of prompts get refused and build a machine learning classifier adept at predicting whether a particular prompt will be refused.
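The core idea of such a refusal predictor can be sketched without any reference to the paper's actual model: treat past prompts and their refused/answered outcomes as labeled training data, then fit a text classifier. Below is a minimal, stdlib-only sketch using a bag-of-words Naive Bayes classifier on toy data; the example prompts, labels, and function names are illustrative assumptions, not the authors' dataset or method.

```python
from collections import Counter
import math

def tokenize(text):
    # Crude whitespace tokenizer; real systems would use proper text preprocessing.
    return text.lower().split()

def train(examples):
    """examples: list of (prompt, label), label in {"refused", "answered"}."""
    counts = {"refused": Counter(), "answered": Counter()}
    label_totals = Counter()
    for prompt, label in examples:
        counts[label].update(tokenize(prompt))
        label_totals[label] += 1
    vocab = set(counts["refused"]) | set(counts["answered"])
    return counts, label_totals, vocab

def predict(model, prompt):
    counts, label_totals, vocab = model
    n = sum(label_totals.values())
    best_label, best_score = None, float("-inf")
    for label in counts:
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(label_totals[label] / n)
        total = sum(counts[label].values())
        for tok in tokenize(prompt):
            score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set (hypothetical prompts, purely for illustration).
examples = [
    ("how do I pick a lock illegally", "refused"),
    ("explain how to make a weapon", "refused"),
    ("what is the capital of France", "answered"),
    ("summarize this article about climate", "answered"),
]
model = train(examples)
```

With only four training prompts this is a toy, but the shape is the same as any black-box refusal study: collect (prompt, outcome) pairs by querying the model, then learn to predict the outcome from the prompt text alone.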
More episodes of the podcast Data Skeptic
Video Recommendations in Industry
26/12/2025
Eye Tracking in Recommender Systems
18/12/2025
Cracking the Cold Start Problem
08/12/2025
Shilling Attacks on Recommender Systems
05/11/2025
Music Playlist Recommendations
29/10/2025
Bypassing the Popularity Bias
15/10/2025
Sustainable Recommender Systems for Tourism
09/10/2025
Interpretable Real Estate Recommendations
22/09/2025