Adversarial Examples and Data Modelling - Andrew Ilyas (MIT)

22/08/2024 · 1h 28min

Listen "Adversarial Examples and Data Modelling - Andrew Ilyas (MIT)"

Episode Synopsis

We speak with Andrew Ilyas, a PhD student at MIT who is about to start as a professor at CMU. We discuss: data modeling and understanding how training datasets influence model predictions; adversarial examples in machine learning and why they occur; robustness in machine learning models; black-box attacks on machine learning systems; biases in data collection and dataset creation, particularly in ImageNet; and self-selection bias in data and methods to address it.
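
For listeners new to the datamodels work discussed in the episode, the core idea can be sketched in a few lines: train many models on random subsets of the training set, record each model's output on a fixed test example, and fit a sparse linear surrogate from subset-membership indicators to that output. The sketch below is a simplified illustration, not the paper's actual pipeline; `train_and_eval` is a hypothetical stand-in you would supply.

import numpy as np
from sklearn.linear_model import Lasso

def estimate_datamodel(train_and_eval, n_train, n_subsets=1000, frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    masks = np.zeros((n_subsets, n_train))
    outputs = np.zeros(n_subsets)
    for i in range(n_subsets):
        subset = rng.random(n_train) < frac  # keep roughly `frac` of the training set
        masks[i] = subset
        # Hypothetical stand-in: trains a model on the given training indices
        # and returns its output (e.g. correct-class margin) on one fixed
        # test example.
        outputs[i] = train_and_eval(np.flatnonzero(subset))
    # Sparse linear regression: which training points drive this prediction?
    surrogate = Lasso(alpha=1e-3).fit(masks, outputs)
    return surrogate.coef_  # one weight per training example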

MLST is sponsored by Brave:
The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval-augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api
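
As a rough illustration of calling the Brave Search API, here is a minimal sketch against Brave's public web-search endpoint; the BRAVE_API_KEY environment variable and the response-parsing keys are assumptions to check against the current docs.

import os
import requests

API_URL = "https://api.search.brave.com/res/v1/web/search"

def brave_search(query: str, count: int = 5) -> list[dict]:
    # Send an authenticated web-search request and return the result list.
    response = requests.get(
        API_URL,
        headers={
            "Accept": "application/json",
            "X-Subscription-Token": os.environ["BRAVE_API_KEY"],  # assumed env var
        },
        params={"q": query, "count": count},
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("web", {}).get("results", [])

if __name__ == "__main__":
    for result in brave_search("adversarial examples machine learning"):
        print(result["title"], "-", result["url"])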

Andrew's site:
https://andrewilyas.com/
https://x.com/andrew_ilyas

TOC:
00:00:00 - Introduction and Andrew's background
00:03:52 - Overview of the machine learning pipeline
00:06:31 - Data modeling paper discussion
00:26:28 - TRAK: Evolution of data modeling work
00:43:58 - Discussion on abstraction, reasoning, and neural networks
00:53:16 - "Adversarial Examples Are Not Bugs, They Are Features" paper
01:03:24 - Types of features learned by neural networks
01:10:51 - Black box attacks paper
01:15:39 - Work on data collection and bias
01:25:48 - Future research plans and closing thoughts

References:
Adversarial Examples Are Not Bugs, They Are Features
https://arxiv.org/pdf/1905.02175

TRAK: Attributing Model Behavior at Scale
https://arxiv.org/pdf/2303.14186

Datamodels: Predicting Predictions from Training Data
https://arxiv.org/pdf/2202.00622

ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
https://arxiv.org/pdf/1811.12231

ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models
https://arxiv.org/pdf/1708.03999

A Spline Theory of Deep Networks
https://proceedings.mlr.press/v80/balestriero18b/balestriero18b.pdf

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
https://transformer-circuits.pub/2024/scaling-monosemanticity/

Adversarial Examples Are Not Bugs, They Are Features (blog post)
https://gradientscience.org/adv/

Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
https://proceedings.mlr.press/v235/bartoldson24a.html

Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors
https://arxiv.org/abs/1807.07978

Estimation of Standard Auction Models
https://arxiv.org/abs/2205.02060

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
https://arxiv.org/abs/2005.11295

What Makes A Good Fisherman? Linear Regression under Self-Selection Bias
https://arxiv.org/abs/2205.03246

Towards Tracing Factual Knowledge in Language Models Back to the Training Data (Akyürek et al.)
https://arxiv.org/pdf/2205.11482
