Listen "Why is autograd so complicated"
Episode Synopsis
Why is autograd so complicated? What are the constraints and features that go into making it complicated? What's up with it being written in C++? What's with derivatives.yaml and code generation? What's going on with views and mutation? What's up with hooks and anomaly mode? What's reentrant execution, and why is it relevant to checkpointing? What's the distributed autograd engine?

Further reading:
- Autograd notes in the docs: https://pytorch.org/docs/stable/notes/autograd.html
- derivatives.yaml: https://github.com/pytorch/pytorch/blob/master/tools/autograd/derivatives.yaml
- Paper on the autograd engine in PyTorch: https://openreview.net/pdf/25b8eee6c373d48b84e5e9c6e10e7cbbbce4ac73.pdf
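For readers who want a concrete handle on a few of the features named above (hooks, anomaly mode, and reentrant vs. non-reentrant checkpointing), here is a minimal sketch using public PyTorch APIs. It is not taken from the episode; it only shows where these knobs live in user code.

```python
# A minimal sketch (not from the episode) of a few autograd features the
# questions above refer to: tensor hooks, anomaly mode, and checkpointing.
import torch
from torch.utils.checkpoint import checkpoint

x = torch.randn(4, requires_grad=True)

# Hooks: observe or rewrite a gradient as it flows backward through the graph.
y = x * 2
y.register_hook(lambda grad: grad * 0.5)  # halves the gradient arriving at y

# Anomaly mode: adds bookkeeping so errors/NaNs in backward point back to the
# forward op that produced them (noticeably slower, debugging only).
with torch.autograd.detect_anomaly():
    y.sum().backward()

# Checkpointing: drop intermediate activations and recompute the forward
# during backward to save memory. use_reentrant=True recomputes via the
# reentrant execution path discussed in the episode; use_reentrant=False
# uses the newer non-reentrant implementation.
def block(t):
    return torch.sin(t) * torch.cos(t)

z = checkpoint(block, x, use_reentrant=False)
z.sum().backward()
```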
More episodes of the PyTorch Developer Podcast
Compiler collectives (04/08/2024)
TORCH_TRACE and tlparse (29/04/2024)
Higher order operators (21/04/2024)
Inductor - Post-grad FX passes (12/04/2024)
CUDA graph trees (24/03/2024)
Min-cut partitioner (17/03/2024)
AOTInductor (02/03/2024)
Tensor subclasses and PT2 (24/02/2024)
Compiled autograd (19/02/2024)
PT2 extension points (05/02/2024)