Listen "AOTInductor"
Episode Synopsis
AOTInductor is a feature in PyTorch that lets you export an inference model into a self-contained dynamic library, which can subsequently be loaded and used to run optimized inference. It is aimed primarily at CUDA and CPU inference applications, for situations when your model export once to be exported once while your runtime may still get continuous updates. One of the big underlying organizing principles is a limited ABI which does not include libtorch, which allows these libraries to stay stable over updates to the runtime. There are many export-like use cases you might be interested in using AOTInductor for, and some of the pieces should be useful, but AOTInductor does not necessarily solve them.
More episodes of the podcast PyTorch Developer Podcast
Compiler collectives
04/08/2024
TORCH_TRACE and tlparse
29/04/2024
Higher order operators
21/04/2024
Inductor - Post-grad FX passes
12/04/2024
CUDA graph trees
24/03/2024
Min-cut partitioner
17/03/2024
Tensor subclasses and PT2
24/02/2024
Compiled autograd
19/02/2024
PT2 extension points
05/02/2024
Inductor - Define-by-run IR
24/01/2024
ZARZA We are Zarza, the prestigious firm behind major projects in information technology.