Stable Diffusion

14/01/2025 20 min

Episode Synopsis

Diffusion models are generative models that learn to create data by reversing a process that gradually adds noise to training samples. Stable Diffusion uses a U-Net as its denoising network, so the input and output have the same shape. It conditions that network on text prompts through CLIP text embeddings fed into cross-attention layers, and it runs the diffusion process in a compressed latent space rather than in pixel space for efficiency. These models can be adapted for video generation by adding temporal layers or by using 3D U-Nets. More generally, the diffusion process can be conditioned on text or on other inputs, which is a key feature of these models.
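The noising process mentioned in the synopsis can be sketched in a few lines of plain Python. This is a minimal illustration, not Stable Diffusion's actual code: the linear beta schedule, the 1000-step count, the toy four-value "sample", and every function name here are assumptions chosen for the example. It shows the standard closed-form noising step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, and why predicting the noise is enough to undo it.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Assumed schedule for illustration (requires num_steps >= 2).
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    # alpha_bar_t is the running product of (1 - beta_s) for s <= t.
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    # Forward process: jump straight from the clean sample to step t.
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return x_t, noise

def recover_x0(x_t, noise, t, alpha_bars):
    # Inverts the forward step when the noise is known. A trained network
    # would supply a *predicted* noise here; we use the true noise instead.
    a = alpha_bars[t]
    return [(x - math.sqrt(1.0 - a) * n) / math.sqrt(a) for x, n in zip(x_t, noise)]

rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -1.0, 0.25, 0.0]          # toy stand-in for a latent vector
x_t, noise = add_noise(x0, 500, alpha_bars, rng)
x0_hat = recover_x0(x_t, noise, 500, alpha_bars)
```

In a real model the true noise is not available at generation time, so a network (the U-Net, in Stable Diffusion's case) is trained to predict it from the noisy input, the timestep, and any conditioning such as a text embedding.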

More episodes of the podcast Large Language Model (LLM) Talk