Vision-Language Models, Arithmetic Transformers, Next-Gen Video Editing:

29/05/2024 10 min Episode 37

Episode Synopsis


An Introduction to Vision-Language Modeling

Transformers Can Do Arithmetic with the Right Embeddings

Matryoshka Multimodal Models

I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

Zamba: A Compact 7B SSM Hybrid Model

Looking Backward: Streaming Video-to-Video Translation with Feature Banks