How I Understand Diffusion Models

Jia-Bin Huang
Diffusion models are powerful generative models that enable many successful applications like image, video, and 3D generation from text. In this tutorial, I share my understanding of the diffusion model basics, including training, guidance, resolution, and speed. Below are some other great resources to learn more about diffusion models.

===== Slides =====
Here are the slides used in this video:
Training: bit.ly/3WudEPH
Guidance: bit.ly/3wedCky
Resolution: bit.ly/4bqxHmo
Speed: bit.ly/4bpJzoJ

===== Tutorials =====
[CVPR 2022 Tutorial] Denoising Diffusion-based Generative Modeling: Foundations and Applications
cvpr2022-tutorial-diffusion-models.github.io/
[CVPR 2023 Tutorial] Denoising Diffusion Models: A Generative Learning Big Bang
cvpr2023-tutorial-diffusion-models.github.io/
[A short course by DeepLearning.AI] How Diffusion Models Work
• How Diffusion Models Work: A short co...

===== Training =====
[Sohl-Dickstein et al. 2015] Deep Unsupervised Learning using Nonequilibrium Thermodynamics
arxiv.org/abs/1503.03585
[Ho et al. 2020] Denoising Diffusion Probabilistic Models
arxiv.org/abs/2006.11239
[Luo 2022] Understanding Diffusion Models: A Unified Perspective
arxiv.org/abs/2208.11970
[Karras et al. 2022] Elucidating the Design Space of Diffusion-Based Generative Models
arxiv.org/abs/2206.00364
[Karras et al. 2023] Analyzing and Improving the Training Dynamics of Diffusion Models
arxiv.org/abs/2312.02696

===== Guidance =====
[Dhariwal and Nichol 2021] Diffusion Models Beat GANs on Image Synthesis
arxiv.org/abs/2105.05233
[Ho and Salimans 2022] Classifier-Free Diffusion Guidance
arxiv.org/abs/2207.12598
[Sander Dieleman 2022] Guidance: a cheat code for diffusion models
sander.ai/2022/05/26/guidance.html
[Sander Dieleman 2023] The geometry of diffusion guidance
sander.ai/2023/08/28/geometry.html

===== Resolution =====
[Ho et al. 2021] Cascaded Diffusion Models for High Fidelity Image Generation
arxiv.org/abs/2106.15282
[Saharia et al. 2022] Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
arxiv.org/abs/2205.11487
[Rombach et al. 2021] High-Resolution Image Synthesis with Latent Diffusion Models
arxiv.org/abs/2112.10752
[Vahdat et al. 2021] Score-based Generative Modeling in Latent Space
proceedings.neurips.cc/paper_files/paper/2021/hash…
[Podell et al. 2023] SDXL: Improving Latent Diffusion Models for High-resolution Image Synthesis
arxiv.org/abs/2307.01952
[Hoogeboom et al. 2023] Simple diffusion: End-to-end diffusion for high resolution images
arxiv.org/abs/2301.11093
[Chen et al. 2023] On the Importance of Noise Scheduling for Diffusion Models
arxiv.org/abs/2301.10972
[Gu et al. 2023] Matryoshka Diffusion Models
arxiv.org/abs/2310.15111

===== Speed =====
[Song et al. 2021] Denoising Diffusion Implicit Models
arxiv.org/abs/2010.02502
[Salimans and Ho 2022] Progressive Distillation for Fast Sampling of Diffusion Models
arxiv.org/abs/2202.00512
[Meng et al. 2023] On Distillation of Guided Diffusion Models
arxiv.org/abs/2210.03142
[Song et al. 2023] Consistency Models
arxiv.org/abs/2303.01469
[Luo et al. 2023] Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
arxiv.org/abs/2310.04378
[Luo et al. 2023] LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
arxiv.org/abs/2311.05556
[Sauer et al. 2023] Adversarial Diffusion Distillation
arxiv.org/abs/2311.17042
[Yin et al. 2023] One-step Diffusion with Distribution Matching Distillation
arxiv.org/abs/2311.18828
Published 9 months ago, on 1402/10/18.
Viewed 31,667 times.