Abstract: |
Physics-inspired generative models, such as diffusion models, constitute a powerful family of generative models whose advantages stem from a relatively stable training process and high model capacity. Nevertheless, substantial room for improvement remains. In this talk, I will discuss the enhancement and design of physics-inspired generative models. I will first present a sampling algorithm that combines the strengths of previous samplers, greatly accelerating the generation speed of text-to-image Stable Diffusion models. I will also discuss sampling methods that promote diversity among finite sets of samples by adding mutual repulsion forces between samples during the generative process. Second, I will discuss a training framework that introduces learnable discrete latents into continuous diffusion models; these latents simplify complex noise-to-data mappings and reduce the curvature of generative trajectories. Finally, I will introduce Poisson Flow Generative Models (PFGM), a new generative model arising from electrostatic theory that rivals leading diffusion models. Its extended version, PFGM++, places diffusion models and PFGM under the same framework and yields new, stronger models. Several of the algorithms discussed in this talk are state-of-the-art across standard benchmarks. |
Biography: |
Yilun Xu is an incoming research scientist at NVIDIA Research. He obtained his Ph.D. from MIT CSAIL in 2024 and his B.S. from Peking University in 2020. His research focuses on machine learning, with a current emphasis on a new family of physics-inspired generative models, as well as the development of training and sampling algorithms for diffusion models. Previously, his research aimed at bridging information theory and machine learning. |