neonsign's Collections

Diffusion

 - Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models • Paper 2312.09608 • 16
 - CodeFusion: A Pre-trained Diffusion Model for Code Generation • Paper 2310.17680 • 73
 - ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image • Paper 2310.17994 • 8
 - Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss • Paper 2401.02677 • 23
 - PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models • Paper 2401.05252 • 49
 - InstantID: Zero-shot Identity-Preserving Generation in Seconds • Paper 2401.07519 • 57
 - Towards A Better Metric for Text-to-Video Generation • Paper 2401.07781 • 15
 - Quantum Denoising Diffusion Models • Paper 2401.07049 • 14
 - SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers • Paper 2401.08740 • 14
 - CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects • Paper 2401.09962 • 9
 - DiffusionGPT: LLM-Driven Text-to-Image Generation System • Paper 2401.10061 • 31
 - ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation • Paper 2312.02201 • 35
 - Clockwork Diffusion: Efficient Generation With Model-Step Distillation • Paper 2312.08128 • 15
 - Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs • Paper 2401.11708 • 30
 - Lumiere: A Space-Time Diffusion Model for Video Generation • Paper 2401.12945 • 86
 - Large-scale Reinforcement Learning for Diffusion Models • Paper 2401.12244 • 29
 - Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All • Paper 2401.13795 • 68
 - Deconstructing Denoising Diffusion Models for Self-Supervised Learning • Paper 2401.14404 • 18
 - BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models • Paper 2401.13974 • 14
 - Transfer Learning for Text Diffusion Models • Paper 2401.17181 • 17
 - Training-Free Consistent Text-to-Image Generation • Paper 2402.03286 • 67
 - Paper 2402.03570 • 8
 - λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space • Paper 2402.05195 • 19
 - Implicit Diffusion: Efficient Optimization through Stochastic Sampling • Paper 2402.05468 • 7
 - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation • Paper 2402.10210 • 35
 - Paper 2402.09470 • 14
 - DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization • Paper 2402.09812 • 16
 - Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation • Paper 2402.10491 • 18
 - FiT: Flexible Vision Transformer for Diffusion Model • Paper 2402.12376 • 48
 - DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation • Paper 2402.11929 • 11
 - Paper 2402.13144 • 98
 - MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction • Paper 2402.12712 • 18
 - SDXL-Lightning: Progressive Adversarial Diffusion Distillation • Paper 2402.13929 • 27
 - T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching • Paper 2402.14167 • 12
 - Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation • Paper 2402.17245 • 12
 - Trajectory Consistency Distillation • Paper 2402.19159 • 16
 - DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models • Paper 2402.19481 • 22
 - RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization • Paper 2403.00483 • 15
 - StableDrag: Stable Dragging for Point-based Image Editing • Paper 2403.04437 • 29
 - PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation • Paper 2403.04692 • 41
 - Pix2Gif: Motion-Guided Diffusion for GIF Generation • Paper 2403.04634 • 18
 - Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation • Paper 2403.12015 • 70
 - AnimateDiff-Lightning: Cross-Model Diffusion Distillation • Paper 2403.12706 • 18
 - Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation • Paper 2403.16990 • 25
 - SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions • Paper 2403.16627 • 21
 - FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing • Paper 2403.18605 • 11
 - Bigger is not Always Better: Scaling Properties of Latent Diffusion Models • Paper 2404.01367 • 22
 - On the Scalability of Diffusion-based Text-to-Image Generation • Paper 2404.02883 • 19
 - InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation • Paper 2404.02733 • 22
 - Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models • Paper 2404.02747 • 13
 - Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition • Paper 2404.02514 • 11
 - Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction • Paper 2404.02905 • 74
 - CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching • Paper 2404.03653 • 36
 - ByteEdit: Boost, Comply and Accelerate Generative Image Editing • Paper 2404.04860 • 26
 - UniFL: Improve Stable Diffusion via Unified Feedback Learning • Paper 2404.05595 • 25
 - BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion • Paper 2404.04544 • 23
 - Aligning Diffusion Models by Optimizing Human Utility • Paper 2404.04465 • 15
 - Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models • Paper 2404.04478 • 13
 - SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing • Paper 2404.05717 • 26
 - RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion • Paper 2404.07199 • 27
 - Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model • Paper 2404.09967 • 21
 - Long-form music generation with latent diffusion • Paper 2404.10301 • 27
 - EdgeFusion: On-Device Text-to-Image Generation • Paper 2404.11925 • 23
 - Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis • Paper 2404.13686 • 28
 - Align Your Steps: Optimizing Sampling Schedules in Diffusion Models • Paper 2404.14507 • 23
 - Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings • Paper 2404.16820 • 17
 - Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation • Paper 2404.19752 • 24
 - StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation • Paper 2405.01434 • 56
 - Customizing Text-to-Image Models with a Single Image Pair • Paper 2405.01536 • 22
 - Diffusion for World Modeling: Visual Details Matter in Atari • Paper 2405.12399 • 30
 - EM Distillation for One-step Diffusion Models • Paper 2405.16852 • 12
 - Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling • Paper 2405.21048 • 16
 - Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step • Paper 2406.04314 • 30
 - BitsFusion: 1.99 bits Weight Quantization of Diffusion Model • Paper 2406.04333 • 38
 - MLCM: Multistep Consistency Distillation of Latent Diffusion Model • Paper 2406.05768 • 13
 - AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising • Paper 2406.06911 • 12
 - Interpreting the Weight Space of Customized Diffusion Models • Paper 2406.09413 • 20
 - Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models • Paper 2406.09416 • 29
 - Make It Count: Text-to-Image Generation with an Accurate Number of Objects • Paper 2406.10210 • 78
 - Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models • Paper 2406.11831 • 22
 - Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models • Paper 2406.12042 • 8
 - Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment • Paper 2406.12303 • 4
 - Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps • Paper 2406.14539 • 27
 - Repulsive Score Distillation for Diverse Sampling of Diffusion Models • Paper 2406.16683 • 4
 - Aligning Diffusion Models with Noise-Conditioned Perception • Paper 2406.17636 • 27
 - Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion • Paper 2407.01392 • 45
 - RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models • Paper 2407.06938 • 25
 - Video Diffusion Alignment via Reward Gradients • Paper 2407.08737 • 49
 - MambaVision: A Hybrid Mamba-Transformer Vision Backbone • Paper 2407.08083 • 32
 - Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models • Paper 2407.08701 • 13
 - DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection • Paper 2406.00856 • 12
 - Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model • Paper 2407.16982 • 42
 - BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation • Paper 2407.17952 • 32
 - Diffusion Feedback Helps CLIP See Better • Paper 2407.20171 • 36
 - Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning • Paper 2407.20798 • 24
 - Tora: Trajectory-oriented Diffusion Transformer for Video Generation • Paper 2407.21705 • 27
 - TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models • Paper 2408.00735 • 17
 - Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention • Paper 2408.00760 • 8
 - ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation • Paper 2408.02226 • 12
 - An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion • Paper 2408.03178 • 40
 - Diffusion Models as Data Mining Tools • Paper 2408.02752 • 14
 - Transformer Explainer: Interactive Learning of Text-Generative Models • Paper 2408.04619 • 172
 - Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models • Paper 2408.04594 • 15
 - Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion • Paper 2407.10973 • 11
 - Visual Text Generation in the Wild • Paper 2407.14138 • 9
 - Paper 2408.07009 • 62
 - DC3DO: Diffusion Classifier for 3D Objects • Paper 2408.06693 • 11
 - Paper 2408.07116 • 20