---
base_model:
- stabilityai/stable-diffusion-3.5-medium
library_name: diffusers
pipeline_tag: text-to-image
---

# Model Card

## Model Details

### Model Description
This is a reproduced LoRA for SD3.5-Medium, post-trained with DiffusionNFT on multiple reward models, as presented in the paper [Diffusion Negative-aware FineTuning (DiffusionNFT)](https://huggingface.co/papers/2509.16117).

### Paper Abstract
Online reinforcement learning (RL) has been central to post-training language
models, but its extension to diffusion models remains challenging due to
intractable likelihoods. Recent works discretize the reverse sampling process
to enable GRPO-style training, yet they inherit fundamental drawbacks,
including solver restrictions, forward-reverse inconsistency, and complicated
integration with classifier-free guidance (CFG). We introduce Diffusion
Negative-aware FineTuning (DiffusionNFT), a new online RL paradigm that
optimizes diffusion models directly on the forward process via flow matching.
DiffusionNFT contrasts positive and negative generations to define an implicit
policy improvement direction, naturally incorporating reinforcement signals
into the supervised learning objective. This formulation enables training with
arbitrary black-box solvers, eliminates the need for likelihood estimation, and
requires only clean images rather than sampling trajectories for policy
optimization. DiffusionNFT is up to 25× more efficient than FlowGRPO in
head-to-head comparisons, while being CFG-free. For instance, DiffusionNFT
improves the GenEval score from 0.24 to 0.98 within 1k steps, while FlowGRPO
achieves 0.95 with over 5k steps and additional CFG employment. By leveraging
multiple reward models, DiffusionNFT significantly boosts the performance of
SD3.5-Medium in every benchmark tested.

### Model Sources

- **Repository:** https://github.com/NVlabs/DiffusionNFT
- **Paper:** https://huggingface.co/papers/2509.16117
- **Project Page:** https://research.nvidia.com/labs/dir/DiffusionNFT

## Uses

For usage, refer to the evaluation script in the GitHub repository: https://github.com/NVlabs/DiffusionNFT.
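
As a rough illustration only, the sketch below shows one way a LoRA for SD3.5-Medium might be attached to the base model with `diffusers`. It is not the authors' evaluation code. It assumes the weights are stored in a format that `load_lora_weights` accepts (a PEFT-style adapter would need a different loading path), and `lora_path` and the step count are hypothetical placeholders. The evaluation script in the repository remains the authoritative reference.

```python
from dataclasses import dataclass


@dataclass
class GenerationConfig:
    # Base model named in this card's metadata.
    base_model: str = "stabilityai/stable-diffusion-3.5-medium"
    # Hypothetical placeholder: replace with this LoRA's repo id or a local path.
    lora_path: str = "path/to/diffusionnft-lora"
    # Illustrative value, not taken from the paper or the evaluation script.
    num_inference_steps: int = 40
    # StableDiffusion3Pipeline applies CFG only when guidance_scale > 1,
    # so 1.0 samples without CFG, in line with the abstract's CFG-free setting.
    guidance_scale: float = 1.0


def pipeline_kwargs(cfg: GenerationConfig) -> dict:
    """Collect the sampling arguments passed to the pipeline call."""
    return {
        "num_inference_steps": cfg.num_inference_steps,
        "guidance_scale": cfg.guidance_scale,
    }


def generate(prompt: str, cfg: GenerationConfig):
    """Load the base model, attach the LoRA, and sample one image."""
    # Heavy imports stay local so the config helpers work without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        cfg.base_model, torch_dtype=torch.bfloat16
    )
    # Assumption: the LoRA weights are in a format diffusers can load directly.
    pipe.load_lora_weights(cfg.lora_path)
    pipe = pipe.to("cuda")
    return pipe(prompt, **pipeline_kwargs(cfg)).images[0]
```

With a CUDA device available and a valid `lora_path`, something like `generate("a photo of a red bench", GenerationConfig(lora_path="..."))` would return a PIL image that can be saved with `.save("out.png")`.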