---
language:
- en
- zh
library_name: diffusers
license: apache-2.0
pipeline_tag: image-to-image
---

# Edit-R1-Qwen-Image-Edit-2509: A model from UniWorld-V2

This model is a checkpoint (`Edit-R1-Qwen-Image-Edit-2509`) trained with the **Edit-R1** framework, introduced in the paper [UniWorld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback](https://huggingface.co/papers/2510.16888). Edit-R1 reinforces instruction-based image editing by combining Diffusion Negative-aware Finetuning (DiffusionNFT) with implicit feedback from a multimodal large language model (MLLM).

<p align="center">
        <a href="https://huggingface.co/papers/2510.16888"><b>Paper</b></a> | <a href="https://github.com/PKU-YuanGroup/UniWorld-V2"><b>Code</b></a> | <a href="https://github.com/PKU-YuanGroup/Edit-R1"><b>Dataset</b></a>
</p>


# Performance
| Benchmark | Qwen-Image-Edit-2509 | **Edit-R1-Qwen-Image-Edit-2509** |
| ---- | ---- | ---- |
| GEdit-Bench | 7.54 | **7.76** |
| ImgEdit | 4.35 | **4.48** |

# Usage

```python
import os

import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline

# Load the base Qwen-Image-Edit-2509 pipeline in bfloat16.
pipeline = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", torch_dtype=torch.bfloat16
)
print("pipeline loaded")

# Apply the Edit-R1 LoRA weights on top of the base model.
pipeline.load_lora_weights(
    "chestnutlzj/Edit-R1-Qwen-Image-Edit-2509",
    adapter_name="lora",
)
pipeline.set_adapters(["lora"], adapter_weights=[1.0])

pipeline.to("cuda")
pipeline.set_progress_bar_config(disable=None)

# Two reference images are edited together according to the instruction.
image1 = Image.open("input1.png")
image2 = Image.open("input2.png")
prompt = "The magician bear is on the left, the alchemist bear is on the right, facing each other in the central park square."
inputs = {
    "image": [image1, image2],
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 40,
    "guidance_scale": 1.0,
    "num_images_per_prompt": 1,
}

with torch.inference_mode():
    output = pipeline(**inputs)
    output_image = output.images[0]
    output_image.save("output_image_edit_plus.png")
    print("image saved at", os.path.abspath("output_image_edit_plus.png"))
```
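
If GPU memory is limited, the standard `diffusers` offloading hooks also work with this pipeline. The snippet below is a minimal sketch assuming the same `pipeline` object from the example above; note that `enable_model_cpu_offload()` replaces the explicit `pipeline.to("cuda")` call, and the `0.8` adapter weight is an illustrative value, not a recommendation from the authors.

```python
# Minimal memory-saving sketch (assumes `pipeline` from the example above).
# Offload submodules to CPU between forward passes instead of keeping the
# whole pipeline on the GPU; do NOT also call pipeline.to("cuda").
pipeline.enable_model_cpu_offload()

# The LoRA strength can be scaled via adapter_weights; 1.0 reproduces the
# setting in the usage example, smaller values weaken the Edit-R1 adapter.
pipeline.set_adapters(["lora"], adapter_weights=[0.8])
```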