---
license: apache-2.0
base_model:
- Wan-AI/Wan2.1-T2V-14B
pipeline_tag: text-to-video
---
<div align="center">
  <h1>
    Wan-Alpha
  </h1>
  <h3>Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel</h3>
  [Paper](https://arxiv.org/pdf/2509.24979)
  [Project Page](https://donghaotian123.github.io/Wan-Alpha/)
  [Code](https://github.com/WeChatCV/Wan-Alpha)
  [Model](https://huggingface.co/htdong/Wan-Alpha)
  [ComfyUI Model](https://huggingface.co/htdong/Wan-Alpha_ComfyUI)
</div>
<img src="assets/teaser.png" alt="Wan-Alpha Qualitative Results" style="max-width: 100%; height: auto;">
> Qualitative results of video generation using **Wan-Alpha**. Our model successfully generates various scenes with accurate and clearly rendered transparency. Notably, it can synthesize diverse semi-transparent objects, glowing effects, and fine-grained details such as hair.
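The alpha channel produced alongside each video is what makes the generated footage directly compositable. As a minimal sketch (not part of the released code), the standard per-pixel "over" operation blends a straight-alpha foreground onto any background:

```python
def composite_over(fg, alpha, bg):
    """Blend a straight-alpha foreground pixel over a background pixel.

    fg, bg: (r, g, b) tuples with components in [0, 1]
    alpha:  foreground opacity in [0, 1] (from the generated alpha video)
    """
    return tuple(f * alpha + b * (1.0 - alpha) for f, b in zip(fg, bg))

# A 50%-opaque white pixel over a black background yields mid-grey.
print(composite_over((1.0, 1.0, 1.0), 0.5, (0.0, 0.0, 0.0)))
```

Applying this per pixel and per frame places the generated subject (girl, bubbles, glowing effects) on an arbitrary backdrop, which is the intended use of the preview/alpha video pair shown above.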
---
## 🔥 News
* **[2025.09.30]** Released Wan-Alpha v1.0. The Wan2.1-T2V-14B-adapted weights and inference code are now open-sourced.
---
## 🌟 Showcase
### Text-to-Video Generation with Alpha Channel
| Prompt | Preview Video | Alpha Video |
| :---: | :---: | :---: |
| "Medium shot. A little girl holds a bubble wand and blows out colorful bubbles that float and pop in the air. The background of this video is transparent. Realistic style." | <img src="assets/girl.gif" width="320" height="180" style="object-fit:contain; display:block; margin:auto;"/> | <img src="assets/girl_pha.gif" width="335" height="180" style="object-fit:contain; display:block; margin:auto;"/> |
### For more results, please visit [Our Website](https://donghaotian123.github.io/Wan-Alpha/)
## 🚀 Quick Start
Please see the [GitHub repository](https://github.com/WeChatCV/Wan-Alpha) for installation and inference instructions.
## 🤝 Acknowledgements
This project is built upon the following excellent open-source projects:
* [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) (training/inference framework)
* [Wan2.1](https://github.com/Wan-Video/Wan2.1) (base video generation model)
* [LightX2V](https://github.com/ModelTC/LightX2V) (inference acceleration)
* [WanVideo_comfy](https://huggingface.co/Kijai/WanVideo_comfy) (ComfyUI-compatible weights)
We sincerely thank the authors and contributors of these projects.
---
## ✏ Citation
If you find our work helpful for your research, please consider citing our paper:
```bibtex
@misc{dong2025wanalpha,
      title={Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel}, 
      author={Haotian Dong and Wenjing Wang and Chen Li and Di Lin},
      year={2025},
      eprint={2509.24979},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.24979}, 
}
```
---
## 📬 Contact Us
If you have any questions or suggestions, feel free to reach out via [GitHub Issues](https://github.com/WeChatCV/Wan-Alpha/issues). We look forward to your feedback!