---
license: apache-2.0
base_model:
- Wan-AI/Wan2.1-T2V-14B
pipeline_tag: text-to-video
---
<div align="center">
<h1>
Wan-Alpha
</h1>
<h3>Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel</h3>
[![arXiv](https://img.shields.io/badge/arXiv-2509.24979-b31b1b)](https://arxiv.org/pdf/2509.24979)
[![Project Page](https://img.shields.io/badge/Project_Page-Link-green)](https://donghaotian123.github.io/Wan-Alpha/)
[![GitHub](https://img.shields.io/badge/GitHub-Repo-black?logo=github)](https://github.com/WeChatCV/Wan-Alpha)
[![🤗 HuggingFace](https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-orange)](https://huggingface.co/htdong/Wan-Alpha)
[![ComfyUI](https://img.shields.io/badge/ComfyUI-Version-blue)](https://huggingface.co/htdong/Wan-Alpha_ComfyUI)
</div>
<img src="assets/teaser.png" alt="Wan-Alpha Qualitative Results" style="max-width: 100%; height: auto;">
> Qualitative results of video generation using **Wan-Alpha**. Our model successfully generates various scenes with accurate and clearly rendered transparency. Notably, it can synthesize diverse semi-transparent objects, glowing effects, and fine-grained details such as hair.
---
## 🔥 News
* **[2025.09.30]** Released Wan-Alpha v1.0: the Wan2.1-T2V-14B-adapted weights and inference code are now open-sourced.
---
## 🌟 Showcase
### Text-to-Video Generation with Alpha Channel
| Prompt | Preview Video | Alpha Video |
| :---: | :---: | :---: |
| "Medium shot. A little girl holds a bubble wand and blows out colorful bubbles that float and pop in the air. The background of this video is transparent. Realistic style." | <img src="assets/girl.gif" width="320" height="180" style="object-fit:contain; display:block; margin:auto;"/> | <img src="assets/girl_pha.gif" width="335" height="180" style="object-fit:contain; display:block; margin:auto;"/> |
### For more results, please visit [our website](https://donghaotian123.github.io/Wan-Alpha/).
## 🚀 Quick Start
Please see the [GitHub repository](https://github.com/WeChatCV/Wan-Alpha) for installation and inference instructions.
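
As the showcase above illustrates, the model produces a preview (RGB) video together with a matching alpha-matte video. The sketch below is a minimal, unofficial example of compositing the two into RGBA frames; the file names and the single-channel matte assumption are placeholders, not part of the released pipeline.

```python
# Minimal sketch (not the official pipeline): merge a Wan-Alpha preview (RGB)
# video with its alpha-matte video into an RGBA PNG sequence.
# "preview.mp4" / "alpha.mp4" are hypothetical file names.
import imageio.v3 as iio
import numpy as np

rgb = iio.imread("preview.mp4")    # (T, H, W, 3) uint8 frames
alpha = iio.imread("alpha.mp4")    # (T, H, W, 3) uint8 grayscale matte

assert rgb.shape == alpha.shape, "preview and matte must match frame-for-frame"

for i, (frame, matte) in enumerate(zip(rgb, alpha)):
    rgba = np.dstack([frame, matte[..., 0]])  # use one matte channel as alpha
    iio.imwrite(f"frame_{i:04d}.png", rgba)   # PNG preserves the alpha channel
```

A PNG sequence with straight alpha drops cleanly onto any backdrop in a compositor or ComfyUI graph; see the GitHub repository for the output formats the official workflow actually emits.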
## ๐Ÿค Acknowledgements
This project is built upon the following excellent open-source projects:
* [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) (training/inference framework)
* [Wan2.1](https://github.com/Wan-Video/Wan2.1) (base video generation model)
* [LightX2V](https://github.com/ModelTC/LightX2V) (inference acceleration)
* [WanVideo_comfy](https://huggingface.co/Kijai/WanVideo_comfy) (ComfyUI-ready model weights)
We sincerely thank the authors and contributors of these projects.
---
## โœ Citation
If you find our work helpful for your research, please consider citing our paper:
```bibtex
@misc{dong2025wanalpha,
  title={Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel},
  author={Haotian Dong and Wenjing Wang and Chen Li and Di Lin},
  year={2025},
  eprint={2509.24979},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.24979},
}
```
---
## 📬 Contact Us
If you have any questions or suggestions, feel free to reach out via [GitHub Issues](https://github.com/WeChatCV/Wan-Alpha/issues). We look forward to your feedback!