	---
	license: mit
	language:
	- en
	base_model:
	- Wan-AI/Wan2.1-T2V-1.3B
	pipeline_tag: text-to-video
	tags:
	- Real-Time
	- Long-Video
	- Video-Diffusion-Model
	- Autoregressive
	---
	<p align="center">
	<h1 align="center">Rolling Forcing</h1>
	<h3 align="center">Autoregressive Long Video Diffusion in Real Time</h3>
	</p>
	<p align="center">
	<p align="center">
	<a href="https://kunhao-liu.github.io/">Kunhao Liu</a><sup>1</sup>
	·
	<a href="https://wbhu.github.io/">Wenbo Hu</a><sup>2</sup>
	·
	<a href="https://bluestyle97.github.io/">Jiale Xu</a><sup>2</sup>
	·
	<a href="http://www.linkedin.com/in/YingShanProfile">Ying Shan</a><sup>2</sup>
	·
	<a href="https://personal.ntu.edu.sg/shijian.lu/">Shijian Lu</a><sup>1</sup><br>
	<sup>1</sup>Nanyang Technological University <sup>2</sup>ARC Lab, Tencent PCG
	</p>
	<h3 align="center"><a href="https://arxiv.org/abs/2509.25161"><img src="https://img.shields.io/badge/ArXiv-Paper-brown"></a> <a href="https://kunhao-liu.github.io/Rolling_Forcing_Webpage/"><img src="https://img.shields.io/badge/Project-Webpage-bron"></a> <a href="https://github.com/TencentARC/RollingForcing"><img src="https://img.shields.io/badge/GitHub-Code-blue"></a> <a href="https://huggingface.co/TencentARC/RollingForcing"><img src="https://img.shields.io/badge/HuggingFace-Model-yellow"></a></h3>
	</p>


	## 💡 TL;DR: REAL-TIME streaming generation of MULTI-MINUTE videos
	<img src="https://github.com/user-attachments/assets/194bd647-508c-4dba-9ee9-979b54a0e230" />

	- 🚀 Real-Time at 16 FPS: Stream high-quality video directly from text on a single GPU.
	- 🎬 Multi-Minute Videos: Generate coherent, multi-minute sequences with dramatically reduced drift.
	- ⚙️ Rolling-Window Strategy: Denoise frames together in a rolling window for mutual refinement, breaking the chain of error accumulation.
	- 🧠 Long-Term Memory: The novel Attention Sink anchors your video, preserving global context over thousands of frames.
	- 🥇 State-of-the-Art Performance: Outperforms all comparable open-source models in quality and consistency.
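
The rolling-window and attention-sink ideas above can be pictured with a toy scheduler. This is a minimal sketch, not the code from the repository: the names `rolling_schedule` and `context_frames`, the window size of 5, and the sink/recent sizes are hypothetical, and integer "noise levels" stand in for real denoising steps. It assumes the frames in the window sit at staggered noise levels, so one joint pass finishes the oldest frame while a fresh noisy frame enters.

```python
from collections import deque


def context_frames(emitted, sink=3, recent=4):
    """Frames kept as attention context: the first `sink` frames plus the latest `recent`.

    The sink frames stay in context permanently (hypothetical sizes), which is
    the role the attention sink plays as a long-term anchor.
    """
    sinks = emitted[:sink]
    tail = [f for f in emitted[-recent:] if f not in sinks]
    return sinks + tail


def rolling_schedule(num_frames, window=5):
    """Return the frame emission order and the window contents after each step.

    Each window entry is (frame_index, noise_level); level 0 means clean.
    """
    in_flight = deque()          # frames currently being denoised together
    emitted, history = [], []
    next_frame = 0
    while len(emitted) < num_frames:
        # Admit one fully noisy frame while frames remain.
        if next_frame < num_frames:
            in_flight.append([next_frame, window])
            next_frame += 1
        # One joint pass: every frame in the window drops one noise level.
        for entry in in_flight:
            entry[1] -= 1
        history.append([tuple(e) for e in in_flight])
        # The oldest frame leaves the window once it is clean.
        if in_flight[0][1] == 0:
            emitted.append(in_flight.popleft()[0])
    return emitted, history


emitted, history = rolling_schedule(8, window=5)
print(emitted)                          # frames leave the window in order
print(history[4])                       # first step with a full window
print(context_frames(list(range(10))))  # sink frames plus most recent frames
```

In this toy version, once the window is full, each pass emits exactly one clean frame, and every frame has been refined alongside its neighbours for several passes before it is emitted.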


	## 📚 Citation

	If you find this codebase useful for your research, please cite our paper and consider giving the repo a ⭐️ on GitHub: https://github.com/TencentARC/RollingForcing

	```bibtex
	@article{liu2025rolling,
	title={Rolling Forcing: Autoregressive Long Video Diffusion in Real Time},
	author={Liu, Kunhao and Hu, Wenbo and Xu, Jiale and Shan, Ying and Lu, Shijian},
	journal={arXiv preprint arXiv:2509.25161},
	year={2025}
	}
	```