---
title: HunyuanWorld Demo
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: other
models:
  - black-forest-labs/FLUX.1-dev
  - tencent/HunyuanWorld-1
hardware: nvidia-t4-small
---
# HunyuanWorld-1.0 Demo Space

This is a Gradio demo for [Tencent-Hunyuan/HunyuanWorld-1.0](https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0), a one-stop solution for text-driven 3D scene generation.
## How to Use

1. **Panorama Generation**:
   - **Text-to-Panorama**: Enter a text prompt and generate a 360° panorama image.
   - **Image-to-Panorama**: Upload an image and provide a prompt to extend it into a panorama.
2. **Scene Generation**:
   - After generating a panorama, click "Send to Scene Generation".
   - Provide labels for foreground objects to be separated into layers.
   - Click "Generate 3D Scene" to create a 3D mesh from the panorama.

You can also drive the Space programmatically; a hedged sketch follows this list.
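The sketch below uses `gradio_client`. The Space id and `api_name` value are hypothetical placeholders, not documented in this README; check the Space's "Use via API" panel for the actual endpoint names.

```python
# A hedged sketch using gradio_client; the Space id and api_name are hypothetical
# placeholders -- look up the real ones in the Space's "Use via API" panel.
from gradio_client import Client

client = Client("your-username/HunyuanWorld-Demo")  # hypothetical Space id
panorama = client.predict(
    "a quiet alpine lake at sunrise",  # text prompt
    api_name="/text_to_panorama",      # hypothetical endpoint name
)
print(panorama)  # local path to the generated panorama file
```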
## Technical Details

This Space combines two core functionalities of the HunyuanWorld-1.0 model:
- **Panorama Generation**: Creates immersive 360° images from text or existing images.
- **3D Scene Reconstruction**: Decomposes a panorama into layers and reconstructs a 3D mesh.

This demo is running on an NVIDIA T4 GPU. Due to the size of the models, the initial startup may take a few minutes.

<p align="left">
  <img src="assets/arch.jpg">
</p>
### Performance

We evaluated HunyuanWorld 1.0 against other open-source panorama generation and 3D world generation methods. The numerical results indicate that HunyuanWorld 1.0 surpasses the baselines in visual quality and geometric consistency. A sketch of the CLIP-based scoring used in these tables follows the last table below.
<p align="center">
  Text-to-panorama generation
</p>

| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-T($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| Diffusion360 | 69.5 | 7.5 | 1.8 | 20.9 |
| MVDiffusion | 47.9 | 7.1 | 2.4 | 21.5 |
| PanFusion | 56.6 | 7.6 | 2.2 | 21.0 |
| LayerPano3D | 49.6 | 6.5 | 3.7 | 21.5 |
| HunyuanWorld 1.0 | 40.8 | 5.8 | 4.4 | 24.3 |
<p align="center">
  Image-to-panorama generation
</p>

| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-I($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| Diffusion360 | 71.4 | 7.8 | 1.9 | 73.9 |
| MVDiffusion | 47.7 | 7.0 | 2.7 | 80.8 |
| HunyuanWorld 1.0 | 45.2 | 5.8 | 4.3 | 85.1 |
<p align="center">
  Text-to-world generation
</p>

| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-T($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| Director3D | 49.8 | 7.5 | 3.2 | 23.5 |
| LayerPano3D | 35.3 | 4.8 | 3.9 | 22.0 |
| HunyuanWorld 1.0 | 34.6 | 4.3 | 4.2 | 24.0 |
<p align="center">
  Image-to-world generation
</p>

| Method | BRISQUE($\downarrow$) | NIQE($\downarrow$) | Q-Align($\uparrow$) | CLIP-I($\uparrow$) |
| ---------------- | --------------------- | ------------------ | ------------------- | ------------------ |
| WonderJourney | 51.8 | 7.3 | 3.2 | 81.5 |
| DimensionX | 45.2 | 6.3 | 3.5 | 83.3 |
| HunyuanWorld 1.0 | 36.2 | 4.6 | 3.9 | 84.5 |
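As a rough illustration of the CLIP-T column, the sketch below scores prompt-image agreement as the cosine similarity between CLIP text and image embeddings, scaled by 100. This is an assumption about the metric's general form; the exact model and evaluation protocol behind the tables above may differ.

```python
# A minimal CLIP-T-style score: cosine similarity between the prompt and the
# generated panorama in CLIP embedding space, scaled by 100.
# Assumption: the paper's exact protocol (CLIP variant, cropping, averaging) may differ.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompt = "a quiet alpine lake at sunrise"
image = Image.open("panorama.png")  # path to a generated panorama

inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

score = torch.nn.functional.cosine_similarity(out.text_embeds, out.image_embeds).item() * 100
print(f"CLIP-T score: {score:.1f}")
```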
#### 360° immersive and explorable 3D worlds generated by HunyuanWorld 1.0:

<p align="left">
  <img src="assets/panorama1.gif">
</p>

<p align="left">
  <img src="assets/panorama2.gif">
</p>

<p align="left">
  <img src="assets/roaming_world.gif">
</p>
## Models Zoo

The open-source version of HunyuanWorld 1.0 is based on Flux, and the method can easily be adapted to other image generation models such as Hunyuan Image, Kontext, and Stable Diffusion. A download sketch follows the table below.
| Model | Description | Date | Size | Huggingface |
|--------------------------------|-----------------------------|------------|-------|-------------|
| HunyuanWorld-PanoDiT-Text | Text to Panorama Model | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoDiT-Text) |
| HunyuanWorld-PanoDiT-Image | Image to Panorama Model | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoDiT-Image) |
| HunyuanWorld-PanoInpaint-Scene | PanoInpaint Model for scene | 2025-07-26 | 478MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoInpaint-Scene) |
| HunyuanWorld-PanoInpaint-Sky | PanoInpaint Model for sky | 2025-07-26 | 120MB | [Download](https://huggingface.co/tencent/HunyuanWorld-1/tree/main/HunyuanWorld-PanoInpaint-Sky) |
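If you want to fetch one of these sub-models ahead of time rather than at first inference, `huggingface_hub` can download just that folder from the `tencent/HunyuanWorld-1` repository. This is a convenience sketch; the demo scripts may also resolve the weights on their own.

```python
# A convenience sketch: download only the text-to-panorama weights from the
# tencent/HunyuanWorld-1 repository. The demo scripts may also fetch weights themselves.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="tencent/HunyuanWorld-1",
    allow_patterns=["HunyuanWorld-PanoDiT-Text/*"],  # limit the download to one sub-model
)
print("Weights stored under:", local_dir)
```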
## Get Started with HunyuanWorld 1.0

Follow the steps below to set up and run HunyuanWorld 1.0.

### Environment Construction

We test our model with Python 3.10 and PyTorch 2.5.0+cu124. A quick verification sketch follows the setup commands below.
```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0.git
cd HunyuanWorld-1.0
conda env create -f docker/HunyuanWorld.yaml
conda activate HunyuanWorld  # env name assumed from docker/HunyuanWorld.yaml; adjust if it differs

# Install Real-ESRGAN
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr-fixed
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop

# Install ZIM Anything & download the checkpoint from the ZIM project page
cd ..
git clone https://github.com/naver-ai/ZIM.git
cd ZIM; pip install -e .
mkdir zim_vit_l_2092
cd zim_vit_l_2092
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx

# To export meshes in the Draco format, install Draco first
cd ../..
git clone https://github.com/google/draco.git
cd draco
mkdir build
cd build
cmake ..
make
sudo make install

# Log in to your Hugging Face account
cd ../..
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
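After the environment is built, a quick sanity check (a sketch, not part of the official repo) can confirm that the tested PyTorch build and a CUDA device are actually visible before running the demos.

```python
# Quick sanity check after setup (not part of the official repo):
# confirm the tested PyTorch build and a visible CUDA device.
import torch

print("torch:", torch.__version__)           # the authors test with 2.5.0+cu124
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```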
### Code Usage

For Image-to-World generation, you can use the following commands:

```bash
# First, generate a panorama image from an input image.
python3 demo_panogen.py --prompt "" --image_path examples/case2/input.png --output_path test_results/case2
# Second, use this panorama image to create a world scene with HunyuanWorld 1.0.
# You can indicate the foreground object labels you want to layer out with the
# parameters --labels_fg1 and --labels_fg2,
# e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case2/panorama.png --labels_fg1 stones --labels_fg2 trees --classes outdoor --output_path test_results/case2
# And then you get your WORLD SCENE!!
```
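To take a quick look at the exported geometry outside the viewer, a mesh library such as `trimesh` works. The exact filenames written under `test_results/case2` depend on the repo's export settings, so the path below is a hypothetical placeholder.

```python
# A hedged sketch for inspecting an exported scene layer with trimesh.
# The filename is a hypothetical placeholder; check test_results/case2 for the actual outputs.
import trimesh

mesh = trimesh.load("test_results/case2/scene_layer0.ply")
print("vertices:", len(mesh.vertices), "faces:", len(mesh.faces))
```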
For Text-to-World generation, you can use the following commands:

```bash
# First, generate a panorama image from a prompt.
python3 demo_panogen.py --prompt "At the moment of glacier collapse, giant ice walls collapse and create waves, with no wildlife, captured in a disaster documentary" --output_path test_results/case7
# Second, use this panorama image to create a world scene with HunyuanWorld 1.0.
# You can indicate the foreground object labels you want to layer out with the
# parameters --labels_fg1 and --labels_fg2,
# e.g. --labels_fg1 sculptures flowers --labels_fg2 tree mountains
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/case7/panorama.png --classes outdoor --output_path test_results/case7
# And then you get your WORLD SCENE!!
```
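Before sending a panorama on to scene generation, it can help to sanity-check the intermediate image. Equirectangular panoramas are normally twice as wide as they are tall, so a quick check like the sketch below (using the output path from the commands above) catches truncated runs early.

```python
# Sanity-check the intermediate panorama before scene generation.
# Equirectangular panoramas are normally twice as wide as they are tall.
from PIL import Image

pano = Image.open("test_results/case7/panorama.png")
w, h = pano.size
print(f"{w}x{h}, aspect ratio {w / h:.2f} (expected ~2.0 for an equirectangular panorama)")
```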
### Quick Start

We provide more examples in ```examples```; for a quick start, simply run:

```bash
bash scripts/test.sh
```
### 3D World Viewer

We provide a ModelViewer tool for quick visualization of your generated 3D worlds in a web browser.

Just open ```modelviewer.html``` in your browser, upload the generated 3D scene files, and enjoy the real-time viewing experience. If your browser blocks loading local files, serve the directory over HTTP as in the sketch below.
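A minimal way to do that, assuming you run it from the repository root, is Python's built-in HTTP server:

```python
# Serve the repository root so modelviewer.html can fetch local mesh files over HTTP.
# Run from the repo root, then open http://localhost:8000/modelviewer.html
from http.server import HTTPServer, SimpleHTTPRequestHandler

HTTPServer(("0.0.0.0", 8000), SimpleHTTPRequestHandler).serve_forever()
```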
<p align="left">
  <img src="assets/quick_look.gif">
</p>

Due to hardware limitations, certain scenes may fail to load.
## Open-Source Plan

- [x] Inference Code
- [x] Model Checkpoints
- [x] Technical Report
- [ ] TensorRT Version
- [ ] RGBD Video Diffusion
## BibTeX

```bibtex
@misc{hunyuanworld2025tencent,
    title={HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels},
    author={Tencent Hunyuan3D Team},
    year={2025},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
## Acknowledgements

We would like to thank the contributors to the [Stable Diffusion](https://github.com/Stability-AI/stablediffusion), [FLUX](https://github.com/black-forest-labs/flux), [diffusers](https://github.com/huggingface/diffusers), [HuggingFace](https://huggingface.co), [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN), [ZIM](https://github.com/naver-ai/ZIM), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [MoGe](https://github.com/microsoft/moge), [Worldsheet](https://worldsheet.github.io/), and [WorldGen](https://github.com/ZiYang-xie/WorldGen) repositories for their open research.