# Amphion Visualization Recipe

## Quick Start

We provide a **[beginner recipe](SingVisio/)** to demonstrate how to implement interactive visualization for classic audio, music, and speech generative models. It is also an official implementation of the paper "SingVisio: Visual Analytics of the Diffusion Model for Singing Voice Conversion", which is available on [arXiv](https://arxiv.org/abs/2402.12660) and in [Computers & Graphics](https://www.sciencedirect.com/science/article/pii/S0097849324001936). You can try **SingVisio** [here](https://openxlab.org.cn/apps/detail/Amphion/SingVisio).

## Supported Models

As a unique feature of Amphion, visualization provides interactive visual analysis of some classic models for educational purposes, helping newcomers understand their inner workings.

Amphion currently provides visualization tools for the following models:

- **SVC**:
  - **[MultipleContentsSVC](../svc/MultipleContentsSVC)**: A diffusion-based model for singing voice conversion.
- **TTS**:
  - **[FastSpeech 2](../tts/FastSpeech2/)** (👨‍💻 developing): A typical transformer-based TTS model.
  - **[VITS](../tts/VITS/)** (👨‍💻 developing): A typical flow-based end-to-end TTS model.