---
title: SongFormer
emoji: 🎵
colorFrom: blue
colorTo: indigo
sdk: gradio
python_version: "3.10"
app_file: app.py
tags:
- music-structure-annotation
- transformer
short_description: State-of-the-art music analysis with multi-scale datasets
fullWidth: true
---
# SONGFORMER: SCALING MUSIC STRUCTURE ANALYSIS WITH HETEROGENEOUS SUPERVISION


[arXiv](https://arxiv.org/abs/2510.02797)
[GitHub](https://github.com/ASLP-lab/SongFormer)
[Hugging Face Space](https://huggingface.co/spaces/ASLP-lab/SongFormer)
[Hugging Face Model](https://huggingface.co/ASLP-lab/SongFormer)
[SongFormDB](https://huggingface.co/datasets/ASLP-lab/SongFormDB)
[SongFormBench](https://huggingface.co/datasets/ASLP-lab/SongFormBench)
[Discord](https://discord.gg/p5uBryC4Zs)
[ASLP Lab](http://www.npu-aslp.org/)

Chunbo Hao*, Ruibin Yuan*, Jixun Yao, Qixin Deng, Xinyi Bai, Wei Xue, Lei Xie†

----
**For more information, please visit our [GitHub repository](https://github.com/ASLP-lab/SongFormer).**

SongFormer is a music structure analysis framework that leverages multi-resolution self-supervised representations and heterogeneous supervision. It is accompanied by SongFormDB, a large-scale multilingual dataset, and SongFormBench, a high-quality benchmark, to foster fair and reproducible research.

## Citation
If our work and codebase are useful to you, please cite:
````
@misc{hao2025songformer,
title = {SongFormer: Scaling Music Structure Analysis with Heterogeneous Supervision},
author = {Chunbo Hao and Ruibin Yuan and Jixun Yao and Qixin Deng and Xinyi Bai and Wei Xue and Lei Xie},
year = {2025},
eprint = {2510.02797},
archivePrefix = {arXiv},
primaryClass = {eess.AS},
url = {https://arxiv.org/abs/2510.02797}
}
````
## License
Our code is released under the CC-BY-4.0 License.