ASLP-lab committed
Commit a8a8401 · verified · 1 Parent(s): 12d8123

Update README.md

Files changed (1)
  1. README.md +4 -136
README.md CHANGED
@@ -14,7 +14,7 @@ fullWidth: true
14
  ---
15
 
16
  <p align="center">
17
- <img src="figs/logo.png" width="50%" />
18
  </p>
19
 
20
 
@@ -33,135 +33,14 @@ fullWidth: true
33
 
34
  Chunbo Hao<sup>&ast;</sup>, Ruibin Yuan<sup>&ast;</sup>, Jixun Yao, Qixin Deng, Xinyi Bai, Wei Xue, Lei Xie<sup>&dagger;</sup>
35
 
36
-
37
  ----
38
 
 
39
 
40
  SongFormer is a music structure analysis framework that leverages multi-resolution self-supervised representations and heterogeneous supervision, accompanied by the large-scale multilingual dataset SongFormDB and the high-quality benchmark SongFormBench to foster fair and reproducible research.
41
 
42
- ![](figs/songformer.png)
43
-
44
- ## News and Updates
45
-
46
- ## 📋 To-Do List
47
-
48
- - [x] Complete and push inference code to GitHub
49
- - [x] Upload model checkpoint(s) to Hugging Face Hub
50
- - [ ] Upload the paper to arXiv
51
- - [x] Fix readme
52
- - [ ] Deploy an out-of-the-box inference version on Hugging Face (via Inference API or Spaces)
53
- - [ ] Publish the package to PyPI for easy installation via `pip`
54
- - [ ] Open-source evaluation code
55
- - [ ] Open-source training code
56
-
57
- ## Installation
58
-
59
- ### Setting up Python Environment
60
-
61
- ```bash
62
- git clone https://github.com/ASLP-lab/SongFormer.git
63
-
64
- # Get MuQ and MusicFM source code
65
- git submodule update --init --recursive
66
-
67
- conda create -n songformer python=3.10 -y
68
- conda activate songformer
69
- ```
70
-
71
- For users in mainland China, you may need to set up a pip mirror source:
72
-
73
- ```bash
74
- pip config set global.index-url https://pypi.mirrors.ustc.edu.cn/simple
75
- ```
76
-
77
- Install dependencies:
78
-
79
- ```bash
80
- pip install -r requirements.txt
81
- ```
82
-
83
- We tested this on Ubuntu 22.04.1 LTS and it works as expected. If installation fails, you may need to remove the version constraints in `requirements.txt`.
84
-
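- If you do need to relax them, one possible approach (a sketch, not an official step; `requirements.loose.txt` is just an illustrative name) is to strip the version specifiers before installing:
-
- ```bash
- # Strip version specifiers such as ==, >=, <= from each line (illustrative; review the result first)
- sed -E 's/[=<>!~].*$//' requirements.txt > requirements.loose.txt
- pip install -r requirements.loose.txt
- ```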
85
- ### Download Pre-trained Models
86
-
87
- ```bash
88
- cd src/SongFormer
89
- # For users in mainland China: follow the instructions in this script to download via hf-mirror.com
90
- python utils/fetch_pretrained.py
91
- ```
92
-
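- If you would rather not edit the script, setting the standard `huggingface_hub` mirror endpoint may also work (an assumption; it depends on how `fetch_pretrained.py` performs the download):
-
- ```bash
- # Assumption: fetch_pretrained.py downloads via huggingface_hub, which honors HF_ENDPOINT
- export HF_ENDPOINT=https://hf-mirror.com
- python utils/fetch_pretrained.py
- ```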
93
- After downloading, verify that the md5sum values listed in `src/SongFormer/ckpts/MusicFM/md5sum.txt` match those of the downloaded files:
94
-
95
- ```bash
96
- md5sum ckpts/MusicFM/msd_stats.json
97
- md5sum ckpts/MusicFM/pretrained_msd.pt
98
- md5sum ckpts/SongFormer.safetensors
99
- # md5sum ckpts/SongFormer.pt
100
- ```
101
-
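- If `md5sum.txt` uses the standard `<hash>  <filename>` layout with paths that resolve from its own directory (an assumption about its format), you can let `md5sum` do the comparison in one step:
-
- ```bash
- # Verify every checksum listed in md5sum.txt (assumes the standard md5sum check format)
- (cd ckpts/MusicFM && md5sum -c md5sum.txt)
- ```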
102
- ## Inference
105
-
106
- ### 1. One-Click Inference with HuggingFace Space (coming soon)
107
-
108
- Available at: [https://huggingface.co/spaces/ASLP-lab/SongFormer](https://huggingface.co/spaces/ASLP-lab/SongFormer)
109
-
110
- ### 2. Gradio App
111
 
112
- First, cd to the project root directory and activate the environment:
113
-
114
- ```bash
115
- conda activate songformer
116
- ```
117
-
118
- You can modify the server port and listening address in the last line of `app.py` according to your preference.
119
-
120
- > If you're using an HTTP proxy, please ensure you include:
121
- >
122
- > ```bash
123
- > export no_proxy="localhost, 127.0.0.1, ::1"
124
- > export NO_PROXY="localhost, 127.0.0.1, ::1"
125
- > ```
126
- >
127
- > Otherwise, Gradio may incorrectly conclude that the service has not started and exit immediately.
128
-
129
- The first time you run `app.py`, it connects to Hugging Face to download the MuQ-related weights. We recommend creating an empty folder in a convenient location and pointing `HF_HOME` at it with `export HF_HOME=XXX`, so the cache is stored there for easy cleanup and transfer.
130
-
131
- Users in mainland China may also need `export HF_ENDPOINT=https://hf-mirror.com`. For details, refer to https://hf-mirror.com/
132
-
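- For example, in a fresh shell you might set the following before launching (the cache path is only an illustration; the proxy and mirror lines apply only in the situations described above):
-
- ```bash
- export HF_HOME=/path/to/hf_cache             # empty folder you created for the Hugging Face cache
- export HF_ENDPOINT=https://hf-mirror.com     # mainland China only
- export no_proxy="localhost, 127.0.0.1, ::1"  # only if you are behind an HTTP proxy
- export NO_PROXY="localhost, 127.0.0.1, ::1"
- ```
-
- Then launch the app: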
133
- ```bash
134
- python app.py
135
- ```
136
-
137
- ### 3. Python Code
138
-
139
- See `src/SongFormer/infer/infer.py`; the corresponding execution script is `src/SongFormer/infer.sh`, a ready-to-use, single-machine, multi-process annotation script.
140
-
141
- Below are the configurable parameters of the `src/SongFormer/infer.sh` script:
142
-
143
- ```bash
144
- -i # Input SCP folder path, each line containing the absolute path to one audio file
145
- -o # Output directory for annotation results
146
- --model # Annotation model; the default is 'SongFormer', change if using a fine-tuned model
147
- --checkpoint # Path to the model checkpoint file
148
- --config_pat # Path to the configuration file
149
- -gn # Total number of GPUs to use — should match the number specified in CUDA_VISIBLE_DEVICES
150
- -tn # Number of processes to run per GPU
151
- ```
152
-
153
- You can control which GPUs are used by setting the `CUDA_VISIBLE_DEVICES` environment variable.
154
-
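- As a concrete sketch (assuming the flags above are passed straight to `infer/infer.py`; the input and output paths are placeholders), a two-GPU run could look like:
-
- ```bash
- # Assumption: each file in the input SCP folder lists one absolute audio path per line
- CUDA_VISIBLE_DEVICES=0,1 python infer/infer.py \
-     -i /path/to/scp_folder \
-     -o /path/to/output_dir \
-     --model SongFormer \
-     --checkpoint ckpts/SongFormer.safetensors \
-     -gn 2 \
-     -tn 1
- ```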
155
- ### 4. CLI Inference
156
-
157
- Coming soon
158
-
159
- ### 5. Pitfall
160
-
161
- - You may need to modify line 121 in `src/third_party/musicfm/model/musicfm_25hz.py` to:
162
- `S = torch.load(model_path, weights_only=False)["state_dict"]`
163
-
164
- ## Training
165
 
166
  ## Citation
167
 
@@ -180,15 +59,4 @@ If our work and codebase are useful for you, please cite as:
180
  ````
181
  ## License
182
 
183
- Our code is released under the CC-BY-4.0 License.
184
-
185
- ## Contact Us
186
-
187
-
188
- <p align="center">
189
- <a href="http://www.nwpu-aslp.org/">
190
- <img src="figs/aslp.png" width="400"/>
191
- </a>
192
- </p>
193
-
194
-
 
14
  ---
15
 
16
  <p align="center">
17
+ <img src="https://github.com/ASLP-lab/SongFormer/blob/main/figs/logo.png?raw=true" width="50%" />
18
  </p>
19
 
20
 
 
33
 
34
  Chunbo Hao<sup>&ast;</sup>, Ruibin Yuan<sup>&ast;</sup>, Jixun Yao, Qixin Deng, Xinyi Bai, Wei Xue, Lei Xie<sup>&dagger;</sup>
35
 
 
36
  ----
37
 
38
+ **For more information, please visit our [GitHub repository](https://github.com/ASLP-lab/SongFormer).**
39
 
40
  SongFormer is a music structure analysis framework that leverages multi-resolution self-supervised representations and heterogeneous supervision, accompanied by the large-scale multilingual dataset SongFormDB and the high-quality benchmark SongFormBench to foster fair and reproducible research.
41
 
42
+ ![](https://github.com/ASLP-lab/SongFormer/blob/main/figs/songformer.png?raw=true)
43

44
 
45
  ## Citation
46
 
 
59
  ````
60
  ## License
61
 
62
+ Our code is released under the CC-BY-4.0 License.