Spaces: Running on Zero
Update README.md

README.md CHANGED
@@ -14,7 +14,7 @@ fullWidth: true
 ---
 
 <p align="center">
-  <img src="figs/logo.png" width="50%" />
 </p>
 
 
@@ -33,135 +33,14 @@ fullWidth: true
 
 Chunbo Hao<sup>*</sup>, Ruibin Yuan<sup>*</sup>, Jixun Yao, Qixin Deng, Xinyi Bai, Wei Xue, Lei Xie<sup>†</sup>
 
-
 ----
 
 SongFormer is a music structure analysis framework that leverages multi-resolution self-supervised representations and heterogeneous supervision, accompanied by the large-scale multilingual dataset SongFormDB and the high-quality benchmark SongFormBench to foster fair and reproducible research.
 
-
-## News and Updates
-
-## 📋 To-Do List
-
-- [x] Complete and push inference code to GitHub
-- [x] Upload model checkpoint(s) to Hugging Face Hub
-- [ ] Upload the paper to arXiv
-- [x] Fix readme
-- [ ] Deploy an out-of-the-box inference version on Hugging Face (via Inference API or Spaces)
-- [ ] Publish the package to PyPI for easy installation via `pip`
-- [ ] Open-source evaluation code
-- [ ] Open-source training code
-
-## Installation
-
-### Setting up Python Environment
-
-```bash
-git clone https://github.com/ASLP-lab/SongFormer.git
-
-# Get MuQ and MusicFM source code
-git submodule update --init --recursive
-
-conda create -n songformer python=3.10 -y
-conda activate songformer
-```
-
-For users in mainland China, you may need to set up a pip mirror source:
-
-```bash
-pip config set global.index-url https://pypi.mirrors.ustc.edu.cn/simple
-```
-
-Install dependencies:
-
-```bash
-pip install -r requirements.txt
-```
-
-We tested this on Ubuntu 22.04.1 LTS and it works normally. If installation fails, you may need to remove the version constraints in `requirements.txt`.
-
-### Download Pre-trained Models
-
-```bash
-cd src/SongFormer
-# For users in mainland China, follow the instructions in the py file to download via hf-mirror.com
-python utils/fetch_pretrained.py
-```
-
-After downloading, verify that the md5sum values in `src/SongFormer/ckpts/MusicFM/md5sum.txt` match the downloaded files:
-
-```bash
-md5sum ckpts/MusicFM/msd_stats.json
-md5sum ckpts/MusicFM/pretrained_msd.pt
-md5sum ckpts/SongFormer.safetensors
-# md5sum ckpts/SongFormer.pt
-```
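The bare `md5sum` commands in the removed instructions print digests you must compare by eye. A minimal sketch of automating the comparison with `md5sum -c`, which reads "digest path" lines and re-checks each listed file; a throwaway file stands in for a checkpoint here, and this assumes the repository's `md5sum.txt` uses the standard manifest format (for the real checkpoints you would run `md5sum -c` from `src/SongFormer`):

```shell
# Create a stand-in "checkpoint" and a manifest for it
workdir=$(mktemp -d)
printf 'dummy checkpoint' > "$workdir/SongFormer.safetensors"
(cd "$workdir" && md5sum SongFormer.safetensors > md5sum.txt)

# md5sum -c re-reads the manifest and prints OK/FAILED per file;
# its exit status tells you whether every file matched
(cd "$workdir" && md5sum -c md5sum.txt) && status=ok || status=mismatch
```

The exit status makes this easy to use as a gate in a setup script.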
-
-## Inference
-
-### 1. One-Click Inference with HuggingFace Space (coming soon)
-
-Available at: [https://huggingface.co/spaces/ASLP-lab/SongFormer](https://huggingface.co/spaces/ASLP-lab/SongFormer)
-
-### 2. Gradio App
-
-First, cd to the project root directory and activate the environment:
-
-```bash
-conda activate songformer
-```
-
-You can change the server port and listening address in the last line of `app.py` as you prefer.
-
-> If you're using an HTTP proxy, please ensure you include:
->
-> ```bash
-> export no_proxy="localhost, 127.0.0.1, ::1"
-> export NO_PROXY="localhost, 127.0.0.1, ::1"
-> ```
->
-> Otherwise, Gradio may incorrectly assume the service hasn't started and exit immediately.
-
-When first run, `app.py` connects to Hugging Face to download the MuQ-related weights. We recommend creating an empty folder in a suitable location and pointing `export HF_HOME=XXX` at it, so the cache is stored there for easy cleanup and transfer.
-
-For users in mainland China, you may also need `export HF_ENDPOINT=https://hf-mirror.com`. For details, refer to https://hf-mirror.com/
-
-```bash
-python app.py
-```
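Gathered together, the environment setup described in the removed Gradio notes might look like this before launching the app; the `HF_HOME` location is just an example path, and the mirror endpoint is only needed for users in mainland China:

```shell
# Keep Gradio's local health check off the HTTP proxy
export no_proxy="localhost, 127.0.0.1, ::1"
export NO_PROXY="localhost, 127.0.0.1, ::1"

# Store the Hugging Face cache in a dedicated, easy-to-clean folder
# (example location; any empty folder works)
mkdir -p "$HOME/hf_cache"
export HF_HOME="$HOME/hf_cache"

# Mainland China users may also need the mirror endpoint
export HF_ENDPOINT="https://hf-mirror.com"

# python app.py
```

Putting these in a small `env.sh` you `source` before each run keeps the settings reproducible.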
-### 3. Python Code
-
-See `src/SongFormer/infer/infer.py`; the corresponding execution script is `src/SongFormer/infer.sh`. This is a ready-to-use, single-machine, multi-process annotation script.
-
-Below are the configurable parameters of the `src/SongFormer/infer.sh` script. You can set `CUDA_VISIBLE_DEVICES` to control which GPUs are used:
-
-```bash
--i              # Input SCP file path, each line containing the absolute path to one audio file
--o              # Output directory for annotation results
---model         # Annotation model; the default is 'SongFormer', change it if using a fine-tuned model
---checkpoint    # Path to the model checkpoint file
---config_pat    # Path to the configuration file
--gn             # Total number of GPUs to use; should match the number specified in CUDA_VISIBLE_DEVICES
--tn             # Number of processes to run per GPU
-```
-
-### 4. CLI Inference
-
-Coming soon
-
-### 5. Pitfall
-
-- You may need to modify line 121 in `src/third_party/musicfm/model/musicfm_25hz.py` to:
-  `S = torch.load(model_path, weights_only=False)["state_dict"]`
-
-## Training
 
 ## Citation
 
@@ -180,15 +59,4 @@ If our work and codebase is useful for you, please cite as:
 ````
 ## License
 
-Our code is released under the CC-BY-4.0 License.
-
-## Contact Us
-
-<p align="center">
-    <a href="http://www.nwpu-aslp.org/">
-        <img src="figs/aslp.png" width="400"/>
-    </a>
-</p>
@@ -14,7 +14,7 @@ fullWidth: true
 ---
 
 <p align="center">
+  <img src="https://github.com/ASLP-lab/SongFormer/blob/main/figs/logo.png?raw=true" width="50%" />
 </p>
 
 
@@ -33,135 +33,14 @@ fullWidth: true
 
 Chunbo Hao<sup>*</sup>, Ruibin Yuan<sup>*</sup>, Jixun Yao, Qixin Deng, Xinyi Bai, Wei Xue, Lei Xie<sup>†</sup>
 
 ----
 
+**For more information, please visit our [github repository](https://github.com/ASLP-lab/SongFormer)**
 
 SongFormer is a music structure analysis framework that leverages multi-resolution self-supervised representations and heterogeneous supervision, accompanied by the large-scale multilingual dataset SongFormDB and the high-quality benchmark SongFormBench to foster fair and reproducible research.
 
+
 
 ## Citation
 
@@ -180,15 +59,4 @@ If our work and codebase is useful for you, please cite as:
 ````
 ## License
 
+Our code is released under the CC-BY-4.0 License.
