---
title: vampnet
emoji: 🥗
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.28.0
python_version: 3.11
app_file: app.py
pinned: false
license: cc-by-nc-4.0
---

# VampNet

# Table of contents

- [setting up](#setting-up)
- [programmatic usage](#programmatic-usage)
- [launching the web app](#launching-the-web-app)
- [training / fine-tuning](#training--fine-tuning)
  - [training a model](#training-a-model)
  - [debugging training](#debugging-training)
  - [fine-tuning](#fine-tuning)
- [resuming from a checkpoint](#resuming-a-training--fine-tuning-job-from-a-checkpoint)
- [exporting your model](#exporting-your-model)
- [unloop](#unloop)
- [token telephone](#token-telephone)
- [a note on argbind](#a-note-on-argbind)
- [take a look at the pretrained models](#take-a-look-at-the-pretrained-models)
- [licensing for pretrained models](#licensing-for-pretrained-models)

## setting up

python 3.9-3.11 works well. for example, using conda:
```bash
conda create -n vampnet python=3.9
conda activate vampnet
```

install VampNet

```bash
git clone https://github.com/hugofloresgarcia/vampnet.git
pip install -e ./vampnet
```

## programmatic usage

quick start!
```python
import random
import vampnet
import audiotools as at

# load the default vampnet model
interface = vampnet.interface.Interface.default()

# list available finetuned models
finetuned_model_choices = interface.available_models()
print(f"available finetuned models: {finetuned_model_choices}")

# pick a random finetuned model
model_choice = random.choice(finetuned_model_choices)
print(f"choosing model: {model_choice}")

# load a finetuned model
interface.load_finetuned(model_choice)

# load an example audio file
signal = at.AudioSignal("assets/example.wav")

# get the tokens for the audio
codes = interface.encode(signal)

# build a mask for the audio
mask = interface.build_mask(
    codes, signal,
    periodic_prompt=7, 
    upper_codebook_mask=3,
)

# generate the output tokens
output_tokens = interface.vamp(
    codes, mask, return_mask=False,
    temperature=1.0, 
    typical_filtering=True, 
)

# convert them to a signal
output_signal = interface.decode(output_tokens)

# save the output signal
output_signal.write("scratch/output.wav")
```
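Roughly speaking, `periodic_prompt=7` keeps about every 7th timestep of the original tokens as a prompt, and `upper_codebook_mask=3` masks out everything above the first 3 codebooks, so a sparser prompt leaves vampnet more freedom. Here's a quick variation on the example above, reusing the same `interface`, `codes`, and `signal` (the parameter values are just illustrative):

```python
# a sparser periodic prompt (illustrative value) keeps fewer of the
# original tokens, so the output drifts further from the input
loose_mask = interface.build_mask(
    codes, signal,
    periodic_prompt=13,
    upper_codebook_mask=3,
)
variation_tokens = interface.vamp(
    codes, loose_mask, return_mask=False,
    temperature=1.0,
    typical_filtering=True,
)
interface.decode(variation_tokens).write("scratch/variation.wav")
```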


# Launching the Web app
You can launch a gradio UI to play with vampnet. 

```bash
python app.py 
```

# Training / Fine-tuning 

## Training a model

To train a model, run the following script: 

```bash
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```

for multi-gpu training, use torchrun:

```bash
torchrun --nproc_per_node gpu scripts/exp/train.py --args.load conf/vampnet.yml --save_path path/to/ckpt
```

You can edit `conf/vampnet.yml` to change the dataset paths or any training hyperparameters. 
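For reference, the dataset paths live in that YAML as argbind-style keys. A rough sketch of what the entries look like (the key names below are illustrative, not copied from the repo; check `conf/vampnet.yml` for the real ones):

```yaml
# illustrative only -- check conf/vampnet.yml for the actual key names
train/AudioLoader.sources:
  - /path/to/train/audio
val/AudioLoader.sources:
  - /path/to/val/audio
```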

For coarse2fine models, you can use `conf/c2f.yml` as a starting configuration. 

See `python scripts/exp/train.py -h` for a list of options.

## Debugging training

To debug training, it's easiest to run with a single GPU and 0 dataloader workers:

```bash
CUDA_VISIBLE_DEVICES=0 python -m pdb scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints --num_workers 0
```

## Fine-tuning

To fine-tune a model, use the script `scripts/exp/fine_tune.py`.

for an audio folder:
```bash
python scripts/exp/fine_tune.py /path/to/audio/folder <fine_tune_name>
```

for multiple files:
```bash
python scripts/exp/fine_tune.py "/path/to/audio1.mp3 /path/to/audio2/ /path/to/audio3.wav" <fine_tune_name>
```

This creates the configuration files for a fine-tuning job. The save paths will be set to `runs/<fine_tune_name>/coarse` and `runs/<fine_tune_name>/c2f`.

launch the coarse job: 
```bash
python scripts/exp/train.py --args.load conf/generated/<fine_tune_name>/coarse.yml 
```

this will save the coarse model to `runs/<fine_tune_name>/coarse/ckpt/best/`.

launch the c2f job: 
```bash
python scripts/exp/train.py --args.load conf/generated/<fine_tune_name>/c2f.yml 
```

# Resuming a training / fine-tuning job from a checkpoint

To resume from a checkpoint, pass the `--resume` flag and use `--save_path` to point to the checkpoint you want to resume from:
```bash
python scripts/exp/train.py --args.load conf/generated/steve/coarse.yml --save_path runs/steve/coarse --resume
```

# Exporting your model

Once your model has been fine-tuned, you can export it to HuggingFace; you'll need to do this in order to use your model in `app.py`.

**NOTE**: you will need a [huggingface account](https://huggingface.co/) to export.

Now, log in to huggingface using the command line:
```bash
huggingface-cli login
```

Next, replace the contents of the file `./DEFAULT_HF_MODEL_REPO` with `<HUGGINGFACE_USERNAME>/vampnet` (the default is `hugggof/vampnet`). A model repo will be automatically created for you by `export.py`.

for example, if your username is `hugggof`, you would run the following command:
```bash
echo 'hugggof/vampnet' > ./DEFAULT_HF_MODEL_REPO
```

Now, run the following command to export your model (replace `<your_finetuned_model_name>` with the name of your model):

```bash
python scripts/exp/export.py --name <your_finetuned_model_name> --model latest
```

Once that's done, your model should appear on the list of available models in the gradio interface.
Simply run `python app.py` and select your model from the dropdown list.


# Unloop

Make sure you have [Max](https://cycling74.com/products/max) installed on your laptop!

**NOTE**: To run unloop with a GPU-powered server, you will need to install the vampnet repo on both your local machine and your GPU server.

## start a vampnet gradio server

First, **on your GPU server**, run the gradio server:
```bash
python app.py --args.load conf/interface.yml --Interface.device cuda
```
This will run a vampnet gradio API on your GPU server. Copy the address; it will be something like `http://127.0.0.1:7860/`. 

**IMPORTANT**: Make sure that this gradio port (`7860` by default) is forwarded to your local machine, where you have Max installed. 
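If you're connecting over SSH, one common way to do this is an SSH tunnel (the host and username below are placeholders):

```bash
# forward the server's gradio port to your local machine
# (user@your-gpu-server is a placeholder for your own server)
ssh -N -L 7860:localhost:7860 user@your-gpu-server
```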

## start the unloop gradio client
Now, **on your local machine**, run the unloop gradio client.
```bash
cd unloop
pip install -r requirements.txt
python client.py --vampnet_url http://127.0.0.1:7860/ # replace with your gradio server address
```
This will start a gradio client that connects to the gradio server running on your GPU server.

## start the unloop Max patch
Now, open the unloop Max patch. It's located at `unloop/max/unloop.maxpat`.

In the tape controls, check the heartbeat (`<3`) to make sure the connection to the local gradio client is working. 

have fun!

# Token Telephone

Instructions forthcoming, but the sauce is in `token_telephone/tt.py`

## A note on argbind
This repository relies on [argbind](https://github.com/pseeth/argbind) to manage CLIs and config files. 
Config files are stored in the `conf/` folder. 
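For context, argbind turns a bound function's keyword arguments into CLI flags and YAML keys. A minimal sketch of the pattern (a toy function, not code from this repo):

```python
import argbind

# bound functions expose their keyword arguments as
# CLI flags like --train.lr and --train.batch_size
@argbind.bind()
def train(lr: float = 1e-4, batch_size: int = 8):
    print(f"training with lr={lr}, batch_size={batch_size}")

if __name__ == "__main__":
    args = argbind.parse_args()  # also handles --args.load <file>.yml
    with argbind.scope(args):
        train()
```

This is why the commands above use `--args.load conf/vampnet.yml`: the YAML keys map onto the bound functions' arguments.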

## Take a look at the pretrained models
All the pretrained models (trained by hugo) are stored here: https://huggingface.co/hugggof/vampnet 

## Licensing for pretrained models
The weights for the models are licensed [`CC BY-NC-SA 4.0`](https://creativecommons.org/licenses/by-nc-sa/4.0/). Likewise, any VampNet models fine-tuned on the pretrained models are also licensed [`CC BY-NC-SA 4.0`](https://creativecommons.org/licenses/by-nc-sa/4.0/).

Download the pretrained models from [this link](https://zenodo.org/record/8136629). Then, extract the models to the `models/` folder.