---
title: Magenta RT API
emoji: 🎡
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
app_file: main.py
app_port: 7860
hardware: l4x1
---

# 🎡 Magenta RT API

A production-ready REST API for music generation using Google's [Magenta RealTime](https://github.com/magenta/magenta-realtime) model. Generate high-quality music from text descriptions or audio style references.

## 🚀 Quick Start

### Interactive Documentation

Once the Space is running, visit:
- **Swagger UI**: `/docs` - Interactive API testing
- **ReDoc**: `/redoc` - Clean API documentation

### Example: Generate Music from Text

```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text" \
  -F "prompt=funky jazz groove with syncopated bass" \
  -F "duration=10" \
  --output generated.mp3
```

### Example: Style Transfer from Audio

```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/audio" \
  -F "audio_file=@reference.mp3" \
  -F "duration=15" \
  -F "prompt=add electronic elements" \
  --output styled.mp3
```

### Example: Blend Multiple Styles

```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/blend" \
  -F "text_prompts=funk,ambient,jazz" \
  -F "text_weights=2.0,1.5,1.0" \
  -F "duration=20" \
  --output blended.mp3
```

## 📋 Features

- 🎼 **Text-to-Music Generation** - Create music from natural language descriptions
- 🎧 **Audio Style Transfer** - Use existing audio as a style reference
- 🎛️ **Multi-Prompt Blending** - Combine multiple text and audio prompts with custom weights
- 🔌 **REST API** - Easy integration with any programming language
- ⚡ **GPU Accelerated** - Fast generation on NVIDIA A100 GPUs
- 📊 **Built-in Monitoring** - Prometheus metrics endpoint
- 🔒 **Security Features** - Optional API key authentication and rate limiting

## 🎯 API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/generate/text` | POST | Generate music from text prompt |
| `/generate/audio` | POST | Generate music from audio file reference |
| `/generate/blend` | POST | Blend multiple text/audio prompts |
| `/embed/text` | POST | Get MusicCoCa embedding for text |
| `/embed/audio` | POST | Get MusicCoCa embedding for audio |
| `/health` | GET | Health check endpoint |
| `/metrics` | GET | Prometheus metrics |

## 💻 Usage Examples

### Python

```python
import requests

def generate_music(prompt: str, duration: int = 10):
    url = "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text"
    
    response = requests.post(
        url,
        data={
            "prompt": prompt,
            "duration": duration
        }
    )
    
    if response.status_code == 200:
        with open("output.mp3", "wb") as f:
            f.write(response.content)
        print("✅ Music generated successfully!")
    else:
        print(f"❌ Error: {response.status_code}")

# Generate music
generate_music("epic orchestral cinematic soundtrack", duration=15)
```

### JavaScript

```javascript
async function generateMusic(prompt, duration = 10) {
  const formData = new FormData();
  formData.append('prompt', prompt);
  formData.append('duration', duration);
  
  const response = await fetch(
    'https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text',
    {
      method: 'POST',
      body: formData
    }
  );
  
  if (response.ok) {
    const blob = await response.blob();
    const url = URL.createObjectURL(blob);
    
    // Play audio
    const audio = new Audio(url);
    audio.play();
    
    // Or download
    const a = document.createElement('a');
    a.href = url;
    a.download = 'generated.mp3';
    a.click();
  }
}

// Usage
generateMusic('dreamy ambient soundscape');
```

### cURL with Audio File

```bash
# Generate music styled after an audio reference
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/audio" \
  -F "audio_file=@my_favorite_song.mp3" \
  -F "duration=20" \
  -F "prompt=more energetic and upbeat" \
  -H "X-API-Key: your-api-key-if-enabled" \
  --output new_version.mp3
```

## 🎨 Prompt Examples

### Text Prompts

Great prompts are descriptive and combine multiple elements:

- `"funky jazz with syncopated bass and smooth saxophone"`
- `"ambient electronic soundscape with ethereal pads"`
- `"upbeat dance music with driving four-on-the-floor beat"`
- `"classical piano composition in the style of Chopin"`
- `"heavy metal with distorted guitars and double bass drums"`
- `"lo-fi hip hop beats with vinyl crackle and jazz samples"`
- `"epic orchestral cinematic soundtrack with strings and brass"`
- `"tropical house with steel drums and marimba"`
- `"psychedelic rock with reverb-drenched guitars"`
- `"90s eurodance with synth leads and energetic vocals"`

### Blending Prompts

Combine different styles with custom weights:

```bash
# 60% funk, 30% ambient, 10% jazz
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/blend" \
  -F "text_prompts=funk groove,ambient soundscape,smooth jazz" \
  -F "text_weights=3.0,1.5,0.5" \
  -F "duration=15"
```
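
The percentages in the comment above follow from dividing each weight by the sum of all weights. A minimal sketch of that arithmetic, assuming the service treats weights as relative (the helper names `weight_shares` and `to_form_fields` are hypothetical and not part of the API):

```python
def weight_shares(weights):
    """Return each weight as a fraction of the total weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [w / total for w in weights]


def to_form_fields(prompts, weights):
    """Build the comma-separated form fields used in the curl example."""
    if len(prompts) != len(weights):
        raise ValueError("need exactly one weight per prompt")
    return {
        "text_prompts": ",".join(prompts),
        "text_weights": ",".join(str(w) for w in weights),
    }


shares = weight_shares([3.0, 1.5, 0.5])
print([f"{s:.0%}" for s in shares])  # ['60%', '30%', '10%']
```

Because only the ratios matter under this assumption, `1.0,0.5` and `2.0,1.0` would describe the same blend.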

## ⚙️ Configuration

### Environment Variables

Configure the API by setting environment variables in your Space settings:

```bash
# Authentication (optional)
ENABLE_AUTH=true
API_KEYS=key1,key2,key3

# Rate Limiting
RATE_LIMIT_PER_MINUTE=10
RATE_LIMIT_PER_HOUR=100

# Generation Limits
MAX_DURATION=120
MAX_FILE_SIZE_MB=50

# Monitoring
ENABLE_METRICS=true
LOG_LEVEL=INFO
```
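
For readers who want to see how such variables are typically consumed, here is a sketch of parsing them into a typed settings object. It is not the service's actual code: the `Settings` class, the `load_settings` helper, and the fallback values for `ENABLE_AUTH`, `ENABLE_METRICS`, and `LOG_LEVEL` are assumptions, while the numeric fallbacks mirror the defaults listed under Rate Limits & Quotas.

```python
import os
from dataclasses import dataclass


def as_bool(value):
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    enable_auth: bool
    api_keys: tuple
    rate_limit_per_minute: int
    rate_limit_per_hour: int
    max_duration: int
    max_file_size_mb: int
    enable_metrics: bool
    log_level: str


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # API_KEYS is a comma-separated list; blank entries are ignored.
    keys = tuple(k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip())
    return Settings(
        enable_auth=as_bool(env.get("ENABLE_AUTH", "false")),      # assumed default
        api_keys=keys,
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "10")),
        rate_limit_per_hour=int(env.get("RATE_LIMIT_PER_HOUR", "100")),
        max_duration=int(env.get("MAX_DURATION", "120")),
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "50")),
        enable_metrics=as_bool(env.get("ENABLE_METRICS", "true")),  # assumed default
        log_level=env.get("LOG_LEVEL", "INFO").upper(),             # assumed default
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check locally before changing the Space settings.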

### Using API Keys

If authentication is enabled, include your API key in requests:

```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text" \
  -H "X-API-Key: your-api-key" \
  -F "prompt=ambient music" \
  -F "duration=10"
```
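
The same request can be prepared in Python. The sketch below uses only the standard library and adds the `X-API-Key` header only when a key is available. It assumes the endpoint accepts URL-encoded form fields (as the `requests` example with `data=` suggests); `MAGENTA_RT_API_KEY` is a hypothetical environment variable name, and `build_headers` and `text_request` are illustrative helpers.

```python
import os
import urllib.parse
import urllib.request


def build_headers(api_key=None):
    """Return request headers, adding X-API-Key only when a key is supplied."""
    return {"X-API-Key": api_key} if api_key else {}


def text_request(base_url, prompt, duration, api_key=None):
    """Prepare (but do not send) a POST request to /generate/text."""
    body = urllib.parse.urlencode({"prompt": prompt, "duration": duration})
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate/text",
        data=body.encode("utf-8"),
        headers=build_headers(api_key),
        method="POST",
    )


# MAGENTA_RT_API_KEY is a hypothetical variable name used for this sketch.
req = text_request(
    "https://YOUR_USERNAME-magenta-rt-api.hf.space",
    "ambient music",
    10,
    api_key=os.environ.get("MAGENTA_RT_API_KEY"),
)
# To send it: urllib.request.urlopen(req, timeout=300)
```

Reading the key from the environment keeps it out of source files and shell history.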

## 📊 Performance

### Generation Speed (on A100 40GB)

| Audio Duration | Generation Time | Generation Time / Audio Duration |
|----------------|-----------------|----------------------------------|
| 2s | ~5s | 2.5x |
| 10s | ~15s | 1.5x |
| 30s | ~40s | 1.3x |
| 60s | ~75s | 1.25x |

**Note**: First request includes model loading time (~30-60 seconds)
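
To pick a client timeout for durations that are not in the table, one option is to interpolate between the measured rows. This is a rough sketch: the linear interpolation, the extrapolation beyond 60 seconds, and the safety factor are assumptions, and the numbers apply only to the hardware the table was measured on.

```python
# (audio_duration_s, approx_generation_s) pairs copied from the table above.
MEASURED = [(2, 5), (10, 15), (30, 40), (60, 75)]


def estimate_generation_seconds(duration_seconds):
    """Linearly interpolate generation time between the measured rows."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds <= MEASURED[0][0]:
        # Below the shortest measurement, assume the same fixed cost.
        return float(MEASURED[0][1])
    for (d0, t0), (d1, t1) in zip(MEASURED, MEASURED[1:]):
        if duration_seconds <= d1:
            return t0 + (t1 - t0) * (duration_seconds - d0) / (d1 - d0)
    # Beyond the longest measurement, extend the slope of the last segment.
    (d0, t0), (d1, t1) = MEASURED[-2], MEASURED[-1]
    return t1 + (t1 - t0) * (duration_seconds - d1) / (d1 - d0)


def suggested_timeout(duration_seconds, cold_start_seconds=60.0, safety_factor=2.0):
    """Client timeout: worst-case model loading plus a padded generation estimate."""
    return cold_start_seconds + safety_factor * estimate_generation_seconds(duration_seconds)


print(estimate_generation_seconds(20))  # 27.5
print(suggested_timeout(20))            # 115.0
```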

### Resource Usage

- **GPU Memory**: 12-15GB during inference
- **Model Size**: ~8GB on disk
- **Recommended Hardware**: A100 40GB or larger

## 🔧 Troubleshooting

### Space is Starting

First startup can take roughly **12-18 minutes** in total to:
1. Build Docker container (~5 min)
2. Download Magenta RT models (~5-10 min)
3. Initialize models (~2-3 min)

Check the Space logs to monitor progress.
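
A script can also wait for the Space by polling `/health`. The sketch below assumes the endpoint answers with HTTP 200 once the models are loaded (the response body is not inspected); `fetch_status` and `wait_until_healthy` are hypothetical helpers, and the default of 60 attempts at 15-second intervals simply covers about 15 minutes.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None      # DNS failure, refused connection, timeout, ...


def wait_until_healthy(base_url, attempts=60, delay=15.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll /health until it returns 200 or the attempts run out."""
    health_url = base_url.rstrip("/") + "/health"
    for attempt in range(attempts):
        if fetch(health_url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A call such as `wait_until_healthy("https://YOUR_USERNAME-magenta-rt-api.hf.space")` returns `True` as soon as the health check passes and `False` if it never does.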

### Generation is Slow

- **First request**: Includes model loading (~30-60s)
- **Subsequent requests**: roughly 1.25x-2.5x realtime, depending on the requested duration (shorter clips have the highest ratio)
- **Solution**: Keep the Space active or use Persistent Storage

### Out of Memory Error

- Upgrade to **A100 80GB** hardware
- Reduce `MAX_DURATION` limit
- Process requests sequentially

### Connection Timeout

Increase your client timeout for long generations:

```python
response = requests.post(url, data=data, timeout=300)  # 5 minutes
```

## 🛡️ Rate Limits & Quotas

Default limits (configurable):

- **10 requests per minute** per IP address
- **100 requests per hour** per IP address
- **Maximum duration**: 120 seconds per generation
- **Maximum file size**: 50MB for audio uploads
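
Clients that send many requests can throttle themselves to stay under these limits. The following is a client-side sketch of a sliding-window limiter. How the server actually counts requests is not documented here, so the windowing behaviour is an assumption and `SlidingWindowLimiter` is a hypothetical helper.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])

    def record(self):
        """Register that a call was just made."""
        self._stamps.append(self.clock())


# Matches the default of 10 requests per minute.
per_minute = SlidingWindowLimiter(limit=10, window=60)
```

Before each request, sleep for `per_minute.wait_time()` seconds and then call `per_minute.record()`. A second instance with `limit=100, window=3600` would cover the hourly limit.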

## 📚 Technical Details

### Model Information

- **Base Model**: Magenta RealTime (Google Research)
- **Architecture**: Transformer-based audio generation
- **Audio Codec**: SpectroStream (discrete codec, 48kHz stereo)
- **Style Encoder**: MusicCoCa (joint text-audio embeddings)
- **Context Length**: 10 seconds
- **Chunk Size**: 2 seconds (with crossfading)
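
The figures above imply some simple arithmetic about how a request maps onto chunks. This sketch assumes the service rounds the requested duration up to whole chunks and ignores any crossfade overlap; both are assumptions, and the helper names are illustrative.

```python
import math

SAMPLE_RATE_HZ = 48_000   # from the codec description above
CHUNK_SECONDS = 2.0       # from "Chunk Size"
CONTEXT_SECONDS = 10.0    # from "Context Length"


def chunks_needed(duration_seconds):
    """Whole 2-second chunks required to cover the requested duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return math.ceil(duration_seconds / CHUNK_SECONDS)


def frames_per_chunk():
    """Audio frames (per channel) in one chunk at 48 kHz."""
    return int(SAMPLE_RATE_HZ * CHUNK_SECONDS)


def context_chunks():
    """How many previous chunks fit in the 10-second context window."""
    return int(CONTEXT_SECONDS // CHUNK_SECONDS)


print(chunks_needed(15))    # 8
print(frames_per_chunk())   # 96000
print(context_chunks())     # 5
```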

### Supported Audio Formats

**Input** (for audio style transfer):
- MP3, WAV, OGG, FLAC, M4A

**Output**:
- MP3 (high-quality, stereo, 48kHz)
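
Checking a file locally before uploading avoids a wasted round trip. The sketch below tests the extension against the input formats listed above and the size against the default 50MB limit. It assumes 1 MB means 1024 × 1024 bytes and that the extension reflects the real format; the server may validate differently, and `check_upload` is a hypothetical helper.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".m4a"}
MAX_FILE_SIZE_MB = 50  # default limit; configurable on the server


def check_upload(filename, size_bytes, max_mb=MAX_FILE_SIZE_MB):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > max_mb * 1024 * 1024:
        problems.append(f"file is larger than {max_mb} MB")
    return problems


print(check_upload("reference.mp3", 3_000_000))  # []
```

For a real file, `Path(path).stat().st_size` supplies the `size_bytes` argument.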

## 🔗 Resources

- **Magenta RT GitHub**: https://github.com/magenta/magenta-realtime
- **Research Paper**: https://arxiv.org/abs/2508.04651
- **Blog Post**: https://g.co/magenta/rt
- **Model Card**: https://github.com/magenta/magenta-realtime/blob/main/MODEL.md

## 🤝 Contributing

Found a bug or want to contribute? 

- Report issues on the [GitHub repository](https://github.com/magenta/magenta-realtime/issues)
- Submit pull requests with improvements
- Share your generated music and prompts!

## 📄 License

This API service is built on Magenta RealTime:

- **API Code**: Apache 2.0 License
- **Magenta RT Code**: Apache 2.0 License
- **Model Weights**: Creative Commons Attribution 4.0 International (CC-BY 4.0)

### Usage Terms

Copyright 2025 Google LLC

**You must**:
- Use responsibly and ethically
- Not generate content that infringes on others' rights
- Not generate copyrighted content without permission

**Google claims no rights** in outputs you generate using this API. You and your users are solely responsible for outputs and their subsequent uses.

See the [full license](https://github.com/magenta/magenta-realtime/blob/main/LICENSE) for details.

## 🎓 Citation

If you use this API in research, please cite:

```bibtex
@article{gdmlyria2025live,
  title={Live Music Models},
  author={Caillon, Antoine and McWilliams, Brian and Tarakajian, Cassie and Simon, Ian and Manco, Ilaria and Engel, Jesse and Constant, Noah and Li, Pen and Denk, Timo I. and Lalama, Alberto and Agostinelli, Andrea and Huang, Anna and Manilow, Ethan and Brower, George and Erdogan, Hakan and Lei, Heidi and Rolnick, Itai and Grishchenko, Ivan and Orsini, Manu and Kastelic, Matej and Zuluaga, Mauricio and Verzetti, Mauro and Dooley, Michael and Skopek, Ondrej and Ferrer, Rafael and Borsos, Zalán and van den Oord, Aäron and Eck, Douglas and Collins, Eli and Baldridge, Jason and Hume, Tom and Donahue, Chris and Han, Kehang and Roberts, Adam},
  journal={arXiv:2508.04651},
  year={2025}
}
```

## 💬 Support

For questions and support:

- 📖 Check the `/docs` endpoint for detailed API documentation
- 🐛 Report bugs on [GitHub Issues](https://github.com/magenta/magenta-realtime/issues)
- 💡 Join discussions in the HuggingFace community
- 📧 Contact the Space owner for deployment-specific questions

## 🎉 Acknowledgments

Built with:
- [Magenta RealTime](https://github.com/magenta/magenta-realtime) by Google Research
- [FastAPI](https://fastapi.tiangolo.com/) for the API framework
- [HuggingFace Spaces](https://huggingface.co/spaces) for hosting

---

**Ready to create amazing music?** 🎡 Visit `/docs` to get started!