---
title: Magenta RT API
emoji: 🎵
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
app_file: main.py
app_port: 7860
hardware: l4x1
---
# 🎵 Magenta RT API
A production-ready REST API for music generation using Google's [Magenta RealTime](https://github.com/magenta/magenta-realtime) model. Generate high-quality music from text descriptions or audio style references.
## Quick Start
### Interactive Documentation
Once the Space is running, visit:
- **Swagger UI**: `/docs` - Interactive API testing
- **ReDoc**: `/redoc` - Clean API documentation
### Example: Generate Music from Text
```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text" \
-F "prompt=funky jazz groove with syncopated bass" \
-F "duration=10" \
--output generated.mp3
```
### Example: Style Transfer from Audio
```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/audio" \
-F "audio_file=@reference.mp3" \
-F "duration=15" \
-F "prompt=add electronic elements" \
--output styled.mp3
```
### Example: Blend Multiple Styles
```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/blend" \
-F "text_prompts=funk,ambient,jazz" \
-F "text_weights=2.0,1.5,1.0" \
-F "duration=20" \
--output blended.mp3
```
## Features
- 🎼 **Text-to-Music Generation** - Create music from natural language descriptions
- 🎧 **Audio Style Transfer** - Use existing audio as a style reference
- 🎛️ **Multi-Prompt Blending** - Combine multiple text and audio prompts with custom weights
- **REST API** - Easy integration with any programming language
- ⚡ **GPU Accelerated** - Fast generation on NVIDIA A100 GPUs
- **Built-in Monitoring** - Prometheus metrics endpoint
- **Security Features** - Optional API key authentication and rate limiting
## 🎯 API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/generate/text` | POST | Generate music from text prompt |
| `/generate/audio` | POST | Generate music from audio file reference |
| `/generate/blend` | POST | Blend multiple text/audio prompts |
| `/embed/text` | POST | Get MusicCoCa embedding for text |
| `/embed/audio` | POST | Get MusicCoCa embedding for audio |
| `/health` | GET | Health check endpoint |
| `/metrics` | GET | Prometheus metrics |
## 💻 Usage Examples
### Python
```python
import requests


def generate_music(prompt: str, duration: int = 10):
    """Request a clip from /generate/text and save it as output.mp3."""
    url = "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text"
    response = requests.post(
        url,
        data={
            "prompt": prompt,
            "duration": duration
        }
    )
    if response.status_code == 200:
        # The response body is the generated MP3
        with open("output.mp3", "wb") as f:
            f.write(response.content)
        print("✅ Music generated successfully!")
    else:
        print(f"❌ Error: {response.status_code}")


# Generate music
generate_music("epic orchestral cinematic soundtrack", duration=15)
```
### JavaScript
```javascript
async function generateMusic(prompt, duration = 10) {
  const formData = new FormData();
  formData.append('prompt', prompt);
  formData.append('duration', duration);

  const response = await fetch(
    'https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text',
    {
      method: 'POST',
      body: formData
    }
  );

  if (response.ok) {
    const blob = await response.blob();
    const url = URL.createObjectURL(blob);

    // Play audio
    const audio = new Audio(url);
    audio.play();

    // Or download
    const a = document.createElement('a');
    a.href = url;
    a.download = 'generated.mp3';
    a.click();
  }
}

// Usage
generateMusic('dreamy ambient soundscape');
```
### cURL with Audio File
```bash
# Generate music styled after an audio reference
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/audio" \
-F "audio_file=@my_favorite_song.mp3" \
-F "duration=20" \
-F "prompt=more energetic and upbeat" \
-H "X-API-Key: your-api-key-if-enabled" \
--output new_version.mp3
```
## 🎨 Prompt Examples
### Text Prompts
Great prompts are descriptive and combine multiple elements:
- `"funky jazz with syncopated bass and smooth saxophone"`
- `"ambient electronic soundscape with ethereal pads"`
- `"upbeat dance music with driving four-on-the-floor beat"`
- `"classical piano composition in the style of Chopin"`
- `"heavy metal with distorted guitars and double bass drums"`
- `"lo-fi hip hop beats with vinyl crackle and jazz samples"`
- `"epic orchestral cinematic soundtrack with strings and brass"`
- `"tropical house with steel drums and marimba"`
- `"psychedelic rock with reverb-drenched guitars"`
- `"90s eurodance with synth leads and energetic vocals"`
### Blending Prompts
Combine different styles with custom weights:
```bash
# 60% funk, 30% ambient, 10% jazz
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/blend" \
-F "text_prompts=funk groove,ambient soundscape,smooth jazz" \
-F "text_weights=3.0,1.5,0.5" \
-F "duration=15"
```
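The percentages in the comment above come from dividing each weight by the sum of all weights. Here is a minimal sketch of that arithmetic, assuming the server normalizes weights this way. The helper name `normalize_weights` is hypothetical and not part of the API.

```python
def normalize_weights(weights: list[float]) -> list[float]:
    """Scale weights so that they sum to 1.0."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


# Weights from the curl example above: funk, ambient, jazz
shares = normalize_weights([3.0, 1.5, 0.5])
print([f"{s:.0%}" for s in shares])  # ['60%', '30%', '10%']
```

Only the ratios between weights matter under this assumption, so `3.0,1.5,0.5` and `6,3,1` would describe the same blend.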
## ⚙️ Configuration
### Environment Variables
Configure the API by setting environment variables in your Space settings:
```bash
# Authentication (optional)
ENABLE_AUTH=true
API_KEYS=key1,key2,key3
# Rate Limiting
RATE_LIMIT_PER_MINUTE=10
RATE_LIMIT_PER_HOUR=100
# Generation Limits
MAX_DURATION=120
MAX_FILE_SIZE_MB=50
# Monitoring
ENABLE_METRICS=true
LOG_LEVEL=INFO
```
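For readers who want to see how a service could consume these variables, here is a hedged sketch. The `Settings` dataclass and `load_settings` function are hypothetical and do not show the actual server code. The fallback values mirror the example above, and the defaults for `ENABLE_AUTH`, `ENABLE_METRICS`, and `LOG_LEVEL` are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    enable_auth: bool
    api_keys: frozenset
    rate_limit_per_minute: int
    rate_limit_per_hour: int
    max_duration: int
    max_file_size_mb: int
    enable_metrics: bool
    log_level: str


def _as_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    keys = env.get("API_KEYS", "")
    return Settings(
        enable_auth=_as_bool(env.get("ENABLE_AUTH", "false")),  # assumed default
        api_keys=frozenset(k.strip() for k in keys.split(",") if k.strip()),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "10")),
        rate_limit_per_hour=int(env.get("RATE_LIMIT_PER_HOUR", "100")),
        max_duration=int(env.get("MAX_DURATION", "120")),
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "50")),
        enable_metrics=_as_bool(env.get("ENABLE_METRICS", "true")),  # assumed default
        log_level=env.get("LOG_LEVEL", "INFO").upper(),  # assumed default
    )


# Parse an explicit mapping instead of the real environment
settings = load_settings({"ENABLE_AUTH": "true", "API_KEYS": "key1,key2,key3"})
print(settings.enable_auth, sorted(settings.api_keys), settings.max_duration)
```

Passing a plain dictionary keeps the parsing logic easy to check without touching the real environment.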
### Using API Keys
If authentication is enabled, include your API key in requests:
```bash
curl -X POST "https://YOUR_USERNAME-magenta-rt-api.hf.space/generate/text" \
-H "X-API-Key: your-api-key" \
-F "prompt=ambient music" \
-F "duration=10"
```
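The same header can be attached from Python. The sketch below only builds the request object with the standard library and does not send it. `build_text_request` is a hypothetical helper, and it assumes the endpoint accepts URL-encoded form fields, as the `requests.post(url, data=...)` example earlier does.

```python
import urllib.parse
import urllib.request

# Placeholder host taken from the examples above
BASE_URL = "https://YOUR_USERNAME-magenta-rt-api.hf.space"


def build_text_request(prompt, duration, api_key=None):
    """Build (but do not send) a POST request for /generate/text."""
    body = urllib.parse.urlencode({"prompt": prompt, "duration": duration}).encode()
    # Only add the header when a key is supplied, since authentication is optional
    headers = {"X-API-Key": api_key} if api_key else {}
    return urllib.request.Request(
        f"{BASE_URL}/generate/text", data=body, headers=headers, method="POST"
    )


req = build_text_request("ambient music", 10, api_key="your-api-key")
print(req.get_method(), req.full_url)
# urllib stores header names capitalized as "X-api-key"; HTTP header names are case-insensitive
print(req.get_header("X-api-key"))
```

Sending the request with `urllib.request.urlopen(req, timeout=...)` would return the generated audio bytes on success.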
## Performance
### Generation Speed (on A100 40GB)
| Duration | Generation Time | Ratio to Realtime |
|----------|----------------|-------------------|
| 2s | ~5s | 2.5x |
| 10s | ~15s | 1.5x |
| 30s | ~40s | 1.3x |
| 60s | ~75s | 1.25x |
**Note**: First request includes model loading time (~30-60 seconds)
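To estimate generation time for durations that are not in the table, one option is to interpolate linearly between the listed rows. This is only a rough sketch: `estimate_generation_seconds` is a hypothetical helper, the figures apply to the A100 40GB measurements above, and model loading on the first request is not included.

```python
# (audio seconds, approximate generation seconds) from the table above
BENCHMARKS = [(2, 5), (10, 15), (30, 40), (60, 75)]


def estimate_generation_seconds(duration: float) -> float:
    """Interpolate between benchmark rows; extrapolate from the nearest segment."""
    for (x0, y0), (x1, y1) in zip(BENCHMARKS, BENCHMARKS[1:]):
        if duration <= x1:
            break  # this segment covers (or precedes) the requested duration
    # If no segment matched, the loop leaves the last segment selected
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (duration - x0)


for seconds in (10, 20, 45):
    print(seconds, round(estimate_generation_seconds(seconds), 1))
```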
### Resource Usage
- **GPU Memory**: 12-15GB during inference
- **Model Size**: ~8GB on disk
- **Recommended Hardware**: A100 40GB or larger
## 🔧 Troubleshooting
### Space is Starting
First startup takes roughly **12-18 minutes** in total, the sum of these steps:
1. Build Docker container (~5 min)
2. Download Magenta RT models (~5-10 min)
3. Initialize models (~2-3 min)
Check the Space logs to monitor progress.
### Generation is Slow
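- **First request**: Includes model loading (~30-60s)
- **Subsequent requests**: Generation takes roughly 1.25-2.5x the audio duration (about 1.5x for a 10-second clip)
- **Solution**: Keep the Space active to avoid repeated cold starts, or use Persistent Storage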
### Out of Memory Error
- Upgrade to **A100 80GB** hardware
- Reduce `MAX_DURATION` limit
- Process requests sequentially
### Connection Timeout
Increase your client timeout for long generations:
```python
response = requests.post(url, data=data, timeout=300) # 5 minutes
```
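Rather than hard-coding one value, the timeout can be scaled with the requested duration. The heuristic below is an assumption, not a documented rule: it adds the upper end of the model loading time quoted above to the slowest generation ratio from the performance table. `client_timeout` is a hypothetical helper.

```python
MODEL_LOAD_SECONDS = 60   # upper end of the first-request loading time quoted above
WORST_CASE_RATIO = 2.5    # slowest generation-time ratio in the performance table


def client_timeout(duration: float) -> float:
    """Return a conservative client timeout in seconds for a clip of `duration` seconds."""
    return MODEL_LOAD_SECONDS + WORST_CASE_RATIO * duration


print(client_timeout(10))   # 85.0
print(client_timeout(120))  # 360.0
```

The result can be passed directly as `timeout=client_timeout(duration)` in the `requests.post` call.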
## 🛡️ Rate Limits & Quotas
Default limits (configurable):
- **10 requests per minute** per IP address
- **100 requests per hour** per IP address
- **Maximum duration**: 120 seconds per generation
- **Maximum file size**: 50MB for audio uploads
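A client that sends many requests can pace itself to stay under these limits. Below is a minimal client-side sketch using a sliding window. `SlidingWindowLimiter` is a hypothetical class and not part of the API, and how the server actually counts requests is an assumption.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` recorded calls in any `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.limit:
            return 0.0
        return self.window - (now - self._calls[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._calls.append(self.clock())


# Default per-minute limit from the list above
per_minute = SlidingWindowLimiter(limit=10, window=60)
print(per_minute.wait_time())  # 0.0 before any request has been recorded
```

Before each request, a caller would sleep for `per_minute.wait_time()` seconds and then call `per_minute.record()`. A second instance with `limit=100, window=3600` would cover the hourly limit.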
## Technical Details
### Model Information
- **Base Model**: Magenta RealTime (Google Research)
- **Architecture**: Transformer-based audio generation
- **Audio Codec**: SpectroStream (discrete codec, 48kHz stereo)
- **Style Encoder**: MusicCoCa (joint text-audio embeddings)
- **Context Length**: 10 seconds
- **Chunk Size**: 2 seconds (with crossfading)
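The chunk size and context length above give a rough idea of how much work a request involves. The arithmetic below assumes that audio is produced in whole 2-second chunks and that the last chunk is trimmed to the requested length. That is an assumption about the implementation, and both helper names are hypothetical.

```python
import math

CHUNK_SECONDS = 2     # chunk size listed above
CONTEXT_SECONDS = 10  # context length listed above


def chunks_needed(duration: float) -> int:
    """Number of whole chunks required to cover `duration` seconds."""
    return math.ceil(duration / CHUNK_SECONDS)


def context_chunks() -> int:
    """Number of previous chunks that fit in the context window."""
    return CONTEXT_SECONDS // CHUNK_SECONDS


print(chunks_needed(15))  # 8
print(context_chunks())   # 5
```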
### Supported Audio Formats
**Input** (for audio style transfer):
- MP3, WAV, OGG, FLAC, M4A
**Output**:
- MP3 (high-quality, stereo, 48kHz)
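Checking a file locally before uploading avoids a wasted round trip. The sketch below assumes the server decides by file extension and that 1 MB means 1024 * 1024 bytes. Neither is confirmed here, and `check_upload` is a hypothetical helper.

```python
from pathlib import Path

# Input formats listed above, as lowercase extensions
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".m4a"}
MAX_FILE_SIZE_MB = 50  # default upload limit


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_MB * 1024 * 1024:
        problems.append(f"file exceeds {MAX_FILE_SIZE_MB} MB")
    return problems


print(check_upload("reference.mp3", 3_000_000))       # []
print(check_upload("notes.txt", 60 * 1024 * 1024))
```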
## Resources
- **Magenta RT GitHub**: https://github.com/magenta/magenta-realtime
- **Research Paper**: https://arxiv.org/abs/2508.04651
- **Blog Post**: https://g.co/magenta/rt
- **Model Card**: https://github.com/magenta/magenta-realtime/blob/main/MODEL.md
## 🤝 Contributing
Found a bug or want to contribute?
- Report issues on the [GitHub repository](https://github.com/magenta/magenta-realtime/issues)
- Submit pull requests with improvements
- Share your generated music and prompts!
## License
This API service is built on Magenta RealTime:
- **API Code**: Apache 2.0 License
- **Magenta RT Code**: Apache 2.0 License
- **Model Weights**: Creative Commons Attribution 4.0 International (CC-BY 4.0)
### Usage Terms
Copyright 2025 Google LLC
**You must**:
- Use responsibly and ethically
- Not generate content that infringes on others' rights
- Not generate copyrighted content without permission
**Google claims no rights** in outputs you generate using this API. You and your users are solely responsible for outputs and their subsequent uses.
See the [full license](https://github.com/magenta/magenta-realtime/blob/main/LICENSE) for details.
## Citation
If you use this API in research, please cite:
```bibtex
@article{gdmlyria2025live,
  title={Live Music Models},
  author={Caillon, Antoine and McWilliams, Brian and Tarakajian, Cassie and Simon, Ian and Manco, Ilaria and Engel, Jesse and Constant, Noah and Li, Pen and Denk, Timo I. and Lalama, Alberto and Agostinelli, Andrea and Huang, Anna and Manilow, Ethan and Brower, George and Erdogan, Hakan and Lei, Heidi and Rolnick, Itai and Grishchenko, Ivan and Orsini, Manu and Kastelic, Matej and Zuluaga, Mauricio and Verzetti, Mauro and Dooley, Michael and Skopek, Ondrej and Ferrer, Rafael and Borsos, Zalán and van den Oord, Aäron and Eck, Douglas and Collins, Eli and Baldridge, Jason and Hume, Tom and Donahue, Chris and Han, Kehang and Roberts, Adam},
  journal={arXiv:2508.04651},
  year={2025}
}
```
## 💬 Support
For questions and support:
- Check the `/docs` endpoint for detailed API documentation
- Report bugs on [GitHub Issues](https://github.com/magenta/magenta-realtime/issues)
- 💡 Join discussions in the HuggingFace community
- 📧 Contact the Space owner for deployment-specific questions
## Acknowledgments
Built with:
- [Magenta RealTime](https://github.com/magenta/magenta-realtime) by Google Research
- [FastAPI](https://fastapi.tiangolo.com/) for the API framework
- [HuggingFace Spaces](https://huggingface.co/spaces) for hosting
---
**Ready to create amazing music?** 🎵 Visit `/docs` to get started!