Commit 2f1f8ac · Parent: 5c11fb8

Deploy PLONK with 32 samples and uncertainty estimation

- Updated to use 32 samples for better prediction accuracy
- Added uncertainty estimation (±km radius)
- Enhanced API responses with sample count and confidence
- Configuration: CFG=2.0, 32 samples, 32 timesteps
- Ready for production deployment with robust predictions

Files changed:

- README.md +102 -45
- app.py +333 -73
- requirements_hf_spaces.txt +13 -0
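
For quick reference, the core behavior this commit deploys is sketched below: draw 32 samples from the PLONK pipeline and reduce them to a mean coordinate plus a rough ±km radius. This is a condensed, hedged rendering of `real_plonk_prediction` from the app.py diff further down; `image.jpg` is a placeholder path.

```python
import numpy as np
from PIL import Image
from plonk.pipe import PlonkPipeline

pipe = PlonkPipeline(model_path="nicolas-dufour/PLONK_YFCC")
image = Image.open("image.jpg").convert("RGB")  # placeholder input

# Draw 32 (lat, lon) samples with CFG=2.0 and 32 denoising steps.
preds = pipe(image, batch_size=32, cfg=2.0, num_steps=32).cpu().numpy()

mean_lat, mean_lon = preds.mean(axis=0)
std_lat, std_lon = preds.std(axis=0)
# Rough spread-to-km conversion used by the app (1 degree ≈ 111.32 km).
uncertainty_km = float(np.hypot(std_lat, std_lon) * 111.32)
print(f"{mean_lat:.6f}, {mean_lon:.6f} (±{uncertainty_km:.1f} km)")
```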
README.md
CHANGED

@@ -1,78 +1,135 @@
 ---
 title: PLONK Geolocation
 emoji: 🗺️
-colorFrom:
+colorFrom: blue
-colorTo:
+colorTo: green
 sdk: gradio
-sdk_version:
+sdk_version: 4.44.0
 app_file: app.py
 pinned: false
 license: mit
+short_description: Around the World in 80 Timesteps - Generative Visual Geolocation
 ---
 
 # 🗺️ PLONK: Around the World in 80 Timesteps
 
-A generative approach to global visual geolocation. Upload an image and PLONK will predict where it was taken!
+A generative approach to global visual geolocation using diffusion models. Upload an image and PLONK will predict where it was taken!
 
-##
-- **Advanced Analysis**: Explore prediction uncertainty with multiple samples and guidance control
-- **Fast CPU Inference**: ~300-500ms per image on CPU-Basic tier
-- **GPU Ready**: Upgrade to T4-small for ~45ms inference time
-import
-print(response.json())
-## Model
-- **Memory**: <1GB RAM usage
-- **Throughput**: ~10 req/s on T4 before saturation
-- **T4-small ($0.40/hr)**: 10x faster inference for production
-- **Inference Endpoints**: Auto-scaling with pay-per-use pricing
-}
-- 💻 [Code Repository](https://github.com/nicolas-dufour/plonk)
-- 🤗 [Model on Hugging Face](https://huggingface.co/nicolas-dufour/PLONK_YFCC)
+## 🚀 Features
+
+- **High-Quality Predictions**: Uses 32 samples with CFG=2.0 for robust geolocation
+- **Uncertainty Estimation**: Provides confidence radius (±km) for each prediction
+- **REST API**: Full programmatic access with JSON responses
+- **Multiple Input Methods**: File upload, webcam, clipboard, or base64 encoding
+- **CORS Enabled**: Ready for web integration
+
+## 📡 API Usage
+
+### REST API Endpoints
+
+**Main Prediction:**
+```
+POST https://kylanoconnor-plonk-geolocation.hf.space/api/predict
+```
+
+**JSON Response:**
+```
+POST https://kylanoconnor-plonk-geolocation.hf.space/api/predict_json
+```
+
+### Python Example
+```python
+import requests
+
+# Upload image file
+response = requests.post(
+    "https://kylanoconnor-plonk-geolocation.hf.space/api/predict",
+    files={"file": open("image.jpg", "rb")}
+)
+result = response.json()
+print(f"Location: {result['data']['latitude']}, {result['data']['longitude']}")
+print(f"Uncertainty: ±{result['data']['uncertainty_km']} km")
+```
+
+### cURL Example
+```bash
+curl -X POST \
+  -F "data=@image.jpg" \
+  "https://kylanoconnor-plonk-geolocation.hf.space/api/predict"
+```
+
+### JavaScript/Node.js
+```javascript
+const formData = new FormData();
+formData.append('data', imageFile);
+
+const response = await fetch(
+  'https://kylanoconnor-plonk-geolocation.hf.space/api/predict',
+  {
+    method: 'POST',
+    body: formData
+  }
+);
+
+const result = await response.json();
+console.log('Location:', result.data);
+```
+
+### Gradio Client (Python)
 ```python
+from gradio_client import Client
+
+client = Client("kylanoconnor/plonk-geolocation")
+result = client.predict("path/to/image.jpg", api_name="/predict")
+print(result)
 ```
+
+## 🎯 Model Configuration
+
+- **Model**: nicolas-dufour/PLONK_YFCC
+- **Dataset**: YFCC-100M
+- **Samples**: 32 (for uncertainty estimation)
+- **Guidance Scale**: 2.0
+- **Timesteps**: 32
+- **Uncertainty**: Statistical analysis across predictions
+
+## 📊 Response Format
+
+```json
+{
+  "status": "success",
+  "mode": "production",
+  "predicted_location": {
+    "latitude": 40.756123,
+    "longitude": -73.984567
+  },
+  "confidence": "high",
+  "samples": 32,
+  "uncertainty_km": 12.3,
+  "note": "Real PLONK prediction using 32 samples"
+}
+```
+
+## 📚 About
+
+**Paper**: [Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation](https://arxiv.org/abs/2412.06781)
+
+**Authors**: Nicolas Dufour, David Picard, Vicky Kalogeiton, Loic Landrieu
+
+**Original Code**: https://github.com/nicolas-dufour/plonk
+
+This Space provides both a user-friendly web interface and robust API access for global visual geolocation using the PLONK model. The model uses 32 samples per prediction to provide uncertainty estimation and more reliable results.
+
+## 🔧 Development
+
+To run locally:
+```bash
+pip install -r requirements_hf_spaces.txt
+python app.py
 ```
+
+The app will be available at `http://localhost:7860` with API documentation at `/docs`.
+
+---
+
+*Built with ❤️ using Gradio and Hugging Face Spaces*
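The README's Python example targets `/api/predict`; the `/api/predict_json` endpoint it also lists returns the structured payload shown under Response Format. A minimal hedged sketch for that endpoint, assuming it accepts the same multipart upload shape as the `/api/predict` example (the field name `"file"` is carried over from that example, not confirmed):

```python
import requests

# Hypothetical client for the /api/predict_json endpoint described above;
# the multipart field name ("file") mirrors the README's /api/predict example.
resp = requests.post(
    "https://kylanoconnor-plonk-geolocation.hf.space/api/predict_json",
    files={"file": open("image.jpg", "rb")},
)
payload = resp.json()
if payload.get("status") == "success":
    loc = payload["predicted_location"]
    print(loc["latitude"], loc["longitude"])
    print(f"±{payload.get('uncertainty_km', 'n/a')} km")
```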
app.py
CHANGED

@@ -1,107 +1,367 @@
 import gradio as gr
 import torch
-import numpy as np
+import torchvision.transforms as transforms
 from PIL import Image
+import base64
+import io
+import os
+import numpy as np
 from pathlib import Path
+from plonk.pipe import PlonkPipeline
+import random
 
-        predicted_gps = pipe(image, batch_size=1, cfg=2.0, num_steps=32)
-    image: PIL Image
-    num_samples: Number of samples to generate
-    cfg: Classifier-free guidance scale
-    Returns:
-        str: Formatted results with statistics
-    if image is None:
-        return "Please upload an image"
-    Latitude: {conf_lat:.6f}
-    Longitude: {conf_lon:.6f}
-    Sample Statistics ({num_samples} samples, CFG={cfg}):
-    Mean Latitude: {mean_lat:.6f} ± {std_lat:.6f}
-    Mean Longitude: {mean_lon:.6f} ± {std_lon:.6f}
+# Global variable to store the model
+model = None
+
+# Real PLONK predictions for production deployment
+MOCK_MODE = False  # Set to True for testing with mock data
+
+def load_plonk_model():
+    """
+    Load the PLONK model.
+    """
+    global model
+    if model is None:
+        print("Loading PLONK_YFCC model...")
+        model = PlonkPipeline(model_path="nicolas-dufour/PLONK_YFCC")
+        print("Model loaded successfully!")
+    return model
+
+def mock_plonk_prediction():
+    """
+    Mock PLONK prediction - returns realistic coordinates.
+    Used only when MOCK_MODE = True.
+    """
+    # Sample realistic coordinates from major cities/regions
+    mock_locations = [
+        (40.7128, -74.0060),   # New York
+        (34.0522, -118.2437),  # Los Angeles
+        (51.5074, -0.1278),    # London
+        (48.8566, 2.3522),     # Paris
+        (35.6762, 139.6503),   # Tokyo
+        (37.7749, -122.4194),  # San Francisco
+        (41.8781, -87.6298),   # Chicago
+        (25.7617, -80.1918),   # Miami
+        (45.5017, -73.5673),   # Montreal
+        (52.5200, 13.4050),    # Berlin
+        (-33.8688, 151.2093),  # Sydney
+        (19.4326, -99.1332),   # Mexico City
+    ]
+
+    # Add some randomness to make it more realistic
+    base_lat, base_lon = random.choice(mock_locations)
+    lat = base_lat + random.uniform(-2, 2)  # Add noise within ~200 km
+    lon = base_lon + random.uniform(-2, 2)
+
+    return lat, lon
+
+def real_plonk_prediction(image):
+    """
+    Real PLONK prediction using the diff-plonk package.
+    Generates 32 samples for better uncertainty estimation.
+    """
+    # Load the model once at startup, not per request
+    if not hasattr(gr, 'plonk_pipeline'):
+        print("Loading PLONK model...")
+        gr.plonk_pipeline = PlonkPipeline(model_path="nicolas-dufour/PLONK_YFCC")
+        print("PLONK model loaded successfully!")
+
+    # Get 32 predictions for uncertainty estimation
+    predicted_gps = gr.plonk_pipeline(image, batch_size=32, cfg=2.0, num_steps=32)
+
+    # Convert to numpy for easier processing
+    predictions = predicted_gps.cpu().numpy()  # Shape: (32, 2)
+
+    # Calculate statistics
+    mean_lat = float(np.mean(predictions[:, 0]))
+    mean_lon = float(np.mean(predictions[:, 1]))
+    std_lat = float(np.std(predictions[:, 0]))
+    std_lon = float(np.std(predictions[:, 1]))
+
+    # Calculate uncertainty radius (approximate)
+    uncertainty_km = np.sqrt(std_lat**2 + std_lon**2) * 111.32  # Rough conversion to km
+
+    return mean_lat, mean_lon, uncertainty_km, len(predictions)
+
+def predict_location(image):
+    """
+    Main prediction function for the Gradio interface.
+    """
+    try:
+        if image is None:
+            return "Please upload an image."
+
+        # Ensure RGB format
+        if image.mode != 'RGB':
+            image = image.convert('RGB')
+
+        # Get prediction (mock or real)
+        if MOCK_MODE:
+            lat, lon = mock_plonk_prediction()
+            confidence = "mock"
+            uncertainty_km = None
+            num_samples = 1
+            note = " (Mock prediction for testing)"
+        else:
+            lat, lon, uncertainty_km, num_samples = real_plonk_prediction(image)
+            confidence = "high"
+            note = f" (Real PLONK prediction, {num_samples} samples)"
+
+        # Format the result
+        uncertainty_text = f"\n**Uncertainty:** ±{uncertainty_km:.1f} km" if uncertainty_km is not None else ""
+
+        result = f"""🗺️ **Predicted Location**{note}
+
+**Latitude:** {lat:.6f}
+**Longitude:** {lon:.6f}{uncertainty_text}
+
+**Confidence:** {confidence}
+**Samples:** {num_samples}
+**Mode:** {'🧪 Mock Testing' if MOCK_MODE else '🚀 Production'}
+
+🌍 *This prediction estimates where the image was taken based on visual content.*
+"""
+
+        return result
+
+    except Exception as e:
+        return f"❌ Error processing image: {str(e)}"
+
+def predict_location_json(image):
+    """
+    JSON API function for programmatic access.
+    Returns structured data instead of formatted text.
+    """
+    try:
+        if image is None:
+            return {
+                "error": "No image provided",
+                "status": "error"
+            }
+
+        # Ensure RGB format
+        if image.mode != 'RGB':
+            image = image.convert('RGB')
+
+        # Get prediction (mock or real)
+        if MOCK_MODE:
+            lat, lon = mock_plonk_prediction()
+            confidence = "mock"
+            uncertainty_km = None
+            num_samples = 1
+        else:
+            lat, lon, uncertainty_km, num_samples = real_plonk_prediction(image)
+            confidence = "high"
+
+        result = {
+            "status": "success",
+            "mode": "mock" if MOCK_MODE else "production",
+            "predicted_location": {
+                "latitude": round(lat, 6),
+                "longitude": round(lon, 6)
+            },
+            "confidence": confidence,
+            "samples": num_samples,
+            "note": "This is a mock prediction for testing" if MOCK_MODE else f"Real PLONK prediction using {num_samples} samples"
+        }
+
+        # Add uncertainty info if available
+        if uncertainty_km is not None:
+            result["uncertainty_km"] = round(uncertainty_km, 1)
+
+        return result
+
+    except Exception as e:
+        return {
+            "error": str(e),
+            "status": "error"
+        }
+
+# Create the Gradio interface
+with gr.Blocks(
+    theme=gr.themes.Soft(),
+    title="🗺️ PLONK: Around the World in 80 Timesteps"
+) as demo:
+
+    # Header
+    gr.Markdown(f"""
+    # 🗺️ PLONK: Around the World in 80 Timesteps
+
+    A generative approach to global visual geolocation. Upload an image and PLONK will predict where it was taken!
+
+    This uses the PLONK model concept from the paper: *"Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation"*
+
+    **Current Mode:** {'🧪 Mock Testing' if MOCK_MODE else '🚀 Production'} - Real PLONK model predictions with 32 samples for uncertainty estimation.
+    **Configuration:** Guidance Scale = 2.0, Samples = 32, Steps = 32
+    """)
+
+    with gr.Tab("🖼️ Image Upload"):
+        with gr.Row():
+            with gr.Column(scale=1):
+                image_input = gr.Image(
+                    label="Upload an image",
+                    type="pil",
+                    sources=["upload", "webcam", "clipboard"]
+                )
+
+                predict_btn = gr.Button(
+                    "🔍 Predict Location",
+                    variant="primary",
+                    size="lg"
+                )
+
+                clear_btn = gr.ClearButton(
+                    components=[image_input],
+                    value="🗑️ Clear"
+                )
+
+            with gr.Column(scale=1):
+                output_text = gr.Markdown(
+                    label="Prediction Result",
+                    value="Upload an image and click 'Predict Location' to see results."
+                )
+
+    with gr.Tab("📡 API Information"):
+        gr.Markdown(f"""
+        ## 🔗 API Access
+
+        This Space provides both web interface and programmatic API access:
+
+        ### **REST API Endpoint**
+        ```
+        POST https://kylanoconnor-plonk-geolocation.hf.space/api/predict
+        ```
+
+        ### **Python Example**
+        ```python
+        import requests
+
+        # For API access
+        response = requests.post(
+            "https://kylanoconnor-plonk-geolocation.hf.space/api/predict",
+            files={{"file": open("image.jpg", "rb")}}
+        )
+        result = response.json()
+        print(f"Location: {{result['data']['latitude']}}, {{result['data']['longitude']}}")
+        ```
+
+        ### **cURL Example**
+        ```bash
+        curl -X POST \\
+          -F "data=@image.jpg" \\
+          "https://kylanoconnor-plonk-geolocation.hf.space/api/predict"
+        ```
+
+        ### **Gradio Client (Python)**
+        ```python
+        from gradio_client import Client
+
+        client = Client("kylanoconnor/plonk-geolocation")
+        result = client.predict("path/to/image.jpg", api_name="/predict")
+        print(result)
+        ```
+
+        ### **JavaScript/Node.js**
+        ```javascript
+        const formData = new FormData();
+        formData.append('data', imageFile);
+
+        const response = await fetch(
+          'https://kylanoconnor-plonk-geolocation.hf.space/api/predict',
+          {{
+            method: 'POST',
+            body: formData
+          }}
+        );
+
+        const result = await response.json();
+        console.log('Location:', result.data);
+        ```
+
+        **Current Status:** {'🧪 Mock Mode - Returns realistic test coordinates' if MOCK_MODE else '🚀 Production Mode - Real PLONK predictions with 32 samples'}
+
+        **Response Format:**
+        - Latitude/Longitude coordinates
+        - Uncertainty estimation (±km radius)
+        - Number of samples used (32 for production)
+        - Prediction confidence metrics
+
+        **Rate Limits:** Standard Hugging Face Spaces limits apply
+
+        **CORS:** Enabled for web integration
+        """)
+
+    with gr.Tab("ℹ️ About"):
+        gr.Markdown(f"""
+        ## About PLONK
+
+        PLONK is a generative approach to global visual geolocation that uses diffusion models to predict where images were taken.
+
+        **Paper:** [Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation](https://arxiv.org/abs/2412.06781)
+
+        **Authors:** Nicolas Dufour, David Picard, Vicky Kalogeiton, Loic Landrieu
+
+        **Original Code:** https://github.com/nicolas-dufour/plonk
+
+        ### Current Deployment
+        - **Mode:** {'Mock Testing' if MOCK_MODE else 'Production'}
+        - **Model:** {'Simulated predictions for API testing' if MOCK_MODE else 'Real PLONK model inference'}
+        - **Response Format:** Structured JSON + formatted text
+        - **API:** Fully functional REST endpoints
+
+        ### Production Deployment
+        This Space is running with the real PLONK model using:
+        - **Model:** nicolas-dufour/PLONK_YFCC
+        - **Dataset:** YFCC-100M
+        - **Inference:** CFG=2.0, 32 samples, 32 timesteps for high-quality predictions
+        - **Uncertainty:** Statistical analysis across 32 predictions for reliability estimation
+
+        ### Available Models
+        - `nicolas-dufour/PLONK_YFCC` - YFCC-100M dataset
+        - `nicolas-dufour/PLONK_iNaturalist` - iNaturalist dataset
+        - `nicolas-dufour/PLONK_OSV_5M` - OpenStreetView-5M dataset
+        """)
+
+    # Event handlers
+    predict_btn.click(
+        fn=predict_location,
+        inputs=[image_input],
+        outputs=[output_text],
+        api_name="predict"  # This enables API access at /api/predict
+    )
+
+    # Hidden API function for JSON responses
+    predict_json = gr.Interface(
+        fn=predict_location_json,
+        inputs=gr.Image(type="pil"),
+        outputs=gr.JSON(),
+        api_name="predict_json"  # Available at /api/predict_json
+    )
+
+    # Add examples if available
+    try:
+        examples = [
+            ["demo/examples/condor.jpg"],
+            ["demo/examples/Kilimanjaro.jpg"],
+            ["demo/examples/pigeon.png"]
+        ]
+        gr.Examples(
+            examples=examples,
+            inputs=image_input,
+            outputs=output_text,
+            fn=predict_location,
+            cache_examples=True
+        )
+    except Exception:
+        pass  # Examples not available, skip
 
 if __name__ == "__main__":
+    # For local testing
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        show_api=True
+    )
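The `uncertainty_km` value above multiplies the combined lat/lon standard deviation by 111.32 km per degree, which the code itself flags as a rough conversion: a degree of longitude spans fewer kilometers away from the equator. As a point of comparison only (not what this commit ships), here is a hedged sketch that instead measures spread as the mean great-circle distance from each sample to the mean coordinate; `uncertainty_radius_km` and `haversine_km` are illustrative helpers, not functions in app.py.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between (lat, lon) points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = np.radians(lat1), np.radians(lat2)
    a = (np.sin((p2 - p1) / 2) ** 2
         + np.cos(p1) * np.cos(p2) * np.sin(np.radians(lon2 - lon1) / 2) ** 2)
    return 2 * r * np.arcsin(np.sqrt(a))

def uncertainty_radius_km(predictions):
    """Mean distance from each sampled point to the sample mean.

    `predictions` is the (32, 2) lat/lon array built in real_plonk_prediction.
    """
    mean_lat, mean_lon = predictions.mean(axis=0)
    dists = haversine_km(predictions[:, 0], predictions[:, 1], mean_lat, mean_lon)
    return float(dists.mean())
```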
requirements_hf_spaces.txt
ADDED

@@ -0,0 +1,13 @@
+gradio>=4.0.0
+pillow>=8.0.0
+numpy>=1.21.0
+torch>=1.9.0
+torchvision>=0.10.0
+transformers>=4.20.0
+accelerate>=0.20.0
+diffusers>=0.21.0
+einops>=0.6.0
+scipy>=1.7.0
+scikit-learn>=1.0.0
+torchdiffeq
+diff-plonk