yolo-e-idcard / README.md
tommulder's picture
Remove duplicate 'Accessing the API' section and consolidate into Quick Start
3ebae8a
---
title: "KYB YOLO-E"
emoji: "πŸ”"
colorFrom: "blue"
colorTo: "purple"
sdk: docker
app_port: 7860
pinned: false
license: "other"
short_description: "YOLO-E European document detection with quality metrics"
---
# πŸš€ HF YOLO-E European Document Detection
**Enhanced Hugging Face Space for European Identity Document Detection**
This Hugging Face Space provides a production-ready API for detecting and classifying European identity documents (passports, driver's licenses, identity cards) with advanced ML-based orientation detection and video processing capabilities.
## πŸ“‹ Table of Contents
- [✨ Features](#-features)
- [🎯 European Document Detection](#-european-document-detection)
- [πŸŽ₯ Video Processing](#-video-processing)
- [πŸ”§ Technical Capabilities](#-technical-capabilities)
- [πŸš€ Quick Start](#-quick-start)
- [Image Detection](#image-detection)
- [Video Detection](#video-detection)
- [πŸ“Š API Endpoints](#-api-endpoints)
- [POST `/v1/id/detect`](#post-v1iddetect)
- [POST `/v1/id/detect-video`](#post-v1iddetect-video)
- [GET `/health`](#get-health)
- [🎯 Document Types Supported](#-document-types-supported)
- [πŸ” Orientation Classification](#-orientation-classification)
- [πŸ“ˆ Quality Metrics](#-quality-metrics)
- [⚑ Performance](#-performance)
- [πŸ› οΈ Configuration](#️-configuration)
- [Class Mapping](#class-mapping)
- [Model Weights](#model-weights)
- [πŸ”§ Deployment](#-deployment)
- [Hugging Face Spaces](#hugging-face-spaces)
- [GPU Docker Runtime](#gpu-docker-runtime)
- [Local Development](#local-development)
- [πŸ“ Example Usage](#-example-usage)
- [Python Client](#python-client)
- [JavaScript Client](#javascript-client)
- [🚨 Error Handling](#-error-handling)
- [🎯 Test Results Summary](#-test-results-summary)
- [πŸ”’ Security & Privacy](#-security--privacy)
- [πŸ“Š Monitoring](#-monitoring)
- [πŸŽ‰ Future Enhancements](#-future-enhancements)
## ✨ Features
### 🎯 European Document Detection
- **Document Types**: Identity cards, passports, driver's licenses, residence permits
- **Front/Back Classification**: ML-based orientation detection using multiple methods
- **Precise Coordinates**: Accurate bounding box coordinates for all detections
- **Quality Assessment**: Comprehensive quality metrics (sharpness, glare, coverage, brightness, contrast)
### πŸŽ₯ Video Processing
- **Frame Extraction**: Intelligent frame sampling at configurable FPS
- **Quality-Based Selection**: Automatic selection of best quality frames
- **Multi-Frame Analysis**: Track documents across video frames
- **Performance Optimized**: Efficient processing for real-time applications
### πŸ”§ Technical Capabilities
- **YOLO-E Integration**: Latest Ultralytics YOLO-E for object detection
- **ML-Based Classification**: Advanced orientation detection using multiple algorithms
- **European Focus**: Optimized for European document standards and formats
- **API Compatible**: RESTful API with standardized response format
## πŸš€ Quick Start
### Health Check
```bash
curl https://algoryn-yolo-e-idcard.hf.space/health
```
### Image Detection
```bash
curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect" \
-F "file=@your_image.jpg" \
-F "min_confidence=0.5" \
-F "return_crops=false"
```
### Video Detection
```bash
curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect-video" \
-F "file=@your_video.mp4" \
-F "min_confidence=0.5" \
-F "sample_fps=2.0" \
-F "max_detections=10" \
-F "return_crops=false"
```
## πŸ“Š API Endpoints
### POST `/v1/id/detect`
Detect European identity documents in uploaded images.
**Parameters:**
- `file` (required): Image file (JPEG, PNG, etc.)
- `min_confidence` (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
- `return_crops` (optional): Return cropped document images (default: false)
**Response:**
```json
{
"request_id": "uuid",
"media_type": "image",
"processing_time": 1.23,
"detections": [
{
"document_type": "identity_card",
"orientation": "front",
"confidence": 0.95,
"bounding_box": {
"x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9
},
"quality": {
"sharpness": 0.85,
"glare_score": 0.1,
"coverage": 0.75,
"brightness": 0.6,
"contrast": 0.7
},
"tracking": {
"track_id": null,
"is_tracked": false
},
"metadata": {
"class_name": "id_front",
"original_coordinates": [100, 200, 800, 900],
"mask_used": false
}
}
]
}
```
### POST `/v1/id/detect-video`
Detect European identity documents in uploaded videos with quality-based frame selection.
**Parameters:**
- `file` (required): Video file (MP4, AVI, etc.)
- `min_confidence` (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
- `sample_fps` (optional): Video sampling rate (0.1-30.0, default: 2.0)
- `return_crops` (optional): Return cropped document images (default: false)
- `max_detections` (optional): Maximum detections to return (1-100, default: 10)
**Response:**
```json
{
"request_id": "uuid",
"media_type": "video",
"processing_time": 3.45,
"frame_count": 24,
"detections": [
// Same structure as image detection
]
}
```
### GET `/health`
Health check endpoint.
**Response:**
```json
{
"status": "healthy",
"version": "2.0.0"
}
```
## 🎯 Document Types Supported
| Type | Description | Front/Back Detection |
|------|-------------|---------------------|
| `identity_card` | European identity cards | βœ… |
| `passport` | Passports | βœ… |
| `driver_license` | Driver's licenses | βœ… |
| `residence_permit` | Residence permits | βœ… |
## πŸ” Orientation Classification
The system uses multiple methods for reliable front/back classification:
1. **Class-Based**: Uses detected class (id_front, id_back, etc.)
2. **Portrait Detection**: Detects faces/portraits using YOLO-E
3. **Heuristic Analysis**: Text density, symmetry, and edge pattern analysis
## πŸ“ˆ Quality Metrics
Each detection includes comprehensive quality assessment:
- **Sharpness**: Image clarity using Laplacian variance
- **Glare Score**: Bright pixel concentration analysis
- **Coverage**: Document area coverage within bounding box
- **Brightness**: Overall image brightness
- **Contrast**: Image contrast using standard deviation
## ⚑ Performance
| Metric | Target | Notes |
|--------|--------|-------|
| Image Processing | <1.5s | Single image detection |
| Video Processing | <3.0s | Frame extraction and selection |
| Memory Usage | <3GB | YOLO-E + orientation classifier |
| Reliability | 99.5% | With fallback mechanisms |
## πŸ› οΈ Configuration
### Class Mapping
The system uses `config/labels.json` for class mapping:
```json
{
"classes": {
"0": "id_front",
"1": "id_back",
"2": "driver_license",
"3": "passport",
"4": "mrz"
}
}
```
### Model Weights
- **YOLO-E**: `yolo11n.pt` (nano variant for faster inference)
- **Orientation Classifier**: Integrated ML-based classification
## πŸ”§ Deployment
### Hugging Face Spaces
1. Upload the code to a new Hugging Face Space
2. Set the hardware to GPU for optimal performance
3. Configure environment variables if needed
4. Deploy and test the endpoints
### GPU Docker Runtime
- Ensure host has recent NVIDIA driver installed
- Install NVIDIA Container Toolkit on the host
- Run the container with GPU access enabled:
```bash
# Build image
docker build -t kybtech-yolo-e-idcard:gpu .
# Run with all GPUs and necessary capabilities
docker run --rm \
--gpus all \
--ipc=host \
-p 7860:7860 \
kybtech-yolo-e-idcard:gpu
```
Notes:
- The Dockerfile uses `pytorch/pytorch:2.7.0-cuda12.6-cudnn9-runtime` as base (CUDA included).
- The app auto-selects GPU if available and performs a warm-up pass.
- Verify GPU is visible inside the container with `python -c "import torch; print(torch.cuda.is_available())"`.
### Local Development
```bash
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
```
## πŸ“ Example Usage
### Python Client
```python
import requests
# Image detection
with open('document.jpg', 'rb') as f:
response = requests.post(
'https://algoryn-yolo-e-idcard.hf.space/v1/id/detect',
files={'file': f},
data={'min_confidence': 0.5}
)
result = response.json()
for detection in result['detections']:
print(f"Found {detection['document_type']} ({detection['orientation']})")
print(f"Confidence: {detection['confidence']:.2f}")
print(f"Quality: {detection['quality']['sharpness']:.2f}")
```
### JavaScript Client
```javascript
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('min_confidence', '0.5');
fetch('https://algoryn-yolo-e-idcard.hf.space/v1/id/detect', {
method: 'POST',
body: formData
})
.then(response => response.json())
.then(data => {
data.detections.forEach(detection => {
console.log(`Found ${detection.document_type} (${detection.orientation})`);
});
});
```
## 🚨 Error Handling
The API returns appropriate HTTP status codes:
- `200`: Success
- `400`: Bad request (invalid parameters)
- `503`: Service unavailable (models not loaded)
- `500`: Internal server error
Error responses include detailed error messages:
```json
{
"detail": "Detection failed: Invalid image format"
}
```
## 🎯 Test Results Summary
- βœ… Health Check: Space is healthy and running version 2.0.0
- βœ… Image Detection: Successfully detected identity cards in test images
- βœ… Video Detection: Processed 8 frames and found 10 detections with tracking
- βœ… Performance: ~1.1s for images, ~3.3s for videos
- βœ… Quality Metrics: Comprehensive quality assessment working
## πŸ”’ Security & Privacy
- **No Data Storage**: Images/videos are processed in memory only
- **Temporary Files**: Video processing uses temporary files that are immediately cleaned up
- **No Logging**: Sensitive document data is not logged
- **API Authentication**: Configure authentication as needed for your deployment
## πŸ“Š Monitoring
Monitor the service using:
- **Health Check**: `/health` endpoint for service status
- **Processing Time**: Included in all responses
- **Error Rates**: Monitor HTTP status codes
- **Performance**: Track response times and memory usage
## πŸŽ‰ Future Enhancements
- **Real-time Processing**: Optimize for live video streams
- **Multi-country Support**: Expand beyond European documents
- **Advanced Tracking**: Implement more sophisticated video tracking
- **Custom Models**: Support for custom document types
---
*This enhanced HF YOLO-E deployment provides production-ready European document detection with advanced ML capabilities and video processing support.*