title: KYB YOLO-E
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: other
short_description: YOLO-E European document detection with quality metrics
HF YOLO-E European Document Detection
Enhanced Hugging Face Space for European Identity Document Detection
This Hugging Face Space provides a production-ready API for detecting and classifying European identity documents (passports, driver's licenses, identity cards) with advanced ML-based orientation detection and video processing capabilities.
Table of Contents
- Features
- Quick Start
- API Endpoints
- Document Types Supported
- Orientation Classification
- Quality Metrics
- Performance
- Configuration
- Deployment
- Example Usage
- Error Handling
- Test Results Summary
- Security & Privacy
- Monitoring
- Future Enhancements
Features
European Document Detection
- Document Types: Identity cards, passports, driver's licenses, residence permits
- Front/Back Classification: ML-based orientation detection using multiple methods
- Precise Coordinates: Accurate bounding box coordinates for all detections
- Quality Assessment: Comprehensive quality metrics (sharpness, glare, coverage, brightness, contrast)
Video Processing
- Frame Extraction: Intelligent frame sampling at configurable FPS
- Quality-Based Selection: Automatic selection of best quality frames
- Multi-Frame Analysis: Track documents across video frames
- Performance Optimized: Efficient processing for real-time applications
Technical Capabilities
- YOLO-E Integration: Latest Ultralytics YOLO-E for object detection
- ML-Based Classification: Advanced orientation detection using multiple algorithms
- European Focus: Optimized for European document standards and formats
- API Compatible: RESTful API with standardized response format
Quick Start
Health Check
curl https://algoryn-yolo-e-idcard.hf.space/health
Image Detection
curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect" \
-F "file=@your_image.jpg" \
-F "min_confidence=0.5" \
-F "return_crops=false"
Video Detection
curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect-video" \
-F "file=@your_video.mp4" \
-F "min_confidence=0.5" \
-F "sample_fps=2.0" \
-F "max_detections=10" \
-F "return_crops=false"
API Endpoints
POST /v1/id/detect
Detect European identity documents in uploaded images.
Parameters:
- file (required): Image file (JPEG, PNG, etc.)
- min_confidence (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
- return_crops (optional): Return cropped document images (default: false)
Response:
{
"request_id": "uuid",
"media_type": "image",
"processing_time": 1.23,
"detections": [
{
"document_type": "identity_card",
"orientation": "front",
"confidence": 0.95,
"bounding_box": {
"x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9
},
"quality": {
"sharpness": 0.85,
"glare_score": 0.1,
"coverage": 0.75,
"brightness": 0.6,
"contrast": 0.7
},
"tracking": {
"track_id": null,
"is_tracked": false
},
"metadata": {
"class_name": "id_front",
"original_coordinates": [100, 200, 800, 900],
"mask_used": false
}
}
]
}
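The sample response carries both a bounding_box with values between 0 and 1 and metadata.original_coordinates in pixels. Assuming bounding_box is normalized by image width and height (this README does not say so explicitly), a client could recover pixel coordinates roughly as follows. The helper name to_pixel_box is hypothetical and not part of the API.

```python
from typing import Mapping, Tuple


def to_pixel_box(box: Mapping[str, float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a box with 0..1 coordinates to integer pixel coordinates.

    Assumption: x values are fractions of the image width and y values
    are fractions of the image height.
    """
    def scale(value: float, size: int) -> int:
        # Round to the nearest pixel and clamp to the image bounds.
        return max(0, min(round(value * size), size))

    return (
        scale(box["x1"], width),
        scale(box["y1"], height),
        scale(box["x2"], width),
        scale(box["y2"], height),
    )


sample_box = {"x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9}
print(to_pixel_box(sample_box, width=1000, height=1000))  # (100, 200, 800, 900)
```

For a 1000x1000 image this reproduces the original_coordinates shown in the sample response, which is consistent with the normalization assumption but does not prove it.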
POST /v1/id/detect-video
Detect European identity documents in uploaded videos with quality-based frame selection.
Parameters:
- file (required): Video file (MP4, AVI, etc.)
- min_confidence (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
- sample_fps (optional): Video sampling rate (0.1-30.0, default: 2.0)
- return_crops (optional): Return cropped document images (default: false)
- max_detections (optional): Maximum detections to return (1-100, default: 10)
Response:
{
"request_id": "uuid",
"media_type": "video",
"processing_time": 3.45,
"frame_count": 24,
"detections": [
// Same structure as image detection
]
}
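To illustrate what the sample_fps parameter implies, here is a minimal sketch of picking frame indices at a fixed sampling rate. It is an assumption about how such sampling can work in general, not the Space's actual implementation, and sample_frame_indices is a hypothetical name.

```python
from typing import List


def sample_frame_indices(total_frames: int, video_fps: float, sample_fps: float) -> List[int]:
    """Return indices of the frames to analyze when sampling at sample_fps."""
    if total_frames <= 0 or video_fps <= 0 or sample_fps <= 0:
        return []
    # Distance between sampled frames; never sample faster than the video itself.
    step = max(video_fps / sample_fps, 1.0)
    indices = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while int(i * step) < total_frames:
        indices.append(int(i * step))
        i += 1
    return indices


# A 2-second clip at 30 fps sampled at 2 fps yields every 15th frame.
print(sample_frame_indices(total_frames=60, video_fps=30.0, sample_fps=2.0))  # [0, 15, 30, 45]
```

Lower sample_fps values mean fewer frames to run detection on, which trades recall for processing time.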
GET /health
Health check endpoint.
Response:
{
"status": "healthy",
"version": "2.0.0"
}
Document Types Supported
| Type | Description | Front/Back Detection |
|---|---|---|
| identity_card | European identity cards | ✅ |
| passport | Passports | ✅ |
| driver_license | Driver's licenses | ✅ |
| residence_permit | Residence permits | ✅ |
Orientation Classification
The system uses multiple methods for reliable front/back classification:
- Class-Based: Uses detected class (id_front, id_back, etc.)
- Portrait Detection: Detects faces/portraits using YOLO-E
- Heuristic Analysis: Text density, symmetry, and edge pattern analysis
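The README does not say how the three methods are combined. One plausible arrangement is a fallback chain that tries them in the listed order and stops at the first confident answer; the sketch below shows that idea only. The fields has_portrait and text_density, the 0.5 threshold, and every function name are hypothetical placeholders.

```python
from typing import Callable, Optional, Sequence

# Each classifier returns "front", "back", or None when it cannot decide.
Classifier = Callable[[dict], Optional[str]]


def classify_by_class_name(detection: dict) -> Optional[str]:
    """Class-based: read the side from a class name such as id_front or id_back."""
    name = detection.get("class_name", "")
    if name.endswith("_front"):
        return "front"
    if name.endswith("_back"):
        return "back"
    return None


def classify_by_portrait(detection: dict) -> Optional[str]:
    """Portrait-based: a detected face suggests the front side (assumption)."""
    return "front" if detection.get("has_portrait") else None


def classify_by_heuristics(detection: dict) -> Optional[str]:
    """Heuristic: hypothetical rule that dense text suggests the back side."""
    density = detection.get("text_density")
    if density is None:
        return None
    return "back" if density > 0.5 else "front"


DEFAULT_CHAIN: Sequence[Classifier] = (
    classify_by_class_name,
    classify_by_portrait,
    classify_by_heuristics,
)


def classify_orientation(detection: dict, chain: Sequence[Classifier] = DEFAULT_CHAIN) -> str:
    """Return the first decisive answer in the chain, or "unknown"."""
    for classifier in chain:
        result = classifier(detection)
        if result is not None:
            return result
    return "unknown"


print(classify_orientation({"class_name": "id_back"}))
print(classify_orientation({"class_name": "passport", "has_portrait": True}))
```

A voting or weighted scheme would be an equally reasonable design; the chain is shown because it is the simplest to reason about.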
Quality Metrics
Each detection includes comprehensive quality assessment:
- Sharpness: Image clarity using Laplacian variance
- Glare Score: Bright pixel concentration analysis
- Coverage: Document area coverage within bounding box
- Brightness: Overall image brightness
- Contrast: Image contrast using standard deviation
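As a rough illustration of four of these metrics, the following dependency-free sketch computes them on a small grayscale image given as nested lists. The 4-neighbour Laplacian kernel, the glare threshold of 240, and the division by 255 are assumptions; the Space may use different kernels, thresholds, or normalization, and its sharpness value is clearly scaled to 0..1 whereas the raw variance here is not. Coverage is omitted because it needs a document mask.

```python
from statistics import mean, pstdev, pvariance
from typing import Dict, List

Gray = List[List[int]]  # rows of grayscale pixel values in 0..255


def laplacian_variance(img: Gray) -> float:
    """Variance of the 4-neighbour Laplacian response over interior pixels."""
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, len(img) - 1)
        for x in range(1, len(img[0]) - 1)
    ]
    return pvariance(responses) if responses else 0.0


def quality_metrics(img: Gray, glare_threshold: int = 240) -> Dict[str, float]:
    """Compute unnormalized sharpness plus glare, brightness, and contrast in 0..1."""
    pixels = [p for row in img for p in row]
    return {
        "sharpness_raw": laplacian_variance(img),  # higher means more edge detail
        "glare_score": sum(p >= glare_threshold for p in pixels) / len(pixels),
        "brightness": mean(pixels) / 255,
        "contrast": pstdev(pixels) / 255,
    }


flat = [[100] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
print(quality_metrics(flat))
print(quality_metrics(checker))
```

The flat image has no edges, so its raw sharpness is zero, while the checkerboard scores high on sharpness and, because half its pixels are saturated, on glare as well.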
Performance
| Metric | Target | Notes |
|---|---|---|
| Image Processing | <1.5s | Single image detection |
| Video Processing | <3.0s | Frame extraction and selection |
| Memory Usage | <3GB | YOLO-E + orientation classifier |
| Reliability | 99.5% | With fallback mechanisms |
Configuration
Class Mapping
The system uses config/labels.json for class mapping:
{
"classes": {
"0": "id_front",
"1": "id_back",
"2": "driver_license",
"3": "passport",
"4": "mrz"
}
}
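A minimal sketch of how such a mapping could be loaded and turned into the document_type and orientation fields seen in the API response. Only the id_front to identity_card/front pairing is visible in the sample response; the other entries in CLASS_TO_DOCUMENT, and the function names, are assumptions for illustration.

```python
import json
from typing import Dict, Optional, Tuple

# Inline copy of config/labels.json so the example is self-contained.
LABELS_JSON = """
{
  "classes": {
    "0": "id_front",
    "1": "id_back",
    "2": "driver_license",
    "3": "passport",
    "4": "mrz"
  }
}
"""

# class name -> (document_type, orientation); entries other than id_front are assumed.
CLASS_TO_DOCUMENT: Dict[str, Tuple[str, Optional[str]]] = {
    "id_front": ("identity_card", "front"),
    "id_back": ("identity_card", "back"),
    "driver_license": ("driver_license", None),
    "passport": ("passport", None),
}


def load_class_names(raw: str) -> Dict[int, str]:
    """Parse the labels file and convert its string keys to integer class ids."""
    return {int(class_id): name for class_id, name in json.loads(raw)["classes"].items()}


def describe(class_id: int, class_names: Dict[int, str]) -> Tuple[str, Optional[str]]:
    """Map a class id to (document_type, orientation); unknown names pass through."""
    name = class_names[class_id]
    return CLASS_TO_DOCUMENT.get(name, (name, None))


names = load_class_names(LABELS_JSON)
print(describe(0, names))  # ('identity_card', 'front')
print(describe(4, names))  # ('mrz', None)
```

An orientation of None here means the class name alone does not determine the side, so another classification method would have to decide.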
Model Weights
- YOLO-E: yolo11n.pt (nano variant for faster inference)
- Orientation Classifier: Integrated ML-based classification
Deployment
Hugging Face Spaces
- Upload the code to a new Hugging Face Space
- Set the hardware to GPU for optimal performance
- Configure environment variables if needed
- Deploy and test the endpoints
GPU Docker Runtime
- Ensure host has recent NVIDIA driver installed
- Install NVIDIA Container Toolkit on the host
- Run the container with GPU access enabled:
# Build image
docker build -t kybtech-yolo-e-idcard:gpu .
# Run with all GPUs and necessary capabilities
docker run --rm \
--gpus all \
--ipc=host \
-p 7860:7860 \
kybtech-yolo-e-idcard:gpu
Notes:
- The Dockerfile uses pytorch/pytorch:2.7.0-cuda12.6-cudnn9-runtime as base (CUDA included).
- The app auto-selects GPU if available and performs a warm-up pass.
- Verify GPU is visible inside the container with python -c "import torch; print(torch.cuda.is_available())".
Local Development
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
Example Usage
Python Client
import requests

# Image detection
with open('document.jpg', 'rb') as f:
    response = requests.post(
        'https://algoryn-yolo-e-idcard.hf.space/v1/id/detect',
        files={'file': f},
        data={'min_confidence': 0.5},
        timeout=60,
    )
# Raise on 4xx/5xx instead of trying to read detections from an error body
response.raise_for_status()
result = response.json()
for detection in result['detections']:
    print(f"Found {detection['document_type']} ({detection['orientation']})")
    print(f"Confidence: {detection['confidence']:.2f}")
    print(f"Sharpness: {detection['quality']['sharpness']:.2f}")
JavaScript Client
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('min_confidence', '0.5');

fetch('https://algoryn-yolo-e-idcard.hf.space/v1/id/detect', {
  method: 'POST',
  body: formData
})
  .then(response => response.json())
  .then(data => {
    data.detections.forEach(detection => {
      console.log(`Found ${detection.document_type} (${detection.orientation})`);
    });
  });
Error Handling
The API returns appropriate HTTP status codes:
- 200: Success
- 400: Bad request (invalid parameters)
- 503: Service unavailable (models not loaded)
- 500: Internal server error
Error responses include detailed error messages:
{
"detail": "Detection failed: Invalid image format"
}
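On the client side, these status codes suggest treating 503 as temporary (models still loading) and everything else as final. The sketch below shows one way to do that with exponential backoff. The retry policy, the DetectionError class, and call_with_retry are assumptions made for illustration; the request itself is injected as a callable so the logic can be exercised without network access.

```python
import time
from typing import Callable, Tuple

# A sender performs one HTTP request and returns (status_code, parsed_json_body).
Sender = Callable[[], Tuple[int, dict]]


class DetectionError(Exception):
    """Raised when the API reports a failure that is not worth retrying."""


def call_with_retry(
    send: Sender,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Return the body of a 200 response, retrying only on 503."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status == 503 and attempt < max_attempts:
            # Models may still be loading: wait 1x, 2x, 4x ... base_delay, then retry.
            sleep(base_delay * 2 ** (attempt - 1))
            continue
        # 400, 500, and a final 503 are surfaced with the API's "detail" message.
        raise DetectionError(f"HTTP {status}: {body.get('detail', 'no detail provided')}")
    raise DetectionError("max_attempts must be at least 1")


# Simulated exchange: the first call hits a cold Space, the second succeeds.
replies = iter([(503, {"detail": "Models not loaded"}), (200, {"detections": []})])
print(call_with_retry(lambda: next(replies), sleep=lambda seconds: None))
```

With the requests library, a sender could be a small function that posts the file and returns (response.status_code, response.json()).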
Test Results Summary
- ✅ Health Check: Space is healthy and running version 2.0.0
- ✅ Image Detection: Successfully detected identity cards in test images
- ✅ Video Detection: Processed 8 frames and found 10 detections with tracking
- ✅ Performance: ~1.1s for images, ~3.3s for videos
- ✅ Quality Metrics: Comprehensive quality assessment working
Security & Privacy
- No Data Storage: Images/videos are processed in memory only
- Temporary Files: Video processing uses temporary files that are immediately cleaned up
- No Logging: Sensitive document data is not logged
- API Authentication: Configure authentication as needed for your deployment
Monitoring
Monitor the service using:
- Health Check: /health endpoint for service status
- Processing Time: Included in all responses
- Error Rates: Monitor HTTP status codes
- Performance: Track response times and memory usage
Future Enhancements
- Real-time Processing: Optimize for live video streams
- Multi-country Support: Expand beyond European documents
- Advanced Tracking: Implement more sophisticated video tracking
- Custom Models: Support for custom document types
This enhanced HF YOLO-E deployment provides production-ready European document detection with advanced ML capabilities and video processing support.