Spaces:
Running
on
T4
Running
on
T4
| title: "KYB YOLO-E" | |
| emoji: "π" | |
| colorFrom: "blue" | |
| colorTo: "purple" | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| license: "other" | |
| short_description: "YOLO-E European document detection with quality metrics" | |
| # π HF YOLO-E European Document Detection | |
| **Enhanced Hugging Face Space for European Identity Document Detection** | |
| This Hugging Face Space provides a production-ready API for detecting and classifying European identity documents (passports, driver's licenses, identity cards) with advanced ML-based orientation detection and video processing capabilities. | |
| ## π Table of Contents | |
| - [β¨ Features](#-features) | |
| - [π― European Document Detection](#-european-document-detection) | |
| - [π₯ Video Processing](#-video-processing) | |
| - [π§ Technical Capabilities](#-technical-capabilities) | |
| - [π Quick Start](#-quick-start) | |
| - [Image Detection](#image-detection) | |
| - [Video Detection](#video-detection) | |
| - [π API Endpoints](#-api-endpoints) | |
| - [POST `/v1/id/detect`](#post-v1iddetect) | |
| - [POST `/v1/id/detect-video`](#post-v1iddetect-video) | |
| - [GET `/health`](#get-health) | |
| - [π― Document Types Supported](#-document-types-supported) | |
| - [π Orientation Classification](#-orientation-classification) | |
| - [π Quality Metrics](#-quality-metrics) | |
| - [β‘ Performance](#-performance) | |
| - [π οΈ Configuration](#οΈ-configuration) | |
| - [Class Mapping](#class-mapping) | |
| - [Model Weights](#model-weights) | |
| - [π§ Deployment](#-deployment) | |
| - [Hugging Face Spaces](#hugging-face-spaces) | |
| - [GPU Docker Runtime](#gpu-docker-runtime) | |
| - [Local Development](#local-development) | |
| - [π Example Usage](#-example-usage) | |
| - [Python Client](#python-client) | |
| - [JavaScript Client](#javascript-client) | |
| - [π¨ Error Handling](#-error-handling) | |
| - [π― Test Results Summary](#-test-results-summary) | |
| - [π Security & Privacy](#-security--privacy) | |
| - [π Monitoring](#-monitoring) | |
| - [π Future Enhancements](#-future-enhancements) | |
| ## β¨ Features | |
| ### π― European Document Detection | |
| - **Document Types**: Identity cards, passports, driver's licenses, residence permits | |
| - **Front/Back Classification**: ML-based orientation detection using multiple methods | |
| - **Precise Coordinates**: Accurate bounding box coordinates for all detections | |
| - **Quality Assessment**: Comprehensive quality metrics (sharpness, glare, coverage, brightness, contrast) | |
| ### π₯ Video Processing | |
| - **Frame Extraction**: Intelligent frame sampling at configurable FPS | |
| - **Quality-Based Selection**: Automatic selection of best quality frames | |
| - **Multi-Frame Analysis**: Track documents across video frames | |
| - **Performance Optimized**: Efficient processing for real-time applications | |
| ### π§ Technical Capabilities | |
| - **YOLO-E Integration**: Latest Ultralytics YOLO-E for object detection | |
| - **ML-Based Classification**: Advanced orientation detection using multiple algorithms | |
| - **European Focus**: Optimized for European document standards and formats | |
| - **API Compatible**: RESTful API with standardized response format | |
| ## π Quick Start | |
| ### Health Check | |
| ```bash | |
| curl https://algoryn-yolo-e-idcard.hf.space/health | |
| ``` | |
| ### Image Detection | |
| ```bash | |
| curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect" \ | |
| -F "file=@your_image.jpg" \ | |
| -F "min_confidence=0.5" \ | |
| -F "return_crops=false" | |
| ``` | |
| ### Video Detection | |
| ```bash | |
| curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect-video" \ | |
| -F "file=@your_video.mp4" \ | |
| -F "min_confidence=0.5" \ | |
| -F "sample_fps=2.0" \ | |
| -F "max_detections=10" \ | |
| -F "return_crops=false" | |
| ``` | |
| ## π API Endpoints | |
| ### POST `/v1/id/detect` | |
| Detect European identity documents in uploaded images. | |
| **Parameters:** | |
| - `file` (required): Image file (JPEG, PNG, etc.) | |
| - `min_confidence` (optional): Minimum confidence threshold (0.0-1.0, default: 0.25) | |
| - `return_crops` (optional): Return cropped document images (default: false) | |
| **Response:** | |
| ```json | |
| { | |
| "request_id": "uuid", | |
| "media_type": "image", | |
| "processing_time": 1.23, | |
| "detections": [ | |
| { | |
| "document_type": "identity_card", | |
| "orientation": "front", | |
| "confidence": 0.95, | |
| "bounding_box": { | |
| "x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9 | |
| }, | |
| "quality": { | |
| "sharpness": 0.85, | |
| "glare_score": 0.1, | |
| "coverage": 0.75, | |
| "brightness": 0.6, | |
| "contrast": 0.7 | |
| }, | |
| "tracking": { | |
| "track_id": null, | |
| "is_tracked": false | |
| }, | |
| "metadata": { | |
| "class_name": "id_front", | |
| "original_coordinates": [100, 200, 800, 900], | |
| "mask_used": false | |
| } | |
| } | |
| ] | |
| } | |
| ``` | |
| ### POST `/v1/id/detect-video` | |
| Detect European identity documents in uploaded videos with quality-based frame selection. | |
| **Parameters:** | |
| - `file` (required): Video file (MP4, AVI, etc.) | |
| - `min_confidence` (optional): Minimum confidence threshold (0.0-1.0, default: 0.25) | |
| - `sample_fps` (optional): Video sampling rate (0.1-30.0, default: 2.0) | |
| - `return_crops` (optional): Return cropped document images (default: false) | |
| - `max_detections` (optional): Maximum detections to return (1-100, default: 10) | |
| **Response:** | |
| ```json | |
| { | |
| "request_id": "uuid", | |
| "media_type": "video", | |
| "processing_time": 3.45, | |
| "frame_count": 24, | |
| "detections": [ | |
| // Same structure as image detection | |
| ] | |
| } | |
| ``` | |
| ### GET `/health` | |
| Health check endpoint. | |
| **Response:** | |
| ```json | |
| { | |
| "status": "healthy", | |
| "version": "2.0.0" | |
| } | |
| ``` | |
| ## π― Document Types Supported | |
| | Type | Description | Front/Back Detection | | |
| |------|-------------|---------------------| | |
| | `identity_card` | European identity cards | β | | |
| | `passport` | Passports | β | | |
| | `driver_license` | Driver's licenses | β | | |
| | `residence_permit` | Residence permits | β | | |
| ## π Orientation Classification | |
| The system uses multiple methods for reliable front/back classification: | |
| 1. **Class-Based**: Uses detected class (id_front, id_back, etc.) | |
| 2. **Portrait Detection**: Detects faces/portraits using YOLO-E | |
| 3. **Heuristic Analysis**: Text density, symmetry, and edge pattern analysis | |
| ## π Quality Metrics | |
| Each detection includes comprehensive quality assessment: | |
| - **Sharpness**: Image clarity using Laplacian variance | |
| - **Glare Score**: Bright pixel concentration analysis | |
| - **Coverage**: Document area coverage within bounding box | |
| - **Brightness**: Overall image brightness | |
| - **Contrast**: Image contrast using standard deviation | |
| ## β‘ Performance | |
| | Metric | Target | Notes | | |
| |--------|--------|-------| | |
| | Image Processing | <1.5s | Single image detection | | |
| | Video Processing | <3.0s | Frame extraction and selection | | |
| | Memory Usage | <3GB | YOLO-E + orientation classifier | | |
| | Reliability | 99.5% | With fallback mechanisms | | |
| ## π οΈ Configuration | |
| ### Class Mapping | |
| The system uses `config/labels.json` for class mapping: | |
| ```json | |
| { | |
| "classes": { | |
| "0": "id_front", | |
| "1": "id_back", | |
| "2": "driver_license", | |
| "3": "passport", | |
| "4": "mrz" | |
| } | |
| } | |
| ``` | |
| ### Model Weights | |
| - **YOLO-E**: `yolo11n.pt` (nano variant for faster inference) | |
| - **Orientation Classifier**: Integrated ML-based classification | |
| ## π§ Deployment | |
| ### Hugging Face Spaces | |
| 1. Upload the code to a new Hugging Face Space | |
| 2. Set the hardware to GPU for optimal performance | |
| 3. Configure environment variables if needed | |
| 4. Deploy and test the endpoints | |
| ### GPU Docker Runtime | |
| - Ensure host has recent NVIDIA driver installed | |
| - Install NVIDIA Container Toolkit on the host | |
| - Run the container with GPU access enabled: | |
| ```bash | |
| # Build image | |
| docker build -t kybtech-yolo-e-idcard:gpu . | |
| # Run with all GPUs and necessary capabilities | |
| docker run --rm \ | |
| --gpus all \ | |
| --ipc=host \ | |
| -p 7860:7860 \ | |
| kybtech-yolo-e-idcard:gpu | |
| ``` | |
| Notes: | |
| - The Dockerfile uses `pytorch/pytorch:2.7.0-cuda12.6-cudnn9-runtime` as base (CUDA included). | |
| - The app auto-selects GPU if available and performs a warm-up pass. | |
| - Verify GPU is visible inside the container with `python -c "import torch; print(torch.cuda.is_available())"`. | |
| ### Local Development | |
| ```bash | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Run the application | |
| python app.py | |
| ``` | |
| ## π Example Usage | |
| ### Python Client | |
| ```python | |
| import requests | |
| # Image detection | |
| with open('document.jpg', 'rb') as f: | |
| response = requests.post( | |
| 'https://algoryn-yolo-e-idcard.hf.space/v1/id/detect', | |
| files={'file': f}, | |
| data={'min_confidence': 0.5} | |
| ) | |
| result = response.json() | |
| for detection in result['detections']: | |
| print(f"Found {detection['document_type']} ({detection['orientation']})") | |
| print(f"Confidence: {detection['confidence']:.2f}") | |
| print(f"Quality: {detection['quality']['sharpness']:.2f}") | |
| ``` | |
| ### JavaScript Client | |
| ```javascript | |
| const formData = new FormData(); | |
| formData.append('file', fileInput.files[0]); | |
| formData.append('min_confidence', '0.5'); | |
| fetch('https://algoryn-yolo-e-idcard.hf.space/v1/id/detect', { | |
| method: 'POST', | |
| body: formData | |
| }) | |
| .then(response => response.json()) | |
| .then(data => { | |
| data.detections.forEach(detection => { | |
| console.log(`Found ${detection.document_type} (${detection.orientation})`); | |
| }); | |
| }); | |
| ``` | |
| ## π¨ Error Handling | |
| The API returns appropriate HTTP status codes: | |
| - `200`: Success | |
| - `400`: Bad request (invalid parameters) | |
| - `503`: Service unavailable (models not loaded) | |
| - `500`: Internal server error | |
| Error responses include detailed error messages: | |
| ```json | |
| { | |
| "detail": "Detection failed: Invalid image format" | |
| } | |
| ``` | |
| ## π― Test Results Summary | |
| - β Health Check: Space is healthy and running version 2.0.0 | |
| - β Image Detection: Successfully detected identity cards in test images | |
| - β Video Detection: Processed 8 frames and found 10 detections with tracking | |
| - β Performance: ~1.1s for images, ~3.3s for videos | |
| - β Quality Metrics: Comprehensive quality assessment working | |
| ## π Security & Privacy | |
| - **No Data Storage**: Images/videos are processed in memory only | |
| - **Temporary Files**: Video processing uses temporary files that are immediately cleaned up | |
| - **No Logging**: Sensitive document data is not logged | |
| - **API Authentication**: Configure authentication as needed for your deployment | |
| ## π Monitoring | |
| Monitor the service using: | |
| - **Health Check**: `/health` endpoint for service status | |
| - **Processing Time**: Included in all responses | |
| - **Error Rates**: Monitor HTTP status codes | |
| - **Performance**: Track response times and memory usage | |
| ## π Future Enhancements | |
| - **Real-time Processing**: Optimize for live video streams | |
| - **Multi-country Support**: Expand beyond European documents | |
| - **Advanced Tracking**: Implement more sophisticated video tracking | |
| - **Custom Models**: Support for custom document types | |
| --- | |
| *This enhanced HF YOLO-E deployment provides production-ready European document detection with advanced ML capabilities and video processing support.* | |