---
title: KYB YOLO-E
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: other
short_description: YOLO-E European document detection with quality metrics
---

🚀 HF YOLO-E European Document Detection

Enhanced Hugging Face Space for European Identity Document Detection

This Hugging Face Space exposes a REST API that detects European identity documents (identity cards, passports, driver's licenses, residence permits) in images and videos, classifies each detection as front or back, and reports quality metrics for every detection.

✨ Features

🎯 European Document Detection

  • Document Types: Identity cards, passports, driver's licenses, residence permits
  • Front/Back Classification: ML-based orientation detection using multiple methods
  • Precise Coordinates: Accurate bounding box coordinates for all detections
  • Quality Assessment: Comprehensive quality metrics (sharpness, glare, coverage, brightness, contrast)

🎥 Video Processing

  • Frame Extraction: Intelligent frame sampling at configurable FPS
  • Quality-Based Selection: Automatic selection of best quality frames
  • Multi-Frame Analysis: Track documents across video frames
  • Performance Optimized: Efficient processing for real-time applications

🔧 Technical Capabilities

  • YOLO-E Integration: Latest Ultralytics YOLO-E for object detection
  • ML-Based Classification: Advanced orientation detection using multiple algorithms
  • European Focus: Optimized for European document standards and formats
  • API Compatible: RESTful API with standardized response format

🚀 Quick Start

Health Check

curl https://algoryn-yolo-e-idcard.hf.space/health

Image Detection

curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect" \
  -F "file=@your_image.jpg" \
  -F "min_confidence=0.5" \
  -F "return_crops=false"

Video Detection

curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect-video" \
  -F "file=@your_video.mp4" \
  -F "min_confidence=0.5" \
  -F "sample_fps=2.0" \
  -F "max_detections=10" \
  -F "return_crops=false"

📊 API Endpoints

POST /v1/id/detect

Detect European identity documents in uploaded images.

Parameters:

  • file (required): Image file (JPEG, PNG, etc.)
  • min_confidence (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
  • return_crops (optional): Return cropped document images (default: false)

Response:

{
  "request_id": "uuid",
  "media_type": "image",
  "processing_time": 1.23,
  "detections": [
    {
      "document_type": "identity_card",
      "orientation": "front",
      "confidence": 0.95,
      "bounding_box": {
        "x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9
      },
      "quality": {
        "sharpness": 0.85,
        "glare_score": 0.1,
        "coverage": 0.75,
        "brightness": 0.6,
        "contrast": 0.7
      },
      "tracking": {
        "track_id": null,
        "is_tracked": false
      },
      "metadata": {
        "class_name": "id_front",
        "original_coordinates": [100, 200, 800, 900],
        "mask_used": false
      }
    }
  ]
}
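
The sample response suggests that bounding_box holds coordinates normalized to the image size, while metadata.original_coordinates holds pixels (0.1 mapping to 100 would match a 1000×1000 image). That reading is an inference from the example, not a documented guarantee. Assuming it holds, a minimal sketch for converting a normalized box back to pixels looks like this; `to_pixel_box` is a hypothetical helper, not part of the API.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Scale a normalized box (values in 0..1) to integer pixel coordinates.

    Assumes x values are fractions of the width and y values are fractions
    of the height, which is inferred from the sample response.
    """
    return (
        round(bounding_box["x1"] * image_width),
        round(bounding_box["y1"] * image_height),
        round(bounding_box["x2"] * image_width),
        round(bounding_box["y2"] * image_height),
    )


box = {"x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9}
print(to_pixel_box(box, 1000, 1000))  # (100, 200, 800, 900)
```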

POST /v1/id/detect-video

Detect European identity documents in uploaded videos with quality-based frame selection.

Parameters:

  • file (required): Video file (MP4, AVI, etc.)
  • min_confidence (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
  • sample_fps (optional): Video sampling rate (0.1-30.0, default: 2.0)
  • return_crops (optional): Return cropped document images (default: false)
  • max_detections (optional): Maximum detections to return (1-100, default: 10)
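
Checking the documented ranges on the client before uploading a large video avoids a wasted round trip. The sketch below uses only the ranges and defaults listed above; `validate_video_params` is a hypothetical helper, and how the server itself reports out-of-range values is not shown here.

```python
def validate_video_params(min_confidence=0.25, sample_fps=2.0, max_detections=10):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    if not 0.0 <= min_confidence <= 1.0:
        problems.append("min_confidence must be between 0.0 and 1.0")
    if not 0.1 <= sample_fps <= 30.0:
        problems.append("sample_fps must be between 0.1 and 30.0")
    if not (isinstance(max_detections, int) and 1 <= max_detections <= 100):
        problems.append("max_detections must be an integer between 1 and 100")
    return problems


print(validate_video_params())                 # []
print(validate_video_params(sample_fps=60.0))  # one problem reported
```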

Response:

{
  "request_id": "uuid",
  "media_type": "video",
  "processing_time": 3.45,
  "frame_count": 24,
  "detections": [
    // Same structure as image detection
  ]
}
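
How the Space scores frames internally is not documented here. If you want to pick a single best detection from a video response on the client side, one option is to rank detections by a composite of the returned quality metrics. The weights below are illustrative assumptions, and the sketch assumes a higher glare_score means more glare.

```python
def quality_score(detection):
    """Combine the reported quality metrics into one number (higher is better).

    The weights are hypothetical and are not the Space's own scoring.
    """
    q = detection["quality"]
    return (
        0.4 * q["sharpness"]
        + 0.3 * q["coverage"]
        + 0.2 * q["contrast"]
        - 0.1 * q["glare_score"]
    )


def best_detection(detections):
    """Return the highest-scoring detection, or None for an empty list."""
    return max(detections, key=quality_score, default=None)
```

Call `best_detection(result["detections"])` on a parsed response; tune the weights to whatever matters most for your downstream step (for OCR, sharpness usually dominates).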

GET /health

Health check endpoint.

Response:

{
  "status": "healthy",
  "version": "2.0.0"
}

🎯 Document Types Supported

| Type | Description | Front/Back Detection |
| --- | --- | --- |
| identity_card | European identity cards | ✅ |
| passport | Passports | ✅ |
| driver_license | Driver's licenses | ✅ |
| residence_permit | Residence permits | ✅ |

πŸ” Orientation Classification

The system uses multiple methods for reliable front/back classification:

  1. Class-Based: Uses detected class (id_front, id_back, etc.)
  2. Portrait Detection: Detects faces/portraits using YOLO-E
  3. Heuristic Analysis: Text density, symmetry, and edge pattern analysis
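
One way to read the three methods above is as a fallback chain: use the class name when it already encodes the side, otherwise fall back to portrait detection, then to heuristics. The sketch below illustrates that reading only. The ordering, the rule that a portrait implies the front side, the 0.5 cut-off, and the function and parameter names are all assumptions, not the Space's actual implementation.

```python
def classify_orientation(class_name, portrait_found=None, heuristic_front_score=None):
    """Return "front", "back", or "unknown" using a three-step fallback."""
    # 1. Class-based: trust an explicit front/back suffix in the class name.
    if class_name.endswith("_front"):
        return "front"
    if class_name.endswith("_back"):
        return "back"
    # 2. Portrait detection: assume a portrait marks the front side.
    if portrait_found is not None:
        return "front" if portrait_found else "back"
    # 3. Heuristic analysis: hypothetical score in 0..1 with 0.5 as the cut-off.
    if heuristic_front_score is not None:
        return "front" if heuristic_front_score >= 0.5 else "back"
    return "unknown"


print(classify_orientation("id_front"))                             # front
print(classify_orientation("driver_license", portrait_found=True))  # front
```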

📈 Quality Metrics

Each detection includes comprehensive quality assessment:

  • Sharpness: Image clarity using Laplacian variance
  • Glare Score: Bright pixel concentration analysis
  • Coverage: Document area coverage within bounding box
  • Brightness: Overall image brightness
  • Contrast: Image contrast using standard deviation
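
To make these definitions concrete, here is a dependency-free sketch that computes four of the metrics on a small grayscale image given as a list of rows. It is not the Space's code: the 4-neighbour Laplacian kernel, the glare threshold of 240, and the division by 255 are assumptions, and the API evidently rescales sharpness to a 0..1 range in a way not documented here. Coverage is omitted because it needs a document mask.

```python
from statistics import mean, pstdev, pvariance


def quality_metrics(gray, glare_threshold=240):
    """Compute simple quality metrics for a grayscale image.

    `gray` is a list of rows of 0..255 intensities. The threshold and the
    normalisation by 255 are assumptions made for this sketch.
    """
    pixels = [p for row in gray for p in row]
    height, width = len(gray), len(gray[0])

    # 4-neighbour Laplacian on interior pixels; its variance rises with sharp edges.
    laplacian = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]

    return {
        "sharpness": pvariance(laplacian) if laplacian else 0.0,
        "glare_score": sum(p >= glare_threshold for p in pixels) / len(pixels),
        "brightness": mean(pixels) / 255,
        "contrast": pstdev(pixels) / 255,
    }


flat = [[128] * 4 for _ in range(4)]
print(quality_metrics(flat)["sharpness"] == 0)  # True: a flat image has no edges
```

In practice you would compute the same quantities with an image library on the cropped detection rather than on nested lists.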

⚡ Performance

| Metric | Target | Notes |
| --- | --- | --- |
| Image Processing | <1.5s | Single image detection |
| Video Processing | <3.0s | Frame extraction and selection |
| Memory Usage | <3GB | YOLO-E + orientation classifier |
| Reliability | 99.5% | With fallback mechanisms |

🛠️ Configuration

Class Mapping

The system uses config/labels.json for class mapping:

{
  "classes": {
    "0": "id_front",
    "1": "id_back", 
    "2": "driver_license",
    "3": "passport",
    "4": "mrz"
  }
}
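
JSON object keys are always strings, while detection models typically report integer class indices, so a loader usually converts the keys. The sketch below parses the mapping shown above from an inline string; `load_class_names` is a hypothetical helper, and how the application actually reads config/labels.json is not shown in this README.

```python
import json

LABELS_JSON = """
{
  "classes": {
    "0": "id_front",
    "1": "id_back",
    "2": "driver_license",
    "3": "passport",
    "4": "mrz"
  }
}
"""


def load_class_names(text):
    """Parse the labels file content into {class_index: class_name}."""
    classes = json.loads(text)["classes"]
    # Convert string keys to ints so they can be looked up by model class index.
    return {int(index): name for index, name in classes.items()}


class_names = load_class_names(LABELS_JSON)
print(class_names[0])  # id_front
```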

Model Weights

  • YOLO-E: yolo11n.pt (nano variant for faster inference)
  • Orientation Classifier: Integrated ML-based classification

🔧 Deployment

Hugging Face Spaces

  1. Upload the code to a new Hugging Face Space
  2. Set the hardware to GPU for optimal performance
  3. Configure environment variables if needed
  4. Deploy and test the endpoints

GPU Docker Runtime

  • Ensure host has recent NVIDIA driver installed
  • Install NVIDIA Container Toolkit on the host
  • Run the container with GPU access enabled:

# Build image
docker build -t kybtech-yolo-e-idcard:gpu .

# Run with all GPUs and necessary capabilities
docker run --rm \
  --gpus all \
  --ipc=host \
  -p 7860:7860 \
  kybtech-yolo-e-idcard:gpu

Notes:

  • The Dockerfile uses pytorch/pytorch:2.7.0-cuda12.6-cudnn9-runtime as base (CUDA included).
  • The app auto-selects GPU if available and performs a warm-up pass.
  • Verify GPU is visible inside the container with python -c "import torch; print(torch.cuda.is_available())".

Local Development

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py

πŸ“ Example Usage

Python Client

import requests

# Image detection
with open('document.jpg', 'rb') as f:
    response = requests.post(
        'https://algoryn-yolo-e-idcard.hf.space/v1/id/detect',
        files={'file': f},
        data={'min_confidence': 0.5},
        timeout=60,  # do not wait indefinitely if the Space is slow to respond
    )

# Raise on 4xx/5xx so an error body is never parsed as a detection result
response.raise_for_status()

result = response.json()
for detection in result['detections']:
    print(f"Found {detection['document_type']} ({detection['orientation']})")
    print(f"Confidence: {detection['confidence']:.2f}")
    print(f"Sharpness: {detection['quality']['sharpness']:.2f}")

JavaScript Client

const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('min_confidence', '0.5');

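fetch('https://algoryn-yolo-e-idcard.hf.space/v1/id/detect', {
  method: 'POST',
  body: formData
})
.then(response => {
  // fetch only rejects on network failure, so check the HTTP status explicitly
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
})
.then(data => {
  data.detections.forEach(detection => {
    console.log(`Found ${detection.document_type} (${detection.orientation})`);
  });
})
.catch(error => console.error(error));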

🚨 Error Handling

The API returns appropriate HTTP status codes:

  • 200: Success
  • 400: Bad request (invalid parameters)
  • 500: Internal server error
  • 503: Service unavailable (models not loaded)

Error responses carry a detail field describing the failure:

{
  "detail": "Detection failed: Invalid image format"
}
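
A client can turn the status code and the detail field into a message plus a retry decision. The sketch below is one possible policy, not part of the API: `explain_error` is a hypothetical helper, and retrying only on 503 is an assumption based on the "models not loaded" description above.

```python
def explain_error(status_code, body):
    """Turn a status code and parsed JSON body into (message, should_retry)."""
    if isinstance(body, dict):
        detail = body.get("detail", "no detail provided")
    else:
        detail = str(body)

    if status_code == 200:
        return ("ok", False)
    if status_code == 400:
        return (f"Bad request: {detail}", False)
    if status_code == 503:
        # Models may still be loading, so a later retry can succeed.
        return (f"Service unavailable: {detail}", True)
    if status_code == 500:
        return (f"Server error: {detail}", False)
    return (f"Unexpected status {status_code}: {detail}", False)


message, retry = explain_error(400, {"detail": "Detection failed: Invalid image format"})
print(message)  # Bad request: Detection failed: Invalid image format
```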

🎯 Test Results Summary

  • ✅ Health Check: Space is healthy and running version 2.0.0
  • ✅ Image Detection: Successfully detected identity cards in test images
  • ✅ Video Detection: Processed 8 frames and found 10 detections with tracking
  • ✅ Performance: ~1.1s for images (within the <1.5s target), ~3.3s for videos (slightly above the <3.0s target)
  • ✅ Quality Metrics: Comprehensive quality assessment working

🔒 Security & Privacy

  • No Data Storage: Images are processed in memory only
  • Temporary Files: Video processing writes temporary files, which are deleted immediately after processing
  • No Logging: Sensitive document data is not logged
  • API Authentication: Configure authentication as needed for your deployment

📊 Monitoring

Monitor the service using:

  • Health Check: /health endpoint for service status
  • Processing Time: Included in all responses
  • Error Rates: Monitor HTTP status codes
  • Performance: Track response times and memory usage
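
If you log the HTTP status and the processing_time of each call on the client side, the error rate and average latency follow directly. A minimal sketch, assuming you collect (status_code, processing_time) pairs yourself; `summarize` and the record format are hypothetical and not provided by the service.

```python
from collections import Counter
from statistics import mean


def summarize(records):
    """Summarize (status_code, processing_time) pairs collected by a client.

    processing_time may be None when a request failed before a body arrived.
    """
    statuses = Counter(status for status, _ in records)
    times = [t for status, t in records if status == 200 and t is not None]
    total = sum(statuses.values())
    return {
        "total": total,
        "error_rate": (total - statuses[200]) / total if total else 0.0,
        "mean_processing_time": mean(times) if times else None,
    }


records = [(200, 1.0), (200, 2.0), (503, None), (500, None)]
print(summarize(records))
```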

🎉 Future Enhancements

  • Real-time Processing: Optimize for live video streams
  • Multi-country Support: Expand beyond European documents
  • Advanced Tracking: Implement more sophisticated video tracking
  • Custom Models: Support for custom document types

This enhanced HF YOLO-E deployment provides production-ready European document detection with advanced ML capabilities and video processing support.