
title: KYB YOLO-E
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: other
short_description: YOLO-E European document detection with quality metrics

🚀 HF YOLO-E European Document Detection

Enhanced Hugging Face Space for European Identity Document Detection

This Hugging Face Space exposes a REST API that detects and classifies European identity documents (identity cards, passports, driver's licenses, residence permits), determines front/back orientation, reports per-detection quality metrics, and accepts both still images and video.

📋 Table of Contents

✨ Features
🚀 Quick Start
- Image Detection
- Video Detection
📊 API Endpoints
🎯 Document Types Supported
🔍 Orientation Classification
📈 Quality Metrics
⚡ Performance
🛠️ Configuration
- Class Mapping
- Model Weights
🔧 Deployment
📝 Example Usage
- Python Client
- JavaScript Client
🚨 Error Handling
🎯 Test Results Summary
🔒 Security & Privacy
📊 Monitoring
🎉 Future Enhancements

✨ Features

🎯 European Document Detection

Document Types: Identity cards, passports, driver's licenses, residence permits
Front/Back Classification: ML-based orientation detection using multiple methods
Precise Coordinates: Accurate bounding box coordinates for all detections
Quality Assessment: Comprehensive quality metrics (sharpness, glare, coverage, brightness, contrast)

🎥 Video Processing

Frame Extraction: Intelligent frame sampling at configurable FPS
Quality-Based Selection: Automatic selection of best quality frames
Multi-Frame Analysis: Track documents across video frames
Performance Optimized: Efficient processing for real-time applications

🔧 Technical Capabilities

YOLO-E Integration: Latest Ultralytics YOLO-E for object detection
ML-Based Classification: Advanced orientation detection using multiple algorithms
European Focus: Optimized for European document standards and formats
API Compatible: RESTful API with standardized response format

🚀 Quick Start

Health Check

curl https://algoryn-yolo-e-idcard.hf.space/health

Image Detection

curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect" \
  -F "file=@your_image.jpg" \
  -F "min_confidence=0.5" \
  -F "return_crops=false"

Video Detection

curl -X POST "https://algoryn-yolo-e-idcard.hf.space/v1/id/detect-video" \
  -F "file=@your_video.mp4" \
  -F "min_confidence=0.5" \
  -F "sample_fps=2.0" \
  -F "max_detections=10" \
  -F "return_crops=false"

📊 API Endpoints

POST `/v1/id/detect`

Detect European identity documents in uploaded images.

Parameters:

file (required): Image file (JPEG, PNG, etc.)
min_confidence (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
return_crops (optional): Return cropped document images (default: false)

Response:

{
  "request_id": "uuid",
  "media_type": "image",
  "processing_time": 1.23,
  "detections": [
    {
      "document_type": "identity_card",
      "orientation": "front",
      "confidence": 0.95,
      "bounding_box": {
        "x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9
      },
      "quality": {
        "sharpness": 0.85,
        "glare_score": 0.1,
        "coverage": 0.75,
        "brightness": 0.6,
        "contrast": 0.7
      },
      "tracking": {
        "track_id": null,
        "is_tracked": false
      },
      "metadata": {
        "class_name": "id_front",
        "original_coordinates": [100, 200, 800, 900],
        "mask_used": false
      }
    }
  ]
}
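The example response suggests that `bounding_box` holds coordinates normalized to the 0.0-1.0 range, while `metadata.original_coordinates` holds pixel values (the two agree for a 1000x1000 image). Assuming that reading is correct, a client could convert between them roughly as follows. The helper name `to_pixel_box` is hypothetical and not part of the API.

```python
from typing import Dict, Tuple


def to_pixel_box(bounding_box: Dict[str, float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalized box (assumed 0.0-1.0) to integer pixel coordinates."""
    x1 = round(bounding_box["x1"] * width)
    y1 = round(bounding_box["y1"] * height)
    x2 = round(bounding_box["x2"] * width)
    y2 = round(bounding_box["y2"] * height)
    # Clamp to the image bounds in case a value falls slightly outside 0.0-1.0.
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    return x1, y1, x2, y2


# Values taken from the example response above, for a 1000x1000 image.
box = {"x1": 0.1, "y1": 0.2, "x2": 0.8, "y2": 0.9}
print(to_pixel_box(box, width=1000, height=1000))  # (100, 200, 800, 900)
```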

POST `/v1/id/detect-video`

Detect European identity documents in uploaded videos with quality-based frame selection.

Parameters:

file (required): Video file (MP4, AVI, etc.)
min_confidence (optional): Minimum confidence threshold (0.0-1.0, default: 0.25)
sample_fps (optional): Video sampling rate (0.1-30.0, default: 2.0)
return_crops (optional): Return cropped document images (default: false)
max_detections (optional): Maximum detections to return (1-100, default: 10)
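Checking these ranges on the client avoids a round trip that would end in a 400 response. The sketch below validates against the ranges and defaults listed above and returns the form fields as strings. The function name `build_video_form` is hypothetical, and the server's own validation may differ in detail.

```python
from typing import Dict


def build_video_form(min_confidence: float = 0.25, sample_fps: float = 2.0,
                     max_detections: int = 10, return_crops: bool = False) -> Dict[str, str]:
    """Validate against the documented ranges and return multipart form fields."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be in 0.0-1.0")
    if not 0.1 <= sample_fps <= 30.0:
        raise ValueError("sample_fps must be in 0.1-30.0")
    if not 1 <= max_detections <= 100:
        raise ValueError("max_detections must be in 1-100")
    return {
        "min_confidence": str(min_confidence),
        "sample_fps": str(sample_fps),
        "max_detections": str(max_detections),
        "return_crops": "true" if return_crops else "false",
    }


print(build_video_form(min_confidence=0.5))
```

The returned dictionary can be passed as the `data` argument of `requests.post` alongside the `files` argument.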

Response:

{
  "request_id": "uuid",
  "media_type": "video",
  "processing_time": 3.45,
  "frame_count": 24,
  "detections": [
    // Same structure as image detection
  ]
}
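A video response can contain several detections of the same document, so a client often wants only the best one. The sketch below ranks detections with a composite score. The formula (confidence times sharpness, penalized by glare) is an illustrative assumption, not the selection rule the Space applies internally.

```python
from typing import Any, Dict, List, Optional


def quality_score(detection: Dict[str, Any]) -> float:
    """Hypothetical composite score: confident, sharp, glare-free detections rank highest."""
    quality = detection["quality"]
    return detection["confidence"] * quality["sharpness"] * (1.0 - quality["glare_score"])


def best_detection(detections: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the highest-scoring detection, or None if the list is empty."""
    return max(detections, key=quality_score) if detections else None


detections = [
    {"confidence": 0.95, "quality": {"sharpness": 0.40, "glare_score": 0.30}},
    {"confidence": 0.90, "quality": {"sharpness": 0.85, "glare_score": 0.10}},
]
print(best_detection(detections)["confidence"])  # 0.9
```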

GET `/health`

Health check endpoint.

Response:

{
  "status": "healthy",
  "version": "2.0.0"
}

🎯 Document Types Supported

| Type | Description | Front/Back Detection |
| --- | --- | --- |
| `identity_card` | European identity cards | ✅ |
| `passport` | Passports | ✅ |
| `driver_license` | Driver's licenses | ✅ |
| `residence_permit` | Residence permits | ✅ |

🔍 Orientation Classification

The system uses multiple methods for reliable front/back classification:

Class-Based: Uses detected class (id_front, id_back, etc.)
Portrait Detection: Detects faces/portraits using YOLO-E
Heuristic Analysis: Text density, symmetry, and edge pattern analysis
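How the three methods are combined is not specified here. One plausible arrangement is a cascade that trusts the detected class first and falls back to the weaker signals only when the class name carries no orientation. The sketch below illustrates that idea; the function, its parameters, and the ordering are all assumptions made for illustration.

```python
from typing import Optional


def classify_orientation(class_name: str,
                         portrait_found: Optional[bool] = None,
                         heuristic_guess: Optional[str] = None) -> str:
    """Hypothetical cascade: class name, then portrait detection, then heuristics."""
    # 1. Class-based: names such as id_front / id_back encode the side directly.
    if class_name.endswith("_front"):
        return "front"
    if class_name.endswith("_back"):
        return "back"
    # 2. Portrait detection: a face is treated as evidence for the front side.
    #    A missing portrait is treated as inconclusive rather than as "back".
    if portrait_found:
        return "front"
    # 3. Heuristic analysis: accept the guess only if it is a valid label.
    if heuristic_guess in ("front", "back"):
        return heuristic_guess
    return "unknown"


print(classify_orientation("id_front"))                       # front
print(classify_orientation("passport", portrait_found=True))  # front
print(classify_orientation("mrz"))                            # unknown
```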

📈 Quality Metrics

Each detection includes comprehensive quality assessment:

Sharpness: Image clarity using Laplacian variance
Glare Score: Bright pixel concentration analysis
Coverage: Document area coverage within bounding box
Brightness: Overall image brightness
Contrast: Image contrast using standard deviation
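To make these definitions concrete, the sketch below computes each pixel-based metric on a small grayscale image stored as a list of rows. It uses only the standard library and returns raw values. The API reports scores in the 0-1 range, and how the Space normalizes sharpness or thresholds glare is not documented here, so the threshold of 240 and the division by 255 are assumptions.

```python
from statistics import mean, pstdev, pvariance
from typing import List

Gray = List[List[float]]  # grayscale rows, pixel values in 0-255


def laplacian_variance(img: Gray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels (needs at least 3x3)."""
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, len(img) - 1)
        for x in range(1, len(img[0]) - 1)
    ]
    return pvariance(responses)


def brightness(img: Gray) -> float:
    """Mean pixel value scaled to 0-1."""
    return mean(p for row in img for p in row) / 255.0


def contrast(img: Gray) -> float:
    """Standard deviation of pixel values scaled by 255."""
    return pstdev(p for row in img for p in row) / 255.0


def glare_fraction(img: Gray, threshold: float = 240) -> float:
    """Fraction of pixels at or above a brightness threshold (assumed value)."""
    pixels = [p for row in img for p in row]
    return sum(p >= threshold for p in pixels) / len(pixels)


flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
print(laplacian_variance(flat), laplacian_variance(checker))
print(brightness(checker), contrast(checker), glare_fraction(checker))
```

A uniform image has zero Laplacian variance, while the checkerboard, which is all edges, scores very high. That is the intuition behind using this measure for sharpness.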

⚡ Performance

| Metric | Target | Notes |
| --- | --- | --- |
| Image Processing | < 1.5 s | Single image detection |
| Video Processing | < 3.0 s | Frame extraction and selection |
| Memory Usage | < 3 GB | YOLO-E + orientation classifier |
| Reliability | 99.5% | With fallback mechanisms |

🛠️ Configuration

Class Mapping

The system uses config/labels.json for class mapping:

{
  "classes": {
    "0": "id_front",
    "1": "id_back", 
    "2": "driver_license",
    "3": "passport",
    "4": "mrz"
  }
}
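JSON object keys are always strings, so the class indices have to be converted to integers before they can be matched against model output. A minimal sketch of loading such a mapping is shown below. The JSON is embedded as a string to keep the example self-contained, and the helper names are hypothetical.

```python
import json
from typing import Dict

# Same content as the config/labels.json example above.
LABELS_JSON = """
{
  "classes": {
    "0": "id_front",
    "1": "id_back",
    "2": "driver_license",
    "3": "passport",
    "4": "mrz"
  }
}
"""


def load_class_names(text: str) -> Dict[int, str]:
    """Parse the mapping and convert the string keys to integer class ids."""
    return {int(key): name for key, name in json.loads(text)["classes"].items()}


def class_name_for(class_id: int, names: Dict[int, str]) -> str:
    """Look up a class id, falling back to 'unknown' for unmapped ids."""
    return names.get(class_id, "unknown")


names = load_class_names(LABELS_JSON)
print(class_name_for(0, names))  # id_front
print(class_name_for(9, names))  # unknown
```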

Model Weights

YOLO-E: yolo11n.pt (nano variant for faster inference)
Orientation Classifier: Integrated ML-based classification

🔧 Deployment

Hugging Face Spaces

Upload the code to a new Hugging Face Space
Set the hardware to GPU for optimal performance
Configure environment variables if needed
Deploy and test the endpoints

GPU Docker Runtime

Ensure host has recent NVIDIA driver installed
Install NVIDIA Container Toolkit on the host
Run the container with GPU access enabled:

# Build image
docker build -t kybtech-yolo-e-idcard:gpu .

# Run with all GPUs and necessary capabilities
docker run --rm \
  --gpus all \
  --ipc=host \
  -p 7860:7860 \
  kybtech-yolo-e-idcard:gpu

Notes:

The Dockerfile uses pytorch/pytorch:2.7.0-cuda12.6-cudnn9-runtime as base (CUDA included).
The app auto-selects GPU if available and performs a warm-up pass.
Verify GPU is visible inside the container with python -c "import torch; print(torch.cuda.is_available())".

Local Development

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py

📝 Example Usage

Python Client

import requests

# Image detection
with open('document.jpg', 'rb') as f:
    response = requests.post(
        'https://algoryn-yolo-e-idcard.hf.space/v1/id/detect',
        files={'file': f},
        data={'min_confidence': 0.5},
        timeout=60,  # seconds; avoids hanging indefinitely
    )

# Raise on 4xx/5xx so an error payload is not parsed as a result
response.raise_for_status()

result = response.json()
for detection in result['detections']:
    print(f"Found {detection['document_type']} ({detection['orientation']})")
    print(f"Confidence: {detection['confidence']:.2f}")
    print(f"Sharpness: {detection['quality']['sharpness']:.2f}")

JavaScript Client

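const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('min_confidence', '0.5');

fetch('https://algoryn-yolo-e-idcard.hf.space/v1/id/detect', {
  method: 'POST',
  body: formData
})
.then(response => {
  // fetch() resolves on HTTP errors too, so check the status explicitly
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
})
.then(data => {
  data.detections.forEach(detection => {
    console.log(`Found ${detection.document_type} (${detection.orientation})`);
  });
})
.catch(error => console.error(error));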

🚨 Error Handling

The API returns appropriate HTTP status codes:

200: Success
400: Bad request (invalid parameters)
500: Internal server error
503: Service unavailable (models not loaded)

Error responses include detailed error messages:

{
  "detail": "Detection failed: Invalid image format"
}
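Since a 503 indicates that the models are not loaded yet, it is the one status worth retrying, while a 400 will fail the same way every time. The sketch below shows one way a client might handle this with exponential backoff. The `send` callable is a hypothetical stand-in for the actual HTTP request and returns a status code with the parsed JSON body, and the retry policy is an assumption rather than a documented recommendation.

```python
import time
from typing import Callable, Dict, Tuple


def call_with_retry(send: Callable[[], Tuple[int, Dict]],
                    max_attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Retry on 503 with exponential backoff; raise immediately on other errors."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status == 503 and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            continue
        # Error bodies carry the message in the "detail" field.
        raise RuntimeError(f"HTTP {status}: {body.get('detail', 'no detail')}")
    raise RuntimeError("no attempts were made")


# Simulated server that becomes ready on the third call.
responses = iter([(503, {"detail": "Models not loaded"}),
                  (503, {"detail": "Models not loaded"}),
                  (200, {"detections": []})])
print(call_with_retry(lambda: next(responses), sleep=lambda seconds: None))
```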

🎯 Test Results Summary

✅ Health Check: Space is healthy and running version 2.0.0
✅ Image Detection: Successfully detected identity cards in test images
✅ Video Detection: Processed 8 frames and found 10 detections with tracking
⚠️ Performance: ~1.1s for images (within the <1.5s target), ~3.3s for videos (slightly above the <3.0s target)
✅ Quality Metrics: Comprehensive quality assessment working

🔒 Security & Privacy

No Data Storage: Images/videos are processed in memory only
Temporary Files: Video processing uses temporary files that are immediately cleaned up
No Logging: Sensitive document data is not logged
API Authentication: Configure authentication as needed for your deployment

📊 Monitoring

Monitor the service using:

Health Check: /health endpoint for service status
Processing Time: Included in all responses
Error Rates: Monitor HTTP status codes
Performance: Track response times and memory usage
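As a rough illustration of the last three points, a client-side monitor might aggregate recorded status codes and `processing_time` values as sketched below. The function name, the choice of the 95th percentile, and treating only 5xx responses as errors are assumptions, not features of the service.

```python
import math
from typing import Dict, List, Tuple


def summarize(samples: List[Tuple[int, float]]) -> Dict[str, float]:
    """Summarize (http_status, processing_time_seconds) pairs."""
    total = len(samples)
    server_errors = sum(1 for status, _ in samples if status >= 500)
    # Only successful requests contribute to the latency figure.
    times = sorted(seconds for status, seconds in samples if status == 200)
    # Nearest-rank 95th percentile.
    p95 = times[math.ceil(0.95 * len(times)) - 1] if times else 0.0
    return {
        "requests": total,
        "error_rate": server_errors / total if total else 0.0,
        "p95_seconds": p95,
    }


samples = [(200, 1.0), (200, 1.2), (503, 0.0), (200, 3.3)]
print(summarize(samples))
```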

🎉 Future Enhancements

Real-time Processing: Optimize for live video streams
Multi-country Support: Expand beyond European documents
Advanced Tracking: Implement more sophisticated video tracking
Custom Models: Support for custom document types

This enhanced HF YOLO-E deployment provides production-ready European document detection with advanced ML capabilities and video processing support.
