Dots.OCR API Testing
This directory contains comprehensive testing scripts for the Dots.OCR API endpoint.
Test Scripts
1. test_api_endpoint.py - Comprehensive API Testing
The main testing script that provides full API validation capabilities.
Features:
- Health check validation
- Single and multiple image testing
- ROI (Region of Interest) testing
- Field extraction validation
- Response structure validation
- Performance metrics
- Detailed error reporting
Usage:
# Basic test with default settings
python test_api_endpoint.py
# Test with custom API URL
python test_api_endpoint.py --url https://your-api.example.com
# Test with ROI
python test_api_endpoint.py --roi '{"x1": 0.1, "y1": 0.1, "x2": 0.9, "y2": 0.9}'
# Test with specific expected fields
python test_api_endpoint.py --expected-fields document_number surname given_names
# Verbose output
python test_api_endpoint.py --verbose
# Custom timeout
python test_api_endpoint.py --timeout 60
Options:
- --url: API base URL (default: http://localhost:7860)
- --timeout: Request timeout in seconds (default: 30)
- --roi: ROI coordinates as JSON string
- --expected-fields: List of expected field names to validate
- --verbose: Enable verbose logging
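For readers who want to see how these options fit together, here is a minimal sketch of a parser that mirrors them. It is an illustration built from the option list above, not the actual code of test_api_endpoint.py; the function name `build_parser` and the help strings are assumptions.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Test the Dots.OCR API endpoint")
    parser.add_argument("--url", default="http://localhost:7860", help="API base URL")
    parser.add_argument("--timeout", type=int, default=30, help="Request timeout in seconds")
    # --roi arrives as a JSON string; json.loads turns it into a dict.
    parser.add_argument("--roi", type=json.loads, default=None, help="ROI coordinates as JSON")
    parser.add_argument("--expected-fields", nargs="+", default=None, help="Field names to validate")
    parser.add_argument("--verbose", action="store_true", help="Enable verbose logging")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(
    ["--roi", '{"x1": 0.1, "y1": 0.1, "x2": 0.9, "y2": 0.9}', "--timeout", "60"]
)
print(args.url, args.timeout, args.roi)
```

Using `type=json.loads` means a malformed ROI string is rejected at argument-parsing time rather than deep inside a test run.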
2. quick_test.py - Quick Validation
A simple script for quick API validation after deployment.
Usage:
# Test local API
python quick_test.py
# Test remote API
python quick_test.py https://your-api.example.com
Test Configuration
test_config.json
Configuration file for test parameters and thresholds.
Configuration sections:
- api_endpoints: Different API URLs for various environments
- test_images: List of test image files
- expected_fields: Fields that should be extracted
- roi_test_cases: Different ROI configurations to test
- performance_thresholds: Performance validation criteria
- test_timeout: Default timeout for requests
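A small loader can confirm that a configuration file has all of the sections listed above before any tests run. The sketch below assumes the sections are top-level keys of a JSON object; the helper names `missing_sections` and `load_config` are hypothetical and not part of the test scripts.

```python
import json
from pathlib import Path

# Section names taken from the list above; their top-level placement is an assumption.
REQUIRED_SECTIONS = (
    "api_endpoints",
    "test_images",
    "expected_fields",
    "roi_test_cases",
    "performance_thresholds",
    "test_timeout",
)


def missing_sections(config: dict) -> list:
    """Return the required section names that are absent from the config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


def load_config(path: str = "test_config.json") -> dict:
    """Read the JSON config and fail early if a section is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = missing_sections(config)
    if missing:
        raise KeyError("config is missing sections: " + ", ".join(missing))
    return config
```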
Test Images
The following test images are used for validation:
- tom_id_card_front.jpg - Front of Dutch ID card
- tom_id_card_back.jpg - Back of Dutch ID card
Testing Scenarios
1. Basic Functionality Test
python test_api_endpoint.py
Tests basic API functionality with default settings.
2. ROI Testing
python test_api_endpoint.py --roi '{"x1": 0.25, "y1": 0.25, "x2": 0.75, "y2": 0.75}'
Tests Region of Interest cropping functionality.
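The ROI values in these examples look like coordinates normalized to the range 0 to 1. Under that assumption, the sketch below validates an ROI and converts it to a pixel box for a given image size. The functions `validate_roi` and `roi_to_pixels` are hypothetical helpers, and the API's own cropping logic may differ.

```python
def validate_roi(roi: dict) -> None:
    """Raise ValueError unless roi holds normalized, correctly ordered corners."""
    for key in ("x1", "y1", "x2", "y2"):
        if key not in roi:
            raise ValueError(f"ROI is missing '{key}'")
        if not 0.0 <= roi[key] <= 1.0:
            raise ValueError(f"ROI value '{key}' must be between 0 and 1")
    if roi["x1"] >= roi["x2"] or roi["y1"] >= roi["y2"]:
        raise ValueError("ROI must satisfy x1 < x2 and y1 < y2")


def roi_to_pixels(roi: dict, width: int, height: int) -> tuple:
    """Convert a normalized ROI to (left, top, right, bottom) in pixels."""
    validate_roi(roi)
    return (
        round(roi["x1"] * width),
        round(roi["y1"] * height),
        round(roi["x2"] * width),
        round(roi["y2"] * height),
    )


box = roi_to_pixels({"x1": 0.25, "y1": 0.25, "x2": 0.75, "y2": 0.75}, 1000, 600)
print(box)  # (250, 150, 750, 450)
```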
3. Field Validation Test
python test_api_endpoint.py --expected-fields document_number surname given_names nationality
Tests that specific fields are extracted correctly.
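As an illustration of what such a check might involve, the sketch below reports which expected fields are missing from an extraction result. The result shape (a dict mapping each field name to a value and a confidence) is an assumption made for this example, and `missing_fields` is a hypothetical helper, not a function from the test script.

```python
def missing_fields(extracted: dict, expected: list) -> list:
    """Return the expected field names that are absent or have an empty value."""
    missing = []
    for name in expected:
        field = extracted.get(name)
        if not field or not field.get("value"):
            missing.append(name)
    return missing


# Hypothetical extraction result; the dict layout is assumed, not taken from the API.
extracted = {
    "document_number": {"value": "NLD123456789", "confidence": 0.90},
    "surname": {"value": "MULDER", "confidence": 0.90},
    "given_names": {"value": "THOMAS JAN", "confidence": 0.90},
}
expected = ["document_number", "surname", "given_names", "nationality"]
print(missing_fields(extracted, expected))  # ['nationality']
```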
4. Performance Test
python test_api_endpoint.py --timeout 60 --verbose
Tests API performance with extended timeout and detailed logging.
Expected Results
Successful Test Output
Checking API health...
✅ API is healthy: {'status': 'healthy', 'version': '1.0.0', 'model_loaded': True}
Starting API tests with 2 images...
✅ tom_id_card_front.jpg: 2.45s
✅ tom_id_card_back.jpg: 1.23s
Test Results:
Total images: 2
Successful: 2
Failed: 0
Success rate: 100.0%
Average processing time: 1.84s
All tests completed successfully!
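The summary figures follow directly from the per-image timings. The sketch below shows one way to compute them; `summarize` is a hypothetical helper, and representing a failed image as `None` is an assumption made for this example.

```python
from typing import Dict, Optional


def summarize(results: Dict[str, Optional[float]]) -> dict:
    """Summarize results mapping image name to seconds taken, or None on failure."""
    times = [t for t in results.values() if t is not None]
    total = len(results)
    return {
        "total": total,
        "successful": len(times),
        "failed": total - len(times),
        "success_rate": 100.0 * len(times) / total if total else 0.0,
        "average_time": sum(times) / len(times) if times else 0.0,
    }


summary = summarize({"tom_id_card_front.jpg": 2.45, "tom_id_card_back.jpg": 1.23})
print(f"Success rate: {summary['success_rate']:.1f}%")
print(f"Average processing time: {summary['average_time']:.2f}s")
```

With the two timings from the sample output, this gives a success rate of 100.0% and an average of 1.84s, matching the figures shown above.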
Field Extraction Example
Page 1: 11 fields extracted
document_number: NLD123456789 (confidence: 0.90)
surname: MULDER (confidence: 0.90)
given_names: THOMAS JAN (confidence: 0.90)
nationality: NLD (confidence: 0.95)
date_of_birth: 15-03-1990 (confidence: 0.90)
gender: M (confidence: 0.95)
Troubleshooting
Common Issues
Connection Refused
- Check if the API is running
- Verify the correct URL and port
- Check firewall settings
Timeout Errors
- Increase timeout with the --timeout parameter
- Check API performance and resource usage
Missing Fields
- Verify test images contain the expected text
- Check field extraction patterns in the code
- Review API logs for processing errors
Validation Errors
- Check API response format
- Verify model is loaded correctly
- Review error logs for details
Debug Mode
Enable verbose logging for detailed debugging:
python test_api_endpoint.py --verbose
Integration with CI/CD
The test scripts can be integrated into CI/CD pipelines:
# Example GitHub Actions step
- name: Test API Endpoint
  run: |
    python scripts/test_api_endpoint.py --url ${{ env.API_URL }} --timeout 60
Performance Monitoring
The scripts provide performance metrics that can be used for monitoring:
- Processing time per image
- Success rate
- Field extraction accuracy
- Response validation results
These metrics can be integrated with monitoring systems like Prometheus or DataDog.
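One way to hand these metrics to Prometheus is to render them in its text exposition format, for example for a textfile collector. The sketch below is a minimal illustration: the metric names and the `dots_ocr_test` prefix are assumptions, and the test scripts themselves are not known to produce this output.

```python
def to_prometheus(metrics: dict, prefix: str = "dots_ocr_test") -> str:
    """Render a flat dict of numeric metrics as Prometheus gauge lines."""
    lines = []
    for name, value in metrics.items():
        full_name = f"{prefix}_{name}"
        lines.append(f"# TYPE {full_name} gauge")
        lines.append(f"{full_name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names; the values come from the sample run in this README.
print(to_prometheus({"success_rate_percent": 100.0, "average_processing_seconds": 1.84}))
```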
Production API Testing
Current Production Endpoint
- URL: https://algoryn-dots-ocr-idcard.hf.space
- Health Check: https://algoryn-dots-ocr-idcard.hf.space/health
- API Docs: https://algoryn-dots-ocr-idcard.hf.space/docs
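The health endpoint can also be polled with the Python standard library alone. The sketch below assumes the endpoint returns JSON with the `status` and `model_loaded` keys shown in the sample test output; `is_healthy` and `check_health` are hypothetical helper names.

```python
import json
import urllib.request


def is_healthy(payload: dict) -> bool:
    """Treat the API as healthy only if it reports so and the model is loaded."""
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def check_health(base_url: str, timeout: float = 30.0) -> bool:
    """Fetch <base_url>/health and evaluate the JSON payload (assumed shape)."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return is_healthy(json.load(response))


# Payload copied from the sample output; no network request is made here.
print(is_healthy({"status": "healthy", "version": "1.0.0", "model_loaded": True}))
```

Calling `check_health("https://algoryn-dots-ocr-idcard.hf.space")` would perform the real request; it raises `urllib.error.URLError` if the endpoint is unreachable.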
Quick Production Test
# Test production API
./run_tests.sh -e production
# Quick test with curl (no Python dependencies)
./test_production_curl.sh
Staging Environment
- Staging URL: https://algoryn-dots-ocr-idcard-staging.hf.space (to be created)
- Purpose: Safe testing before production deployment
Environment-Specific Testing
# Test different environments
./run_tests.sh -e local # Local development
./run_tests.sh -e staging # Staging environment
./run_tests.sh -e production # Production environment
5. test_debug_ocr.sh - Per-request debug logging via curl
Use this for quick, dependency-light testing of the server-side debug mode that prints OCR snippets, extracted fields, and MRZ details to logs.
Usage:
# Local server (per-request debug on)
./test_debug_ocr.sh -u http://localhost:7860 -f tom_id_card_front.jpg -d
# Hugging Face Space (replace with your Space URL)
./test_debug_ocr.sh -u https://<your-space>.hf.space -f tom_id_card_front.jpg -d \
-r '{"x1":0,"y1":0,"x2":1,"y2":0.5}'
You can also enable debug globally on the server with DOTS_OCR_DEBUG=1. The script only toggles the request-level flag via -d.
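To make the relationship between the two switches concrete, the sketch below shows one plausible way a server could combine them: debug output is on if either the request asks for it or the environment variable is set to 1. This precedence is an assumption for illustration, and `debug_enabled` is a hypothetical function, not the server's actual code.

```python
import os
from typing import Mapping, Optional


def debug_enabled(request_flag: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if debug is requested per call or enabled globally (assumed logic)."""
    env = os.environ if env is None else env
    # Assumption: DOTS_OCR_DEBUG=1 switches debug on for every request.
    return request_flag or env.get("DOTS_OCR_DEBUG") == "1"


print(debug_enabled(False, {"DOTS_OCR_DEBUG": "1"}))  # True
print(debug_enabled(False, {}))  # False
```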