# Dots.OCR API Testing

This directory contains comprehensive testing scripts for the Dots.OCR API endpoint.

## Test Scripts

### 1. `test_api_endpoint.py` - Comprehensive API Testing

The main testing script that provides full API validation capabilities.

**Features:**
- Health check validation
- Single and multiple image testing
- ROI (Region of Interest) testing
- Field extraction validation
- Response structure validation
- Performance metrics
- Detailed error reporting

**Usage:**
```bash
# Basic test with default settings
python test_api_endpoint.py

# Test with custom API URL
python test_api_endpoint.py --url https://your-api.example.com

# Test with ROI
python test_api_endpoint.py --roi '{"x1": 0.1, "y1": 0.1, "x2": 0.9, "y2": 0.9}'

# Test with specific expected fields
python test_api_endpoint.py --expected-fields document_number surname given_names

# Verbose output
python test_api_endpoint.py --verbose

# Custom timeout
python test_api_endpoint.py --timeout 60
```

**Options:**
- `--url`: API base URL (default: http://localhost:7860)
- `--timeout`: Request timeout in seconds (default: 30)
- `--roi`: ROI coordinates as JSON string
- `--expected-fields`: List of expected field names to validate
- `--verbose`: Enable verbose logging
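
For orientation, the options above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the script's actual source: `build_parser` is a hypothetical name, and parsing `--roi` with `json.loads` is one plausible way to handle the JSON string.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Dots.OCR API test runner (sketch)")
    parser.add_argument("--url", default="http://localhost:7860", help="API base URL")
    parser.add_argument("--timeout", type=int, default=30, help="Request timeout in seconds")
    # json.loads converts the ROI string to a dict; argparse reports invalid
    # JSON as a usage error because JSONDecodeError is a ValueError.
    parser.add_argument("--roi", type=json.loads, default=None, help="ROI coordinates as JSON string")
    parser.add_argument("--expected-fields", nargs="+", default=[], help="Field names to validate")
    parser.add_argument("--verbose", action="store_true", help="Enable verbose logging")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--timeout", "60", "--verbose"])
print(args.url, args.timeout, args.verbose)
```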

### 2. `quick_test.py` - Quick Validation

A simple script for quick API validation after deployment.

**Usage:**
```bash
# Test local API
python quick_test.py

# Test remote API
python quick_test.py https://your-api.example.com
```

## Test Configuration

### `test_config.json`

Configuration file for test parameters and thresholds.

**Configuration sections:**
- `api_endpoints`: Different API URLs for various environments
- `test_images`: List of test image files
- `expected_fields`: Fields that should be extracted
- `roi_test_cases`: Different ROI configurations to test
- `performance_thresholds`: Performance validation criteria
- `test_timeout`: Default timeout for requests
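
A small loader can catch a malformed config before any request is sent. The sketch below only checks that the sections listed above are present; the value types inside each section are not documented here, so they are not validated. `load_config` and `missing_sections` are hypothetical helper names.

```python
import json

# Section names taken from the list above.
REQUIRED_SECTIONS = (
    "api_endpoints",
    "test_images",
    "expected_fields",
    "roi_test_cases",
    "performance_thresholds",
    "test_timeout",
)


def missing_sections(config: dict) -> list:
    """Return the documented section names that are absent from config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


def load_config(path: str = "test_config.json") -> dict:
    """Load the JSON config and fail early if a section is missing."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = missing_sections(config)
    if missing:
        raise KeyError(f"{path} is missing sections: {missing}")
    return config
```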

## Test Images

The following test images are used for validation:

- `tom_id_card_front.jpg` - Front of Dutch ID card
- `tom_id_card_back.jpg` - Back of Dutch ID card

## Testing Scenarios

### 1. Basic Functionality Test
```bash
python test_api_endpoint.py
```
Tests basic API functionality with default settings.

### 2. ROI Testing
```bash
python test_api_endpoint.py --roi '{"x1": 0.25, "y1": 0.25, "x2": 0.75, "y2": 0.75}'
```
Tests Region of Interest cropping functionality.
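
The ROI examples in this README use values between 0 and 1, which suggests normalized coordinates with `(x1, y1)` as the top-left corner and `(x2, y2)` as the bottom-right. Under that assumption (it is not confirmed by the API docs quoted here), a ROI can be sanity-checked and mapped to pixels like this. Both function names are hypothetical.

```python
ROI_KEYS = ("x1", "y1", "x2", "y2")


def validate_roi(roi: dict) -> dict:
    """Check that roi is a well-formed normalized box; return it as floats."""
    missing = [key for key in ROI_KEYS if key not in roi]
    if missing:
        raise ValueError(f"ROI is missing keys: {missing}")
    box = {key: float(roi[key]) for key in ROI_KEYS}
    for key, value in box.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key}={value} is outside the assumed [0, 1] range")
    if box["x1"] >= box["x2"] or box["y1"] >= box["y2"]:
        raise ValueError("ROI must satisfy x1 < x2 and y1 < y2")
    return box


def roi_to_pixels(roi: dict, width: int, height: int) -> tuple:
    """Scale a normalized ROI to (left, top, right, bottom) pixel coordinates."""
    box = validate_roi(roi)
    return (
        round(box["x1"] * width),
        round(box["y1"] * height),
        round(box["x2"] * width),
        round(box["y2"] * height),
    )
```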

### 3. Field Validation Test
```bash
python test_api_endpoint.py --expected-fields document_number surname given_names nationality
```
Tests that specific fields are extracted correctly.

### 4. Performance Test
```bash
python test_api_endpoint.py --timeout 60 --verbose
```
Tests API performance with extended timeout and detailed logging.

## Expected Results

### Successful Test Output
```
🔍 Checking API health...
✅ API is healthy: {'status': 'healthy', 'version': '1.0.0', 'model_loaded': True}
🚀 Starting API tests with 2 images...
✅ tom_id_card_front.jpg: 2.45s
✅ tom_id_card_back.jpg: 1.23s
📊 Test Results:
   Total images: 2
   Successful: 2
   Failed: 0
   Success rate: 100.0%
   Average processing time: 1.84s
🎉 All tests completed successfully!
```
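
The summary figures follow directly from the per-image timings. As a rough sketch (the `summarize` helper and its input layout are assumptions, not the script's code), the totals above can be reproduced like this:

```python
def summarize(timings: dict) -> dict:
    """Aggregate per-image results.

    timings maps an image name to its processing time in seconds,
    or to None when the request failed.
    """
    total = len(timings)
    times = [seconds for seconds in timings.values() if seconds is not None]
    successful = len(times)
    return {
        "total": total,
        "successful": successful,
        "failed": total - successful,
        "success_rate": 100.0 * successful / total if total else 0.0,
        "average_time": sum(times) / successful if successful else 0.0,
    }


summary = summarize({"tom_id_card_front.jpg": 2.45, "tom_id_card_back.jpg": 1.23})
print(f"Success rate: {summary['success_rate']:.1f}%")
print(f"Average processing time: {summary['average_time']:.2f}s")
```

Failed images are excluded from the average so that a timeout does not distort the timing figure.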

### Field Extraction Example
```
Page 1: 11 fields extracted
  document_number: NLD123456789 (confidence: 0.90)
  surname: MULDER (confidence: 0.90)
  given_names: THOMAS JAN (confidence: 0.90)
  nationality: NLD (confidence: 0.95)
  date_of_birth: 15-03-1990 (confidence: 0.90)
  gender: M (confidence: 0.95)
```
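
A check for `--expected-fields` could look roughly like the sketch below. The dict layout (field name mapped to `value` and `confidence`) is hypothetical and chosen only to mirror the printed output; the real response schema should be taken from the API docs.

```python
def check_fields(fields: dict, expected: list, min_confidence: float = 0.0) -> dict:
    """Report expected fields that are absent or below a confidence floor."""
    missing = [name for name in expected if name not in fields]
    low_confidence = [
        name
        for name in expected
        if name in fields and fields[name]["confidence"] < min_confidence
    ]
    return {"missing": missing, "low_confidence": low_confidence}


# Values copied from the sample output above; the dict layout is an assumption.
page_fields = {
    "document_number": {"value": "NLD123456789", "confidence": 0.90},
    "surname": {"value": "MULDER", "confidence": 0.90},
    "nationality": {"value": "NLD", "confidence": 0.95},
}
report = check_fields(
    page_fields,
    ["document_number", "surname", "given_names"],
    min_confidence=0.92,
)
print(report)
```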

## Troubleshooting

### Common Issues

1. **Connection Refused**
   - Check if the API is running
   - Verify the correct URL and port
   - Check firewall settings

2. **Timeout Errors**
   - Increase the timeout with the `--timeout` option (default: 30 seconds)
   - Check API performance and resource usage

3. **Missing Fields**
   - Verify test images contain the expected text
   - Check field extraction patterns in the code
   - Review API logs for processing errors

4. **Validation Errors**
   - Check API response format
   - Verify model is loaded correctly
   - Review error logs for details

### Debug Mode

Enable verbose logging for detailed debugging:
```bash
python test_api_endpoint.py --verbose
```

## Integration with CI/CD

The test scripts can be integrated into CI/CD pipelines:

```yaml
# Example GitHub Actions step
- name: Test API Endpoint
  run: |
    python scripts/test_api_endpoint.py --url ${{ env.API_URL }} --timeout 60
```

## Performance Monitoring

The scripts provide performance metrics that can be used for monitoring:

- Processing time per image
- Success rate
- Field extraction accuracy
- Response validation results

These metrics can be integrated with monitoring systems like Prometheus or DataDog.
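
One low-effort route to Prometheus is to write the metrics in its text exposition format and let an existing collector pick up the file. The sketch below renders a flat dict of numeric metrics as gauges; the `dots_ocr_test` prefix and the metric names are made up for illustration, and the test scripts are not known to emit this format themselves.

```python
def to_prometheus_text(metrics: dict, prefix: str = "dots_ocr_test") -> str:
    """Render numeric metrics as gauges in Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {prefix}_{name} gauge")
        lines.append(f"{prefix}_{name} {float(value)}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names; values taken from the sample run in this README.
print(to_prometheus_text({"success_rate": 100.0, "average_time_seconds": 1.84}))
```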

## 🚀 Production API Testing

### Current Production Endpoint
- **URL**: https://algoryn-dots-ocr-idcard.hf.space
- **Health Check**: https://algoryn-dots-ocr-idcard.hf.space/health
- **API Docs**: https://algoryn-dots-ocr-idcard.hf.space/docs
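
The health endpoint can also be polled from Python with the standard library alone. This sketch assumes the payload has the shape shown in the sample test output (`status`, `version`, `model_loaded`); `fetch_health` and `is_healthy` are hypothetical helper names.

```python
import json
import urllib.request


def is_healthy(payload: dict) -> bool:
    """Treat the API as ready only if it reports healthy and the model is loaded."""
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def fetch_health(base_url: str, timeout: float = 30.0) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `is_healthy(fetch_health("https://algoryn-dots-ocr-idcard.hf.space"))` returns `True` only when the Space is running and its model has finished loading.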

### Quick Production Test
```bash
# Test production API
./run_tests.sh -e production

# Quick test with curl (no Python dependencies)
./test_production_curl.sh
```

### Staging Environment
- **Staging URL**: https://algoryn-dots-ocr-idcard-staging.hf.space (to be created)
- **Purpose**: Safe testing before production deployment

### Environment-Specific Testing
```bash
# Test different environments
./run_tests.sh -e local      # Local development
./run_tests.sh -e staging    # Staging environment
./run_tests.sh -e production # Production environment
```

---

### 5. `test_debug_ocr.sh` - Per-request debug logging via curl

Use this script for quick, dependency-light testing of the server-side debug mode, which prints OCR snippets, extracted fields, and MRZ details to the server logs.

**Usage:**
```bash
# Local server (per-request debug on)
./test_debug_ocr.sh -u http://localhost:7860 -f tom_id_card_front.jpg -d

# Hugging Face Space (replace with your Space URL)
./test_debug_ocr.sh -u https://<your-space>.hf.space -f tom_id_card_front.jpg -d \
  -r '{"x1":0,"y1":0,"x2":1,"y2":0.5}'
```

You can also enable debug globally on the server with `DOTS_OCR_DEBUG=1`. The script only toggles the request-level flag via `-d`.