---
title: KYB Dots.OCR Text Extraction
emoji: 🖨️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: "other"
---

# KYB Dots.OCR Text Extraction

This [Hugging Face Space](https://huggingface.co/docs/hub/spaces) provides a FastAPI endpoint for extracting text from identity documents with Dots.OCR, with optional ROI (Region of Interest) support. It runs as a Docker Space, which gives full control over dependencies and the runtime environment.

## 🚀 Quick Start

### Using the API
1. **Upload an image** (JPEG, PNG, or other supported formats)
2. **Optionally specify ROI** coordinates for targeted extraction
3. **Get structured results** with confidence scores and field mapping

### Test the API
```bash
# Basic OCR test
curl -X POST https://algoryn-dots-ocr-idcard.hf.space/v1/id/ocr \
  -F "file=@test_image.jpg"

# With ROI (region of interest)
curl -X POST https://algoryn-dots-ocr-idcard.hf.space/v1/id/ocr \
  -F "file=@test_image.jpg" \
  -F 'roi={"x1":0.1,"y1":0.1,"x2":0.9,"y2":0.9}'
```
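
The same request can be built from Python. The sketch below uses only the standard library and assumes the endpoint accepts exactly the `file` and `roi` form fields shown in the curl examples. `build_ocr_request` is an illustrative helper name, not part of the service.

```python
import json
import mimetypes
import urllib.request
import uuid

# URL taken from the curl examples above.
OCR_URL = "https://algoryn-dots-ocr-idcard.hf.space/v1/id/ocr"


def build_ocr_request(image_bytes, filename="test_image.jpg", roi=None, url=OCR_URL):
    """Build (but do not send) a multipart/form-data POST for the OCR endpoint."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"

    # The image goes in the "file" form field, as in the curl example.
    parts = [
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        + image_bytes
        + b"\r\n"
    ]

    # The optional ROI is sent as a JSON string in the "roi" form field.
    if roi is not None:
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="roi"\r\n\r\n'
                f"{json.dumps(roi)}\r\n"
            ).encode("utf-8")
        )

    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To call the service, read the image bytes from disk, pass them to `build_ocr_request` (optionally with an ROI dict such as `{"x1": 0.1, "y1": 0.1, "x2": 0.9, "y2": 0.9}`), and hand the result to `send`.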

## ✨ Features

- **🔍 Text Extraction**: Extract text from identity documents using Dots.OCR
- **📍 ROI Support**: Process pre-cropped images or full images with ROI coordinates
- **📋 Field Mapping**: Structured field extraction with confidence scores
- **🆔 MRZ Detection**: Machine Readable Zone data extraction
- **🔌 Standardized API**: Consistent response format for integration
- **🐳 Docker-based**: Full control over dependencies and environment
- **⚡ GPU Support**: Optimized for Hugging Face Spaces GPU instances

## 📡 API Endpoints

### Health Check
```http
GET /health
```
Returns service status and version information.

### Text Extraction
```http
POST /v1/id/ocr
Content-Type: multipart/form-data

file: <image_file>
roi: {"x1": 0.0, "y1": 0.0, "x2": 1.0, "y2": 1.0} (optional)
```

**Parameters:**
- `file`: Image file to process (required)
- `roi`: JSON string with normalized coordinates (optional)
  - `x1`, `y1`: Top-left corner (0.0 to 1.0)
  - `x2`, `y2`: Bottom-right corner (0.0 to 1.0)
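
Because the ROI is normalized, a client can validate it before sending and map it to pixel coordinates for any image size. The following is a minimal sketch. The helper name `roi_to_pixels`, the strict `x1 < x2` / `y1 < y2` check, and the rounding policy are assumptions, not the service's documented behavior.

```python
import json


def roi_to_pixels(roi_json, width, height):
    """Validate a normalized ROI JSON string and convert it to a pixel box."""
    roi = json.loads(roi_json)
    x1, y1, x2, y2 = (float(roi[key]) for key in ("x1", "y1", "x2", "y2"))

    # Every coordinate must lie in the normalized range [0.0, 1.0].
    if not all(0.0 <= value <= 1.0 for value in (x1, y1, x2, y2)):
        raise ValueError("ROI coordinates must be between 0.0 and 1.0")

    # The top-left corner must come before the bottom-right corner (assumed rule).
    if x1 >= x2 or y1 >= y2:
        raise ValueError("ROI must satisfy x1 < x2 and y1 < y2")

    # Scale to pixels; rounding to the nearest pixel is an assumption.
    return (
        round(x1 * width),
        round(y1 * height),
        round(x2 * width),
        round(y2 * height),
    )
```

For a 1000x500 image, the ROI `{"x1":0.1,"y1":0.1,"x2":0.9,"y2":0.9}` maps to the pixel box `(100, 50, 900, 450)`.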

## 📄 Response Format

```json
{
  "request_id": "uuid",
  "media_type": "image",
  "processing_time": 0.456,
  "detections": [
    {
      "mrz_data": {
        "document_type": "TD3",
        "issuing_country": "NLD",
        "surname": "MULDER",
        "given_names": "THOMAS",
        "document_number": "NLD123456789",
        "nationality": "NLD",
        "date_of_birth": "1990-01-01",
        "gender": "M",
        "date_of_expiry": "2030-01-01",
        "personal_number": "123456789",
        "raw_mrz": "P<NLDMULDER<<THOMAS<<<<<<<<<<<<<<<<<<<<<<<<<",
        "confidence": 0.95
      },
      "extracted_fields": {
        "document_number": {
          "field_name": "document_number",
          "value": "NLD123456789",
          "confidence": 0.92,
          "source": "ocr"
        },
        "surname": {
          "field_name": "surname",
          "value": "MULDER",
          "confidence": 0.96,
          "source": "ocr"
        }
      }
    }
  ]
}
```
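
A consumer will typically keep only the fields it trusts. The sketch below walks the `detections` list and collects `extracted_fields` whose `confidence` meets a threshold. The function name `confident_fields` and the 0.9 default are illustrative choices, and the sample is a trimmed copy of the response above.

```python
def confident_fields(response, threshold=0.9):
    """Return {field_name: value} for extracted fields at or above `threshold`."""
    result = {}
    for detection in response.get("detections", []):
        for name, field in detection.get("extracted_fields", {}).items():
            if field.get("confidence", 0.0) >= threshold:
                result[name] = field["value"]
    return result


# Trimmed copy of the example response shown above.
sample = {
    "request_id": "uuid",
    "detections": [
        {
            "extracted_fields": {
                "document_number": {"value": "NLD123456789", "confidence": 0.92},
                "surname": {"value": "MULDER", "confidence": 0.96},
            }
        }
    ],
}

print(confident_fields(sample, threshold=0.95))  # prints {'surname': 'MULDER'}
```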

## 🛠️ Deployment to Hugging Face Spaces

### Prerequisites
- [Hugging Face CLI](https://huggingface.co/docs/hub/install-huggingface-cli) installed
- Docker installed locally (for testing)

### 1. Create HF Space
```bash
# Login to Hugging Face
huggingface-cli login

# Create a new Docker Space
huggingface-cli repo create dots-ocr-idcard --type space --space_sdk docker --organization algoryn
```

### 2. Clone and Setup
```bash
# Clone the space locally
git clone https://huggingface.co/spaces/algoryn/dots-ocr-idcard
cd dots-ocr-idcard

# Copy required files
cp /path/to/kybtech-ml-pipelines/docker/hf/dots-ocr/* .

# Copy field extraction module
mkdir -p src/idcard_api
cp /path/to/kybtech-ml-pipelines/src/idcard_api/field_extraction.py src/idcard_api/
touch src/idcard_api/__init__.py
```

### 3. Deploy
```bash
git add .
git commit -m "Deploy Dots-OCR text extraction service"
git push
```

### 4. Test Deployment
The Space will be available at `https://algoryn-dots-ocr-idcard.hf.space` after deployment (usually 5-10 minutes).

## ⚙️ Configuration

### Environment Variables
- `HF_DOTS_MODEL_PATH`: Path to Dots.OCR model weights
- `HF_DOTS_CONFIDENCE_THRESHOLD`: Confidence threshold for field extraction
- `HF_DOTS_DEVICE`: Device to use (auto, cpu, cuda)
- `HF_DOTS_MAX_IMAGE_SIZE`: Maximum image size for processing
- `HF_DOTS_MRZ_ENABLED`: Enable MRZ detection

### Hugging Face Spaces Settings
- **SDK**: Docker
- **Port**: 7860 (default)
- **Hardware**: CPU (upgradeable to GPU)
- **Storage**: Persistent storage available for model caching

## 📊 Performance

| Hardware | Processing Time | Memory Usage |
|----------|----------------|--------------|
| **GPU** | 300-900ms | ~6GB |
| **CPU** | 3-8s | ~2GB |

## 🔒 Privacy & Security

- **No Data Storage**: Images are processed temporarily and not stored
- **Privacy Protection**: All field values are redacted in logs
- **Secure Processing**: Runs in isolated Docker containers
- **No Tracking**: No user data or usage analytics collected
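
Redacting field values in logs can be illustrated with a small helper that masks every `value` while keeping non-sensitive metadata such as `confidence` and `source`. This is a sketch of the idea only. The `redact_fields` name and the `[REDACTED]` marker are assumptions, not the service's implementation.

```python
import logging

logger = logging.getLogger("dots_ocr")


def redact_fields(extracted_fields):
    """Return a copy of `extracted_fields` with every value masked for logging."""
    return {
        name: {**field, "value": "[REDACTED]"}
        for name, field in extracted_fields.items()
    }


fields = {
    "surname": {
        "field_name": "surname",
        "value": "MULDER",
        "confidence": 0.96,
        "source": "ocr",
    }
}

# Only the masked copy is logged; the original dict is left untouched.
logger.info("extracted fields: %s", redact_fields(fields))
```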

## 🐳 Local Development

### Quick Start with uv
```bash
# Set up development environment
make setup

# Activate virtual environment
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate     # On Windows

# Run the application
make run-dev
```

### Docker Development
```bash
# Build and run with Docker
make build
make run-docker

# View logs
make logs
```

### Development Commands
```bash
# Run tests
make test

# Format code
make format

# Run linting
make lint

# Test API endpoints
make test-local
make test-production
```

For detailed development instructions, see the documentation in `docs/`.

## 📚 Documentation

- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
- [Docker Spaces Guide](https://huggingface.co/docs/hub/spaces-sdks-docker)
- [FastAPI Documentation](https://fastapi.tiangolo.com/)

## 🀝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request

## 📄 License

This project is licensed under a private license. See the license file for details.

## 🆘 Support

- **Issues**: Report bugs and request features via GitHub Issues
- **Discussions**: Join the community discussions
- **Email**: Contact us at website@huggingface.co for advanced support

---

Built with ❤️ using [Hugging Face Spaces](https://huggingface.co/docs/hub/spaces) and FastAPI