---
license: apache-2.0
datasets:
- squirelmail/dataset-BotDetect-CAPTCHA-Generator
language:
- en
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: keras
tags:
- ocr
- captcha
- crnn
- ctc
- tensorflow
- keras
- 50x250
- uppercase
- digits
---
# AI Model for Solving BotDetect-CAPTCHA-Generator Gov ID CAPTCHAs
🧠 CRNN+CTC Checkpoints
=======================
This directory contains **Keras 3** `save_weights`-style checkpoints produced while training a CRNN + CTC model for 5-character uppercase/digit CAPTCHAs (image size `H=50`, `W=250`, grayscale).
* * *
📁 Contents
-----------
* `captcha_best.weights.h5`: best validation loss (auto-updated during training).
* `captcha_epNNN.weights.h5`: per-epoch snapshots (e.g., `captcha_ep001.weights.h5` through `captcha_ep022.weights.h5`).
All files are _weights only_; they must be loaded into the same model architecture used in training (the tester builds that architecture for you).
* * *
✅ Model Result: captcha_ep022.weights.h5 => 90.91% Accuracy
-----------
```
(venv) root@prod-exploit-sa-all-01:/home/infra# date && python3 cek_model_v6.py --weights captcha_ep022.weights.h5 --data-root ./dataset_1000_rand --sample 24000 && date
Thu Oct 30 01:12:49 WITA 2025
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761571.108235 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
I0000 00:00:1761761571.304280 2264160 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761575.452128 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
Found weights: captcha_ep022.weights.h5 | size: 27757.0 KB | mtime: Thu Oct 30 01:02:51 2025
E0000 00:00:1761761576.513960 2264160 cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
TF GPUs: []
OK: weights loaded.
Base output shape: (None, 31, 37)
Testing on 24000 samples from ./dataset_1000_rand ...
W0000 00:00:1761761611.159498 2264160 cpu_allocator_impl.cc:84] Allocation of 1200000000 exceeds 10% of free system memory.
00 GT: 976VF | Pred: 976VF
01 GT: 7W20H | Pred: 7W20H
02 GT: UUU24 | Pred: UUU24
03 GT: 1EMVZ | Pred: 1EMVZ
04 GT: WY4RD | Pred: WY4RD
05 GT: 0GNKE | Pred: 0GNKE
06 GT: 7Y5TY | Pred: 7Y5TY
07 GT: OC8C1 | Pred: OC8C1
08 GT: 5ZIDQ | Pred: 5ZIDQ
09 GT: LP8IP | Pred: LP8IP
10 GT: AKQ7G | Pred: AKQ7G
11 GT: X23QD | Pred: X23QD
Exact match: 90.91% | Mean CER: 0.0194
Total images tested: 24000
Thu Oct 30 01:18:07 WITA 2025
```
* * *
📦 Requirements
---------------
Install from the pinned list in the repo root:

    # (recommended) fresh virtualenv
    python3 -m venv venv
    source venv/bin/activate
    # install exact deps
    pip install -r captcha_requirements.txt

**Important:** Keras/TensorFlow versions should match what was used during training. If you trained with TF/Keras nightly or dev builds, test in the same environment to avoid weight-loading shape/key mismatches.
* * *
🧪 How to Test
--------------
The tester script re-creates the training graph (CRNN+CTC), loads the selected checkpoint, and runs inference with the _base_ (CTC-free) submodel.
### 1) Single image

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png

Optional ground truth override:

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png \
      --gt K9NO2

### 2) Batch from a dataset

    python3 check_model.py \
      --weights /home/infra/models/captcha_ep002.weights.h5 \
      --data-root /datasets/dataset_500 \
      --samples 64

Expected directory layout for `--data-root`:

    /datasets/dataset_500/
    ├── style0/
    │   ├── A1B2C.png
    │   └── ...
    ├── style1/
    │   └── ...
    ├── ...
    └── style59/

**Image format:** grayscale PNG, resized to `50x250` (height x width) in the script.
**Labels:** derived from the filename without its extension, which must match the regex `^[A-Z0-9]{5}$`.
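
The label rule can be sketched in a few lines of Python. This is a minimal illustration, not the tester's actual code: the helper name `label_from_path` and the rejected example paths are hypothetical, and it assumes the regex is applied to the filename stem.

```python
import re
from pathlib import Path

# Label pattern quoted above: exactly five uppercase letters or digits.
LABEL_RE = re.compile(r"^[A-Z0-9]{5}$")

def label_from_path(path):
    """Return the 5-char label encoded in the filename, or None if it does not match."""
    stem = Path(path).stem  # "K9NO2.png" -> "K9NO2"
    return stem if LABEL_RE.match(stem) else None

paths = [
    "/datasets/dataset_500/style7/K9NO2.png",
    "/datasets/dataset_500/style0/A1B2C.png",
    "/datasets/dataset_500/style0/notes.txt",   # hypothetical: lowercase stem, rejected
    "/datasets/dataset_500/style1/ABC123.png",  # hypothetical: six characters, rejected
]
# Keep only the files whose stem is a valid label.
labeled = [(p, label_from_path(p)) for p in paths if label_from_path(p)]
print(labeled)
```

Files whose names do not match are simply skipped in this sketch; the real script may instead warn or fail.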
* * *
🧩 Model Details (for reference)
--------------------------------
* Backbone: 3× (Conv2D + BN + MaxPool), then reshape to time-steps.
* RNN head: 2× BiLSTM(128), `return_sequences=True`.
* Classifier: Dense(`num_classes = 36 + 1`) with softmax; `+1` is the CTC blank.
* Time steps: width is downsampled by 8 ⇒ `floor(250/8) = 31` time steps.
The tester script internally builds both: `model_with_ctc` (training graph) and `base_model` (inference). It loads weights into the training graph and then uses `base_model` for predictions.
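
The time-step count can be checked with plain arithmetic. The sketch below assumes each MaxPool halves the width with a 2-wide window, stride 2, and no padding, so odd sizes are floored at each stage; `pooled_size` is a hypothetical helper, not part of the tester.

```python
def pooled_size(size, num_pools=3, pool=2):
    """Size after `num_pools` poolings with window/stride `pool`, flooring each time."""
    for _ in range(num_pools):
        size //= pool
    return size

WIDTH = 250
NUM_CLASSES = 36 + 1  # 0-9 + A-Z, plus the CTC blank

time_steps = pooled_size(WIDTH)  # 250 -> 125 -> 62 -> 31
print((None, time_steps, NUM_CLASSES))
```

Under these assumptions the result agrees with the `Base output shape: (None, 31, 37)` line in the test log above.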
* * *
🎛️ CLI Options
---------------

    --weights <path>   : required, *.weights.h5 (same architecture)
    --image <path>     : test a single image
    --gt <text>        : ground truth for --image (default: file name)
    --data-root <dir>  : style0..style59 folders for batch testing
    --samples N        : max number of images for batch test (default 64)
    --height H         : input height (default 50)
    --width W          : input width (default 250)
    --ext png|jpg      : image extension for batch (default png)
    --show K           : print K sample predictions (default 12)

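
For orientation, here is a hypothetical `argparse` parser that mirrors the options listed above. It is a reconstruction from this table, not the tester's real source. If the tester is built on `argparse` (an assumption), unambiguous option prefixes are accepted by default, which would explain why the logged run earlier passes `--sample` rather than `--samples`.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the documented options; not the tester's actual code."""
    p = argparse.ArgumentParser(description="CRNN+CTC checkpoint tester (sketch)")
    p.add_argument("--weights", required=True, help="*.weights.h5 checkpoint")
    p.add_argument("--image", help="test a single image")
    p.add_argument("--gt", help="ground truth for --image (default: file name)")
    p.add_argument("--data-root", help="folder containing style0..style59")
    p.add_argument("--samples", type=int, default=64, help="max images for batch test")
    p.add_argument("--height", type=int, default=50, help="input height")
    p.add_argument("--width", type=int, default=250, help="input width")
    p.add_argument("--ext", choices=["png", "jpg"], default="png", help="image extension")
    p.add_argument("--show", type=int, default=12, help="sample predictions to print")
    return p

# "--sample" is an unambiguous prefix of "--samples", so argparse resolves it.
args = build_parser().parse_args(
    ["--weights", "captcha_ep022.weights.h5",
     "--data-root", "./dataset_1000_rand",
     "--sample", "24000"]
)
print(args.samples, args.data_root, args.height, args.width)
```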
* * *
📊 Output
---------
* Per-sample preview lines: `GT: ABC12 | Pred: ABC12`
* Aggregate metrics:
  * **Exact match** (% of predictions exactly equal to GT)
  * **Mean CER** (character error rate)
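
Both metrics can be reproduced with a short pure-Python sketch. It assumes CER is the Levenshtein edit distance divided by the ground-truth length, averaged over samples; the tester's exact formula is not shown here and may differ. The mismatching pairs are made-up illustrations, not model output.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert/delete/substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free when equal)
            ))
        prev = curr
    return prev[-1]

def evaluate(pairs):
    """Return (exact-match %, mean CER) for a list of (ground_truth, prediction) pairs."""
    exact = sum(gt == pred for gt, pred in pairs)
    cer = sum(edit_distance(gt, pred) / max(len(gt), 1) for gt, pred in pairs)
    return 100.0 * exact / len(pairs), cer / len(pairs)

# One correct prediction, one substitution (0 -> O), one dropped character.
pairs = [("976VF", "976VF"), ("7W20H", "7W2OH"), ("UUU24", "UUU2")]
exact_pct, mean_cer = evaluate(pairs)
print(f"Exact match: {exact_pct:.2f}% | Mean CER: {mean_cer:.4f}")
```

Exact match is strict: a single wrong character fails the whole sample, while CER only charges that sample one fifth of an error.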
* * *
🧯 Troubleshooting
------------------
* **"A total of 1 objects could not be loaded… <Dense name=predictions>"**
  Mismatch between Keras/TF versions or model definition. Use the same environment and architecture as training.
* **GPU not used**
  Ensure a CUDA-enabled TF build and matching drivers. For server-side issues, test with:

      import tensorflow as tf
      print(tf.config.list_physical_devices('GPU'))

* **NaN loss during training**
  Check the label regex filtering, confirm `input_length=31`, use `int32` for CTC inputs, and disable LSTM dropouts when using cuDNN (set them to `0.0`).
* * *
🔍 Notes
--------
* CTC blank ID = `36` (since charset is 36 chars: 0-9 + A-Z).
* All checkpoints here are _weights only_; to export a full model, save the base model as `.keras` after loading weights in the same environment:

      model_with_ctc, base_model = build_models(...)
      model_with_ctc.load_weights("captcha_epXXX.weights.h5")
      base_model.save("captcha_epXXX_base.keras")

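
To show how the blank ID from the first note is used at inference time, here is a minimal greedy CTC decode in pure Python. It assumes the charset is ordered `0-9` then `A-Z` (IDs 0..35) and that the input is the per-time-step argmax of the softmax output. The frame values are hypothetical, and the tester may decode differently (for example with a beam search).

```python
import string

CHARSET = string.digits + string.ascii_uppercase  # assumed order: "0-9" then "A-Z"
BLANK_ID = len(CHARSET)                           # 36, the CTC blank

def ctc_greedy_decode(frame_ids):
    """Collapse consecutive repeats, drop blanks, and map the remaining IDs to characters."""
    chars = []
    prev = None
    for idx in frame_ids:
        # Emit only when the ID changes and is not the blank.
        if idx != prev and idx != BLANK_ID:
            chars.append(CHARSET[idx])
        prev = idx
    return "".join(chars)

# Hypothetical per-time-step argmax IDs, shortened from the real 31 steps.
B = BLANK_ID
frames = [B, 20, 20, B, 9, B, 34, 34, B, B, 4, B, 4, 4, B]
print(ctc_greedy_decode(frames))
```

The blank between the two runs of ID `4` is what lets a doubled character survive the repeat-collapsing step.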