---
license: apache-2.0
datasets:
- squirelmail/dataset-BotDetect-CAPTCHA-Generator
language:
- en
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: keras
tags:
- ocr
- captcha
- crnn
- ctc
- tensorflow
- keras
- 50x250
- uppercase
- digits
---

# AI Model for Solving BotDetect-CAPTCHA-Generator Gov ID CAPTCHAs

🧠 CRNN+CTC Checkpoints
=======================

This directory contains **Keras 3** `save_weights`-style checkpoints produced while training a CRNN + CTC model for 5-character uppercase/digit CAPTCHAs (grayscale images, `H=50`, `W=250`).

* * *

Contents
--------

* `captcha_best.weights.h5`: checkpoint with the best validation loss (updated automatically during training).
* `captcha_epNNN.weights.h5`: per-epoch snapshots (e.g., `captcha_ep001.weights.h5` … `captcha_ep022.weights.h5`).

All files are _weights only_; they must be loaded into the same model architecture used in training (the tester script builds that architecture for you).
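
To pick a snapshot programmatically, the naming scheme above can be parsed directly. The following is a minimal sketch, not part of the tester script; the helper names `list_epoch_checkpoints` and `latest_checkpoint` are hypothetical, and it assumes the epoch number is the run of digits after `captcha_ep`.

```python
import re
from pathlib import Path

# Matches per-epoch snapshots such as captcha_ep007.weights.h5.
EPOCH_RE = re.compile(r"captcha_ep(\d+)\.weights\.h5")


def list_epoch_checkpoints(directory):
    """Return (epoch, path) pairs sorted by epoch number."""
    found = []
    for path in Path(directory).iterdir():
        match = EPOCH_RE.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)


def latest_checkpoint(directory):
    """Return the path of the highest-numbered snapshot, or None if there is none."""
    pairs = list_epoch_checkpoints(directory)
    return pairs[-1][1] if pairs else None
```

`captcha_best.weights.h5` is deliberately not matched, since it tracks validation loss rather than an epoch number.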

* * *

✅ Model Result: `captcha_ep022.weights.h5` => 90.91% Accuracy
-----------

```
(venv) root@prod-exploit-sa-all-01:/home/infra# date && python3 cek_model_v6.py --weights captcha_ep022.weights.h5 --data-root ./dataset_1000_rand --sample 24000 && date
Thu Oct 30 01:12:49 WITA 2025
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761571.108235 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
I0000 00:00:1761761571.304280 2264160 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761575.452128 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
Found weights: captcha_ep022.weights.h5 | size: 27757.0 KB | mtime: Thu Oct 30 01:02:51 2025
E0000 00:00:1761761576.513960 2264160 cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
TF GPUs: []
OK: weights loaded.
Base output shape: (None, 31, 37)
Testing on 24000 samples from ./dataset_1000_rand ...
W0000 00:00:1761761611.159498 2264160 cpu_allocator_impl.cc:84] Allocation of 1200000000 exceeds 10% of free system memory.
00 GT: 976VF | Pred: 976VF
01 GT: 7W20H | Pred: 7W20H
02 GT: UUU24 | Pred: UUU24
03 GT: 1EMVZ | Pred: 1EMVZ
04 GT: WY4RD | Pred: WY4RD
05 GT: 0GNKE | Pred: 0GNKE
06 GT: 7Y5TY | Pred: 7Y5TY
07 GT: OC8C1 | Pred: OC8C1
08 GT: 5ZIDQ | Pred: 5ZIDQ
09 GT: LP8IP | Pred: LP8IP
10 GT: AKQ7G | Pred: AKQ7G
11 GT: X23QD | Pred: X23QD

Exact match: 90.91% | Mean CER: 0.0194

Total images tested: 24000

Thu Oct 30 01:18:07 WITA 2025
```

* * *

📦 Requirements
---------------

Install from the pinned list in the repo root:

    # (recommended) fresh virtualenv
    python3 -m venv venv
    source venv/bin/activate

    # install exact deps
    pip install -r captcha_requirements.txt

**Important:** Keras/TensorFlow versions should match what was used during training. If you trained with TF/Keras nightly or dev builds, test in the same environment to avoid weight-loading shape/key mismatches.

* * *

🧪 How to Test
--------------

The tester script re-creates the training graph (CRNN+CTC), loads the selected checkpoint, and runs inference with the _base_ (CTC-free) submodel.

### 1) Single image

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png

Optional ground truth override:

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png \
      --gt K9NO2

### 2) Batch from a dataset

    python3 check_model.py \
      --weights /home/infra/models/captcha_ep002.weights.h5 \
      --data-root /datasets/dataset_500 \
      --samples 64

Expected directory layout for `--data-root`:

    /datasets/dataset_500/
    ├── style0/
    │   ├── A1B2C.png
    │   └── ...
    ├── style1/
    │   └── ...
    ├── ...
    └── style59/

**Image format:** grayscale PNG, resized to `50x250` (height x width) in the script.

**Labels:** derived from the filename without its extension, which must match the regex `^[A-Z0-9]{5}$`.
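
The layout and label rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the tester's actual code: `label_from_path` and `collect_samples` are hypothetical names, and files whose stem fails the regex are assumed to be skipped.

```python
import re
from pathlib import Path

# A valid label is exactly five uppercase letters or digits.
LABEL_RE = re.compile(r"[A-Z0-9]{5}")


def label_from_path(path):
    """Return the label encoded in the filename, or None if it is not valid."""
    stem = Path(path).stem
    return stem if LABEL_RE.fullmatch(stem) else None


def collect_samples(data_root, ext="png"):
    """Scan style*/ subfolders and return sorted (path, label) pairs."""
    samples = []
    for image_path in sorted(Path(data_root).glob(f"style*/*.{ext}")):
        label = label_from_path(image_path)
        if label is not None:  # skip files that do not encode a label
            samples.append((image_path, label))
    return samples
```

For example, `label_from_path("/workspace/dataset_500/style7/K9NO2.png")` yields `"K9NO2"`, while a lowercase or six-character stem yields `None`.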

* * *

🧩 Model Details (for reference)
--------------------------------

* Backbone: 3× (Conv2D + BN + MaxPool), then reshape to time steps.
* RNN head: 2× BiLSTM(128), `return_sequences=True`.
* Classifier: Dense(`num_classes = 36 + 1`) with softmax; the `+1` is the CTC blank.
* Time steps: the width is downsampled by 8, so `250 // 8 = 31` time steps.
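
The shape arithmetic in the list above can be checked by hand. The sketch below assumes each MaxPool halves a dimension with floor rounding (a 2x2 pool with stride 2 and no padding); that detail is not stated here, but it is consistent with the `(None, 31, 37)` base output shape reported in the test log.

```python
def pooled_size(size, num_pools=3, pool=2):
    """Apply `num_pools` floor-dividing poolings to one spatial dimension."""
    for _ in range(num_pools):
        size //= pool
    return size


HEIGHT, WIDTH = 50, 250
NUM_CLASSES = 36 + 1  # 36 characters plus the CTC blank

time_steps = pooled_size(WIDTH)     # 250 -> 125 -> 62 -> 31
feature_rows = pooled_size(HEIGHT)  # 50 -> 25 -> 12 -> 6

print("base output shape:", (None, time_steps, NUM_CLASSES))
```

With 31 time steps for 5 characters, CTC has room for blanks between repeated characters such as `UUU24`.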

The tester script internally builds both `model_with_ctc` (training graph) and `base_model` (inference). It loads the weights into the training graph and then uses `base_model` for predictions.
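
Turning the base model's per-time-step softmax output into text requires a CTC decoding step. The sketch below shows greedy (best-path) decoding in plain Python. It is an assumption that the tester decodes this way, and the class order (digits first, then `A-Z`, blank last) is inferred from the blank ID of 36; `one_hot` is a hypothetical helper used only to fake model output.

```python
import string

# Assumed class order: digits 0-9 first, then A-Z; the blank is the last class.
CHARSET = string.digits + string.ascii_uppercase
BLANK_ID = len(CHARSET)  # 36


def greedy_ctc_decode(step_probs):
    """Best-path CTC decoding: collapse repeated classes, then drop blanks."""
    decoded = []
    previous = None
    for row in step_probs:
        best = max(range(len(row)), key=row.__getitem__)  # argmax of one time step
        if best != previous and best != BLANK_ID:
            decoded.append(CHARSET[best])
        previous = best
    return "".join(decoded)


def one_hot(index, size=len(CHARSET) + 1):
    """Hypothetical helper: a fake softmax row that is certain about `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy sequence: "U", blank, "U", "U" (repeat), blank, "2", "4"
ids = [30, 36, 30, 30, 36, 2, 4]
print(greedy_ctc_decode([one_hot(i) for i in ids]))  # prints UU24
```

The blank between the first two `U` steps is what lets a doubled character survive the collapse; the repeated `U` without a blank in between is merged into one.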

* * *

🎛️ CLI Options
---------------

    --weights <path>     : required, *.weights.h5 (same architecture)
    --image <path>       : test a single image
    --gt <text>          : ground truth for --image (default: file name)
    --data-root <dir>    : style0..style59 folders for batch testing
    --samples N          : max number of images for batch test (default 64)
    --height H           : input height (default 50)
    --width W            : input width (default 250)
    --ext png|jpg        : image extension for batch (default png)
    --show K             : print K sample predictions (default 12)
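
For readers who want to wrap or re-implement the tester, the options above map naturally onto `argparse`. This is a hypothetical reconstruction from the option list, not the script's actual parser; only the flag names and defaults come from this README.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(description="Test a CRNN+CTC CAPTCHA checkpoint")
    parser.add_argument("--weights", required=True, help="path to a *.weights.h5 file")
    parser.add_argument("--image", help="test a single image")
    parser.add_argument("--gt", help="ground truth for --image (default: file name)")
    parser.add_argument("--data-root", help="folder containing style0..style59")
    parser.add_argument("--samples", type=int, default=64, help="max images for batch test")
    parser.add_argument("--height", type=int, default=50, help="input height")
    parser.add_argument("--width", type=int, default=250, help="input width")
    parser.add_argument("--ext", choices=["png", "jpg"], default="png", help="batch image extension")
    parser.add_argument("--show", type=int, default=12, help="number of sample predictions to print")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["--weights", "captcha_best.weights.h5", "--data-root", "/datasets/dataset_500"]
)
print(args.samples, args.height, args.width, args.ext, args.show)
```

`argparse` exposes `--data-root` as `args.data_root`, and any option left out falls back to the default listed above.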

* * *

Output
------

* Per-sample preview lines: `GT: ABC12 | Pred: ABC12`
* Aggregate metrics:
    * **Exact match** (% of predictions exactly equal to GT)
    * **Mean CER** (character error rate)
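
Both aggregate metrics are easy to reproduce. The sketch below assumes CER is the Levenshtein edit distance divided by the ground-truth length, averaged over samples; the README does not spell out the exact formula, so treat this as one common definition rather than the tester's implementation. The sample pairs are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def evaluate(pairs):
    """Return (exact-match percentage, mean CER) for (gt, pred) pairs."""
    exact = sum(gt == pred for gt, pred in pairs)
    cer_sum = sum(edit_distance(gt, pred) / len(gt) for gt, pred in pairs)
    return 100.0 * exact / len(pairs), cer_sum / len(pairs)


# Made-up predictions: one correct, one substitution (0 -> O), one missing character.
pairs = [("976VF", "976VF"), ("7W20H", "7W2OH"), ("UUU24", "UUU2")]
exact_pct, mean_cer = evaluate(pairs)
print(f"Exact match: {exact_pct:.2f}% | Mean CER: {mean_cer:.4f}")
```

Exact match penalizes any single wrong character as a full miss, while CER gives partial credit, which is why a model can show a low CER alongside a noticeably lower exact-match rate.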

* * *

🧯 Troubleshooting
------------------

* **“A total of 1 objects could not be loaded… `<Dense name=predictions>`”**
  Mismatch between Keras/TF versions or model definition. Use the same environment and architecture as training.
* **GPU not used**
  Ensure a CUDA-enabled TF build and matching drivers. To check what TensorFlow can see on the server, run:

      import tensorflow as tf
      print(tf.config.list_physical_devices('GPU'))

* **NaN loss during training**
  Check that labels pass the regex filter, that `input_length` is `31`, that CTC inputs use `int32`, and that LSTM dropout is set to `0.0` when using cuDNN.

* * *

Notes
-----

* CTC blank ID = `36` (the charset has 36 characters: 0-9 + A-Z).
* All checkpoints here are _weights only_; to export a full model, save the base model as `.keras` after loading the weights in the same environment:

      # build_models comes from the tester script; its arguments are elided here
      model_with_ctc, base_model = build_models(...)
      # load into the training graph, then save the inference submodel
      model_with_ctc.load_weights("captcha_epXXX.weights.h5")
      base_model.save("captcha_epXXX_base.keras")