---
license: apache-2.0
datasets:
- squirelmail/dataset-BotDetect-CAPTCHA-Generator
language:
- en
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: keras
tags:
- ocr
- captcha
- crnn
- ctc
- tensorflow
- keras
- 50x250
- uppercase
- digits
---
# AI Model for Solving BotDetect-CAPTCHA-Generator Gov ID CAPTCHAs

CRNN+CTC Checkpoints
====================

This directory contains **Keras 3** `save_weights`-style checkpoints produced while training a CRNN + CTC model for 5-character uppercase/digit CAPTCHAs (grayscale images, `H=50`, `W=250`).

* * *

Contents
--------

*   `captcha_best.weights.h5`: checkpoint with the best validation loss (auto-updated during training).
*   `captcha_epNNN.weights.h5`: per-epoch snapshots (e.g., `captcha_ep001.weights.h5` … `captcha_ep022.weights.h5`).

All files are _weights only_; they must be loaded into the same model architecture used in training (the tester script builds that architecture for you).

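When several per-epoch snapshots are present, it can help to pick one programmatically. The following is a minimal sketch that assumes only the `captcha_epNNN.weights.h5` naming shown above; `epoch_of` and `latest_snapshot` are hypothetical helper names, not part of the tester script.

```python
import re

# Filename pattern taken from the Contents list: captcha_epNNN.weights.h5
EPOCH_RE = re.compile(r"^captcha_ep(\d{3})\.weights\.h5$")


def epoch_of(filename):
    """Return the epoch number encoded in a snapshot file name, or None."""
    match = EPOCH_RE.match(filename)
    return int(match.group(1)) if match else None


def latest_snapshot(filenames):
    """Return the per-epoch snapshot with the highest epoch number, or None."""
    snapshots = [(epoch_of(name), name) for name in filenames]
    snapshots = [pair for pair in snapshots if pair[0] is not None]
    return max(snapshots)[1] if snapshots else None


if __name__ == "__main__":
    names = [
        "captcha_best.weights.h5",   # ignored: not a per-epoch snapshot
        "captcha_ep001.weights.h5",
        "captcha_ep022.weights.h5",
    ]
    print(latest_snapshot(names))  # captcha_ep022.weights.h5
```

Note that the latest epoch is not necessarily the best one; `captcha_best.weights.h5` tracks the lowest validation loss.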
* * *

Model Result: `captcha_ep022.weights.h5`, 90.91% Exact-Match Accuracy
---------------------------------------------------------------------

```
(venv) root@prod-exploit-sa-all-01:/home/infra# date && python3 cek_model_v6.py --weights captcha_ep022.weights.h5 --data-root ./dataset_1000_rand --sample 24000 && date
Thu Oct 30 01:12:49 WITA 2025
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761571.108235 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
I0000 00:00:1761761571.304280 2264160 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761761575.452128 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
Found weights: captcha_ep022.weights.h5 | size: 27757.0 KB | mtime: Thu Oct 30 01:02:51 2025
E0000 00:00:1761761576.513960 2264160 cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
TF GPUs: []
OK: weights loaded.
Base output shape: (None, 31, 37)
Testing on 24000 samples from ./dataset_1000_rand ...
W0000 00:00:1761761611.159498 2264160 cpu_allocator_impl.cc:84] Allocation of 1200000000 exceeds 10% of free system memory.
00 GT: 976VF | Pred: 976VF
01 GT: 7W20H | Pred: 7W20H
02 GT: UUU24 | Pred: UUU24
03 GT: 1EMVZ | Pred: 1EMVZ
04 GT: WY4RD | Pred: WY4RD
05 GT: 0GNKE | Pred: 0GNKE
06 GT: 7Y5TY | Pred: 7Y5TY
07 GT: OC8C1 | Pred: OC8C1
08 GT: 5ZIDQ | Pred: 5ZIDQ
09 GT: LP8IP | Pred: LP8IP
10 GT: AKQ7G | Pred: AKQ7G
11 GT: X23QD | Pred: X23QD

Exact match: 90.91% | Mean CER: 0.0194

Total images tested: 24000

Thu Oct 30 01:18:07 WITA 2025
```

* * *

Requirements
------------

Install the pinned dependencies from `captcha_requirements.txt` in the repo root:

    # (recommended) fresh virtualenv
    python3 -m venv venv
    source venv/bin/activate

    # install exact deps
    pip install -r captcha_requirements.txt

**Important:** The Keras/TensorFlow versions should match those used during training. If you trained with TF/Keras nightly or dev builds, test in the same environment to avoid shape or key mismatches when loading weights.

* * *

How to Test
-----------

The tester script re-creates the training graph (CRNN+CTC), loads the selected checkpoint, and runs inference with the _base_ (CTC-free) submodel.

### 1) Single image

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png

Optional ground-truth override:

    python3 check_model.py \
      --weights /workspace/captcha_final.weights.h5 \
      --image /workspace/dataset_500/style7/K9NO2.png \
      --gt K9NO2

### 2) Batch from a dataset

    python3 check_model.py \
      --weights /home/infra/models/captcha_ep002.weights.h5 \
      --data-root /datasets/dataset_500 \
      --samples 64

Expected directory layout for `--data-root`:

    /datasets/dataset_500/
    ├── style0/
    │   ├── A1B2C.png
    │   └── ...
    ├── style1/
    │   └── ...
    ├── ...
    └── style59/

**Image format:** grayscale PNG, resized to `50x250` (height x width) in the script.

**Labels:** derived from the file name; the name without its extension must match the regex `^[A-Z0-9]{5}$`.

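The label rule above can be sketched in a few lines of Python. This is an illustration only: `label_from_path` and `collect_samples` are hypothetical names, and the actual tester script may filter files differently.

```python
import re
from pathlib import Path

# Regex quoted in this README: exactly five characters from A-Z and 0-9.
LABEL_RE = re.compile(r"^[A-Z0-9]{5}$")


def label_from_path(path):
    """Return the label encoded in the file name stem, or None if it is invalid."""
    stem = Path(path).stem  # "style7/K9NO2.png" -> "K9NO2"
    return stem if LABEL_RE.match(stem) else None


def collect_samples(data_root, ext="png"):
    """Walk style*/ subfolders and return (path, label) pairs with valid labels."""
    samples = []
    for image_path in sorted(Path(data_root).glob(f"style*/*.{ext}")):
        label = label_from_path(image_path)
        if label is not None:
            samples.append((str(image_path), label))
    return samples


if __name__ == "__main__":
    print(label_from_path("/workspace/dataset_500/style7/K9NO2.png"))  # K9NO2
    print(label_from_path("style0/readme.png"))                        # None
```

Files whose names do not match the pattern are skipped rather than raising an error, so stray files in a `styleN/` folder do not break a batch run.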
* * *

Model Details (for reference)
-----------------------------

*   Backbone: 3× (Conv2D + BN + MaxPool), then reshape to time steps.
*   RNN head: 2× BiLSTM(128), `return_sequences=True`.
*   Classifier: Dense(`num_classes = 36 + 1`) with softmax; the `+1` is the CTC blank.
*   Time steps: the width is downsampled by 8, so `floor(250/8) = 31` time steps.

The tester script internally builds both `model_with_ctc` (training graph) and `base_model` (inference). It loads the weights into the training graph and then uses `base_model` for predictions.

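The time-step count can be checked with plain arithmetic. The sketch below assumes each of the three MaxPool layers halves the width with floor division (pool size 2, no padding); that detail is an assumption, but it reproduces the `(None, 31, 37)` base output shape reported in the log.

```python
def pooled_width(width, num_pools=3, pool=2):
    """Width after repeated pooling with stride == pool size and no padding."""
    for _ in range(num_pools):
        width //= pool  # floor division, as with 'valid' pooling
    return width


NUM_CHARS = 36               # 0-9 + A-Z
NUM_CLASSES = NUM_CHARS + 1  # plus the CTC blank

time_steps = pooled_width(250)    # 250 -> 125 -> 62 -> 31
print((time_steps, NUM_CLASSES))  # (31, 37)
```

With 31 time steps and 5-character labels, the CTC alignment has ample room even for labels with repeated characters, which need a blank between the repeats.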
* * *

CLI Options
-----------

    --weights <path>     : required, *.weights.h5 (same architecture)
    --image <path>       : test a single image
    --gt <text>          : ground truth for --image (default: file name)
    --data-root <dir>    : style0..style59 folders for batch testing
    --samples N          : max number of images for batch test (default 64)
    --height H           : input height (default 50)
    --width W            : input width (default 250)
    --ext png|jpg        : image extension for batch (default png)
    --show K             : print K sample predictions (default 12)

* * *

Output
------

*   Per-sample preview lines: `GT: ABC12 | Pred: ABC12`
*   Aggregate metrics:
    *   **Exact match** (% of predictions exactly equal to GT)
    *   **Mean CER** (character error rate)

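For reference, the two metrics can be computed as follows. This is a minimal sketch that assumes CER is the Levenshtein distance divided by the ground-truth length; the tester script's exact definition may differ, and `summarize` is a hypothetical helper name.

```python
def edit_distance(a, b):
    """Levenshtein distance using a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution
        prev = curr
    return prev[-1]


def cer(gt, pred):
    """Character error rate of one prediction against its ground truth."""
    return edit_distance(gt, pred) / max(len(gt), 1)


def summarize(pairs):
    """pairs: iterable of (gt, pred). Returns (exact match %, mean CER)."""
    pairs = list(pairs)
    exact = sum(gt == pred for gt, pred in pairs)
    mean_cer = sum(cer(gt, pred) for gt, pred in pairs) / len(pairs)
    return 100.0 * exact / len(pairs), mean_cer


if __name__ == "__main__":
    # Toy data: the second prediction confuses the digit 0 with the letter O.
    exact_pct, mean_cer = summarize([("976VF", "976VF"), ("7W20H", "7W2OH")])
    print(f"Exact match: {exact_pct:.2f}% | Mean CER: {mean_cer:.4f}")
```

Exact match is the stricter metric: a single wrong character fails the whole sample, while it adds only 0.2 to that sample's CER for a 5-character label.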
* * *

Troubleshooting
---------------

*   **"A total of 1 objects could not be loaded… `<Dense name=predictions>`"**
    The Keras/TF versions or the model definition do not match those used in training. Use the same environment and architecture as training.
*   **GPU not used**
    Ensure you have a CUDA-enabled TF build and matching drivers. To check which GPUs TensorFlow can see, run:

        import tensorflow as tf
        print(tf.config.list_physical_devices('GPU'))

*   **NaN loss during training**
    Check the label regex filtering, use the correct `input_length=31`, use `int32` for the CTC inputs, and disable LSTM dropout when using cuDNN (set it to `0.0`).

* * *

Notes
-----

*   CTC blank ID = `36` (the charset has 36 characters, `0-9` + `A-Z`, which take IDs 0 to 35).
*   All checkpoints here are _weights only_. To export a full model, load the weights in the same environment and then save the base model as `.keras`:

        model_with_ctc, base_model = build_models(...)
        model_with_ctc.load_weights("captcha_epXXX.weights.h5")
        base_model.save("captcha_epXXX_base.keras")
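
To show how the blank ID is used at inference time, here is a minimal sketch of greedy (best-path) CTC decoding in pure Python. It assumes the character IDs follow the order `0-9` then `A-Z`, which this README does not state explicitly, and it takes the per-time-step argmax IDs as input rather than real model output. The tester script may decode differently.

```python
import string

# Assumption: IDs 0-35 map to "0-9" followed by "A-Z"; only the blank ID (36) is documented.
CHARSET = string.digits + string.ascii_uppercase
BLANK_ID = len(CHARSET)  # 36


def ctc_greedy_decode(step_ids):
    """Collapse consecutive repeats, then drop blanks."""
    chars = []
    previous = None
    for idx in step_ids:
        if idx != previous and idx != BLANK_ID:
            chars.append(CHARSET[idx])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    # Toy argmax sequence (one class ID per time step), not real model output.
    ids = [36, 10, 10, 36, 1, 36, 11, 11, 36, 36, 2, 36, 12, 36]
    print(ctc_greedy_decode(ids))  # A1B2C
```

A blank between two identical IDs keeps both characters, which is how labels such as `UUU24` survive the repeat-collapsing step.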