	---
	license: apache-2.0
	datasets:
	- squirelmail/dataset-BotDetect-CAPTCHA-Generator
	language:
	- en
	metrics:
	- accuracy
	pipeline_tag: image-text-to-text
	library_name: keras
	tags:
	- ocr
	- captcha
	- crnn
	- ctc
	- tensorflow
	- keras
	- 50x250
	- uppercase
	- digits
	---
	# AI Model for Solving BotDetect-CAPTCHA-Generator Gov ID CAPTCHAs

	🧠 CRNN+CTC Checkpoints
	=======================

	This directory contains Keras 3 `save_weights`-style checkpoints produced while training a CRNN + CTC model that reads 5-character uppercase/digit CAPTCHAs (grayscale images, `H=50`, `W=250`).

	* * *

	📁 Contents
	-----------

	* `captcha_best.weights.h5` — best validation loss (auto-updated during training).
	* `captcha_epNNN.weights.h5` — per-epoch snapshots (e.g., `captcha_ep001.weights.h5` … `captcha_ep022.weights.h5`).

	All files are _weights only_; they must be loaded into the same model architecture used in training (the tester builds that architecture for you).

	* * *

	✅ Model Result: `captcha_ep022.weights.h5` ⇒ 90.91% Exact-Match Accuracy
	---------------------------------------------------------------------------

	```
	(venv) root@prod-exploit-sa-all-01:/home/infra# date && python3 cek_model_v6.py --weights captcha_ep022.weights.h5 --data-root ./dataset_1000_rand --sample 24000 && date
	Thu Oct 30 01:12:49 WITA 2025
	WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
	I0000 00:00:1761761571.108235 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
	I0000 00:00:1761761571.304280 2264160 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
	To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
	WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
	I0000 00:00:1761761575.452128 2264160 cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
	Found weights: captcha_ep022.weights.h5 | size: 27757.0 KB | mtime: Thu Oct 30 01:02:51 2025
	E0000 00:00:1761761576.513960 2264160 cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
	TF GPUs: []
	OK: weights loaded.
	Base output shape: (None, 31, 37)
	Testing on 24000 samples from ./dataset_1000_rand ...
	W0000 00:00:1761761611.159498 2264160 cpu_allocator_impl.cc:84] Allocation of 1200000000 exceeds 10% of free system memory.
	00 GT: 976VF | Pred: 976VF
	01 GT: 7W20H | Pred: 7W20H
	02 GT: UUU24 | Pred: UUU24
	03 GT: 1EMVZ | Pred: 1EMVZ
	04 GT: WY4RD | Pred: WY4RD
	05 GT: 0GNKE | Pred: 0GNKE
	06 GT: 7Y5TY | Pred: 7Y5TY
	07 GT: OC8C1 | Pred: OC8C1
	08 GT: 5ZIDQ | Pred: 5ZIDQ
	09 GT: LP8IP | Pred: LP8IP
	10 GT: AKQ7G | Pred: AKQ7G
	11 GT: X23QD | Pred: X23QD

	Exact match: 90.91% | Mean CER: 0.0194

	Total images tested: 24000

	Thu Oct 30 01:18:07 WITA 2025
	```

	* * *

	📦 Requirements
	---------------

	Install from the pinned list in the repo root:

	```bash
	# (recommended) fresh virtualenv
	python3 -m venv venv
	source venv/bin/activate

	# install exact deps
	pip install -r captcha_requirements.txt
	```


	Important: Keras/TensorFlow versions should match what was used during training. If you trained with TF/Keras nightly or dev builds, test in the same environment to avoid weight-loading shape/key mismatches.

	* * *

	🧪 How to Test
	--------------

	The tester script re-creates the training graph (CRNN+CTC), loads the selected checkpoint, and runs inference with the _base_ (CTC-free) submodel.

	### 1) Single image

	```bash
	python3 check_model.py \
	  --weights /workspace/captcha_final.weights.h5 \
	  --image /workspace/dataset_500/style7/K9NO2.png
	```


	Optional ground truth override:

	```bash
	python3 check_model.py \
	  --weights /workspace/captcha_final.weights.h5 \
	  --image /workspace/dataset_500/style7/K9NO2.png \
	  --gt K9NO2
	```


	### 2) Batch from a dataset

	```bash
	python3 check_model.py \
	  --weights /home/infra/models/captcha_ep002.weights.h5 \
	  --data-root /datasets/dataset_500 \
	  --samples 64
	```


	Expected directory layout for `--data-root`:

	```
	/datasets/dataset_500/
	├── style0/
	│   ├── A1B2C.png
	│   └── ...
	├── style1/
	│   └── ...
	├── ...
	└── style59/
	```


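	* Image format: grayscale PNG; the script resizes each image to `50x250` (height × width).
	* Labels: taken from the filename without its extension, which must match the regex `^[A-Z0-9]{5}$` (e.g. `K9NO2.png` ⇒ `K9NO2`).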
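
The labelling rule above can be sketched in a few lines. This is an illustration only, not the tester's actual code: `label_from_path` and `collect_samples` are hypothetical helper names, and the `style*/` glob assumes the directory layout shown above.

```python
import re
from pathlib import Path

# Same pattern as the README's label rule: exactly five uppercase letters or digits.
LABEL_RE = re.compile(r"^[A-Z0-9]{5}$")


def label_from_path(path):
    """Return the label encoded in the filename stem, or None if it does not match."""
    stem = Path(path).stem  # "K9NO2.png" -> "K9NO2"
    return stem if LABEL_RE.fullmatch(stem) else None


def collect_samples(data_root, ext="png"):
    """Hypothetical helper: gather (path, label) pairs from the style*/ subfolders."""
    samples = []
    for path in sorted(Path(data_root).glob(f"style*/*.{ext}")):
        label = label_from_path(path)
        if label is not None:  # skip files whose names are not valid labels
            samples.append((path, label))
    return samples
```

Files whose names do not match the pattern are skipped rather than raising, so stray files in a style folder do not abort a batch run.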

	* * *

	🧩 Model Details (for reference)
	--------------------------------

	* Backbone: 3× (Conv2D + BN + MaxPool), then reshape to time-steps.
	* RNN head: 2× BiLSTM(128), `return_sequences=True`.
	* Classifier: Dense(`num_classes = 36 + 1`) with softmax; `+1` is the CTC blank.
	* Time steps: the width is downsampled by 8, so `floor(250 / 8) = 31` time steps (matching the `(None, 31, 37)` base output shape in the log above).
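
The figure of 31 comes from integer division, not from 250/8 = 31.25. Below is a minimal sketch of the arithmetic, assuming each of the three MaxPool layers halves the width with a 2-wide, stride-2 window and no padding. The pool sizes are not stated in this README, so treat them as an assumption; `pooled_size` is a hypothetical helper.

```python
def pooled_size(size, num_pools=3, pool=2):
    """Size after `num_pools` non-overlapping max-pools without padding."""
    for _ in range(num_pools):
        size //= pool  # each pool floors the division, dropping any remainder
    return size


time_steps = pooled_size(250)  # 250 -> 125 -> 62 -> 31
num_classes = 36 + 1           # 36 characters plus the CTC blank
print((time_steps, num_classes))  # (31, 37)
```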

	The tester script internally builds both: `model_with_ctc` (training graph) and `base_model` (inference). It loads weights into the training graph and then uses `base_model` for predictions.
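
To turn the `(31, 37)` output of `base_model` into text, a decoder has to collapse repeated symbols and drop blanks. The sketch below shows plain greedy CTC decoding in pure Python; it is not necessarily how the tester decodes. It assumes class indices follow the order `0-9` then `A-Z` with the blank at index 36. `CHARSET` and `greedy_ctc_decode` are illustrative names.

```python
CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed index order: digits, then letters
BLANK_ID = len(CHARSET)  # 36, the CTC blank


def greedy_ctc_decode(probs):
    """Decode one sample; `probs` is a sequence of per-time-step score rows (T x 37)."""
    # Pick the highest-scoring class at every time step.
    best_ids = [max(range(len(row)), key=lambda i: row[i]) for row in probs]
    chars = []
    prev = None
    for idx in best_ids:
        # Keep a symbol only when it differs from the previous step and is not blank.
        if idx != prev and idx != BLANK_ID:
            chars.append(CHARSET[idx])
        prev = idx
    return "".join(chars)
```

A blank between two identical symbols is what lets a label such as `UUU24` keep its repeated characters: without the blank, consecutive `U` steps collapse into a single `U`.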

	* * *

	🎛️ CLI Options
	---------------

	```
	--weights <path>   : required, *.weights.h5 (same architecture)
	--image <path>     : test a single image
	--gt <text>        : ground truth for --image (default: file name)
	--data-root <dir>  : style0..style59 folders for batch testing
	--samples N        : max number of images for batch test (default 64)
	--height H         : input height (default 50)
	--width W          : input width (default 250)
	--ext png|jpg      : image extension for batch (default png)
	--show K           : print K sample predictions (default 12)
	```
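
For readers wiring these options into their own script, here is a hedged sketch of an equivalent `argparse` parser. Whether the tester uses `argparse` at all is an assumption, and `build_parser` is a hypothetical name; only the option names and defaults come from the list above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in this README."""
    p = argparse.ArgumentParser(description="Test a CRNN+CTC CAPTCHA checkpoint")
    p.add_argument("--weights", required=True, help="path to a *.weights.h5 file")
    p.add_argument("--image", help="test a single image")
    p.add_argument("--gt", help="ground truth for --image (default: file name)")
    p.add_argument("--data-root", help="directory with style0..style59 folders")
    p.add_argument("--samples", type=int, default=64, help="max images for batch test")
    p.add_argument("--height", type=int, default=50, help="input height")
    p.add_argument("--width", type=int, default=250, help="input width")
    p.add_argument("--ext", choices=["png", "jpg"], default="png")
    p.add_argument("--show", type=int, default=12, help="sample predictions to print")
    return p


# argparse accepts unambiguous prefixes, so "--sample" resolves to "--samples".
args = build_parser().parse_args(
    ["--weights", "captcha_ep022.weights.h5",
     "--data-root", "./dataset_1000_rand",
     "--sample", "24000"]
)
print(args.samples, args.data_root)  # 24000 ./dataset_1000_rand
```

If the real script is built this way, prefix matching would explain why the `--sample 24000` spelling in the result log is accepted even though the documented flag is `--samples`.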


	* * *

	📊 Output
	---------

	* Per-sample preview lines: `GT: ABC12 | Pred: ABC12`
	* Aggregate metrics:
	  * Exact match (% of predictions exactly equal to GT)
	  * Mean CER (character error rate)
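
The two aggregate metrics can be reproduced as follows. This sketch assumes CER is the Levenshtein edit distance divided by the ground-truth length, which is the usual definition but is not confirmed for the tester script. `edit_distance` and `evaluate` are illustrative names.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert/delete/substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free when equal)
            ))
        prev = curr
    return prev[-1]


def evaluate(pairs):
    """Return (exact-match %, mean CER) for a non-empty iterable of (gt, pred) pairs."""
    pairs = list(pairs)
    exact = sum(gt == pred for gt, pred in pairs)
    cer = sum(edit_distance(gt, pred) / max(len(gt), 1) for gt, pred in pairs)
    return 100.0 * exact / len(pairs), cer / len(pairs)
```

For 5-character labels, a single wrong character costs the full exact-match point for that image but only 0.2 in CER, which is why a run can show 90.91% exact match alongside a mean CER as low as 0.0194.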

	* * *

	🧯 Troubleshooting
	------------------

	* “A total of 1 objects could not be loaded… <Dense name=predictions>”
	  Mismatch between Keras/TF versions or model definition. Use the same environment and architecture as training.
	* GPU not used
	  Ensure a CUDA-enabled TF build and matching drivers. For server-side issues, test with the snippet below; an empty list means TensorFlow cannot see a GPU.

	  ```python
	  import tensorflow as tf
	  print(tf.config.list_physical_devices('GPU'))
	  ```

	* NaN loss during training
	  Check that labels are filtered by the regex, that `input_length=31` is correct, that CTC inputs use `int32`, and that LSTM dropouts are disabled (set to `0.0`) when using cuDNN.
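
Two of the NaN checks above (label filtering and `input_length`) can be automated before training. The sketch below relies on a general property of CTC: a label needs one time step per character plus one blank between each pair of identical neighbours. `min_ctc_steps` and `label_is_trainable` are hypothetical helpers, not part of the training code.

```python
import re

LABEL_RE = re.compile(r"[A-Z0-9]{5}")  # same pattern as the filename rule
INPUT_LENGTH = 31                      # time steps produced by the base model


def min_ctc_steps(label):
    """Fewest time steps CTC needs: one per character plus a blank between repeats."""
    repeats = sum(a == b for a, b in zip(label, label[1:]))
    return len(label) + repeats


def label_is_trainable(label, input_length=INPUT_LENGTH):
    """True if the label passes the regex and fits into `input_length` steps."""
    return bool(LABEL_RE.fullmatch(label)) and min_ctc_steps(label) <= input_length
```

With 5-character labels the requirement never exceeds 9 steps, so a failure here usually points to a wrong `input_length` or an unfiltered label rather than to the data itself.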

	* * *

	🔐 Notes
	--------

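	* CTC blank ID = `36` (the charset has 36 characters, 0-9 + A-Z, so character IDs run from 0 to 35).
	* All checkpoints here are _weights only_; to export a full model, save the base model as `.keras` after loading weights in the same environment:

	```python
	# Rebuild the architecture; `...` stands for the arguments used in training.
	model_with_ctc, base_model = build_models(...)
	# Load the checkpoint into the training graph, as the tester does.
	model_with_ctc.load_weights("captcha_epXXX.weights.h5")
	# Save the inference-only model as a full .keras file.
	base_model.save("captcha_epXXX_base.keras")
	```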