	---
	license: apache-2.0
	pipeline_tag: image-classification
	tags:
	- image-classification
	- multi-label-classification
	- onnx
	- openvino
	- pdf
	- document-understanding
	- rag
	datasets:
	- Wikit/PdfVisClassif
	---

	# PDF Page Classifier

	Multi-label classifier for PDF page images. Determines whether a PDF page
	requires image embedding (vs. text-only) in RAG pipelines.

	Backbone: EfficientNet-Lite0. Exported to ONNX (FP32) and to OpenVINO INT8, the
	latter via Quantization-Aware Training (QAT). No PyTorch required at inference time.

	## Classes

	- `Complex Table`
	- `Simple Table`
	- `Visual - Essential`
	- `Visual - Supportive`

	Pages matching any of the following classes should trigger image embedding:

	- `Complex Table`
	- `Visual - Essential`

	Default threshold: `0.5`
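
The decision rule described in this section can be sketched as follows. This is a minimal illustration, not the packaged implementation: it assumes the threshold is compared against each class probability independently (with `>=`), and the `decide` helper and `EMBEDDING_CLASSES` name are hypothetical. The result keys mirror the ones shown in the Usage section.

```python
from typing import Dict, List

# Classes that trigger image embedding, as listed above.
EMBEDDING_CLASSES = {"Complex Table", "Visual - Essential"}


def decide(probabilities: Dict[str, float], threshold: float = 0.5) -> dict:
    """Turn per-class probabilities into a multi-label decision.

    Assumption: each class is thresholded independently, and a class counts
    as predicted when its probability is >= threshold.
    """
    predicted: List[str] = [
        name for name, p in probabilities.items() if p >= threshold
    ]
    return {
        "needs_image_embedding": any(c in EMBEDDING_CLASSES for c in predicted),
        "predicted_classes": predicted,
        "probabilities": probabilities,
    }


# Made-up probabilities, for illustration only.
example = decide({
    "Complex Table": 0.82,
    "Simple Table": 0.10,
    "Visual - Essential": 0.31,
    "Visual - Supportive": 0.64,
})
print(example["predicted_classes"], example["needs_image_embedding"])
```

Because the task is multi-label, a page can match several classes at once; only `Complex Table` and `Visual - Essential` affect the embedding flag.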

	## Usage

	### With [chunknorris](https://github.com/wikit-ai/chunknorris) (recommended)

	```bash
	# Install at least one backend extra:
	pip install "chunknorris[ml-onnx]"      # ONNX backend
	pip install "chunknorris[ml-openvino]"  # OpenVINO INT8, fastest on CPU
	```

	```python
	from chunknorris.ml import load_classifier

	clf = load_classifier("Wikit/pdf-pages-classifier") # auto-selects best available backend
	result = clf.predict("page.png")
	# {"needs_image_embedding": True, "predicted_classes": [...], "probabilities": {...}}
	```
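
In a RAG pipeline, the `needs_image_embedding` flag can be used to route each page. The sketch below is one possible way to do that; `route_pages` is a hypothetical helper, and it only assumes that `predict` returns a dict containing the `needs_image_embedding` key shown above.

```python
from typing import Callable, Iterable, List, Tuple


def route_pages(
    page_paths: Iterable[str],
    predict: Callable[[str], dict],
) -> Tuple[List[str], List[str]]:
    """Split pages into (image_embedding, text_only) using the classifier flag."""
    image_pages: List[str] = []
    text_pages: List[str] = []
    for path in page_paths:
        result = predict(path)
        if result["needs_image_embedding"]:
            image_pages.append(path)
        else:
            text_pages.append(path)
    return image_pages, text_pages


# With the real classifier, assuming pages were already rendered to PNG files
# in a hypothetical "pages/" directory:
#   from pathlib import Path
#   paths = sorted(str(p) for p in Path("pages").glob("*.png"))
#   image_pages, text_pages = route_pages(paths, clf.predict)
```

Passing `predict` as a callable keeps the routing logic independent of the backend and easy to test with a stub.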

	### Standalone (no chunknorris)

	```bash
	git clone https://huggingface.co/Wikit/pdf-pages-classifier
	cd pdf-pages-classifier
	pip install onnxruntime Pillow numpy # or: openvino Pillow numpy
	```

	```python
	from classifiers import load_classifier

	clf = load_classifier(".") # auto-selects available backend
	result = clf.predict("page.png")
	```
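
For readers curious about what "auto-selects available backend" might look like, here is a hedged sketch of one way to pick a backend from the installed packages. It is not the code shipped in `classifiers/`; `pick_backend` and `BACKEND_MODULES` are hypothetical names, and the preference order is an assumption.

```python
from importlib.util import find_spec
from typing import Callable, Optional

# Preference order is an assumption: OpenVINO INT8 first (described above as
# fastest on CPU), then ONNX Runtime.
BACKEND_MODULES = [("openvino", "openvino"), ("onnx", "onnxruntime")]


def pick_backend(is_installed: Optional[Callable[[str], bool]] = None) -> str:
    """Return the name of the first backend whose package is importable."""
    if is_installed is None:
        # Default check: look the module up without importing it.
        is_installed = lambda module: find_spec(module) is not None
    for backend, module in BACKEND_MODULES:
        if is_installed(module):
            return backend
    raise ImportError(
        "Install either `openvino` or `onnxruntime` to run inference."
    )
```

The `is_installed` parameter exists only so the selection logic can be exercised without installing either package.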

	## Files

	| File | Format | Notes |
	|------|--------|-------|
	| `model.onnx` | ONNX FP32 | Cross-platform CPU/GPU inference |
	| `openvino_model.xml/.bin` | OpenVINO INT8 | Fastest CPU inference (QAT) |
	| `pytorch_model.bin` | PyTorch | Raw checkpoint; requires `torch` + `timm` |
	| `config.json` | JSON | Preprocessing config and class names |
	| `classifiers/` | Python | Standalone inference scripts (no chunknorris needed) |

	## Dataset

	Trained on [Wikit/PdfVisClassif](https://huggingface.co/datasets/Wikit/PdfVisClassif).