Introduction
These are the model weights for PP-DocLayoutV3 in ONNX format. The PaddlePaddle weights are available at PP-DocLayoutV3.
PP-DocLayoutV3 is engineered specifically for non-planar document images. Instead of standard two-point boxes, it directly predicts multi-point bounding boxes for layout elements, and it determines the logical reading order on skewed and curved surfaces within the same forward pass, which significantly reduces cascading errors. The model is an essential component of PaddleOCR-VL-1.5, where it provides the layout analysis needed for high-precision parsing of real-world documents.
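To make the difference between a multi-point box and a two-point box concrete, the sketch below collapses a polygon into the axis-aligned rectangle that encloses it and compares the two areas. The point format (a sequence of `(x, y)` pairs) and both function names are assumptions made for illustration; they are not the model's documented output schema.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def enclosing_box(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max), the two-point box around a polygon."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def polygon_area(polygon: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    # Hypothetical skewed text block given as four corner points.
    quad = [(10, 20), (110, 40), (105, 90), (5, 70)]
    x0, y0, x1, y1 = enclosing_box(quad)
    box_area = (x1 - x0) * (y1 - y0)
    print("two-point box:", (x0, y0, x1, y1))
    print("polygon covers", round(polygon_area(quad) / box_area, 3), "of the box")
```

For a skewed element, the enclosing rectangle contains background that the polygon excludes, which is the motivation for predicting the polygon directly.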
Model Architecture
Model Usage
Install Dependencies
pip install -U paddleocr
pip install -U onnxruntime-gpu
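After installing, it can help to confirm which ONNX Runtime execution providers are available on the machine. The sketch below uses `onnxruntime.get_available_providers()`; the helper `pick_provider` is a hypothetical name introduced here, and how PaddleOCR itself selects a provider is not stated in this card.

```python
from typing import List


def pick_provider(available: List[str]) -> str:
    """Prefer the CUDA provider when it is listed, otherwise fall back to CPU."""
    if "CUDAExecutionProvider" in available:
        return "CUDAExecutionProvider"
    return "CPUExecutionProvider"


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        providers = ort.get_available_providers()
        print("available:", providers)
        print("preferred:", pick_provider(providers))
```

If only `CPUExecutionProvider` is listed after installing `onnxruntime-gpu`, the CUDA libraries are probably not visible to the Python environment.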
CLI Usage
paddleocr layout_detection -i ./demo.jpg --model_name PP-DocLayoutV3 --engine onnxruntime
Python API Usage
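from paddleocr import LayoutDetection

# Load PP-DocLayoutV3 and run it with the ONNX Runtime engine.
model = LayoutDetection(
    model_name="PP-DocLayoutV3",
    engine="onnxruntime",
)

# Run layout detection on one image.
output = model.predict("./demo.jpg", batch_size=1)
for res in output:
    res.print()  # print the result to the console
    res.save_to_img(save_path="./output/")  # save the visualization
    res.save_to_json(save_path="./output/res.json")  # save the result as JSON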
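The saved JSON can be post-processed with the standard library. The sketch below keeps only detections above a score threshold. The key names (`boxes`, `score`, `label`, and an optional top-level `res` wrapper) are assumptions about the file layout and should be checked against an actual `res.json`; the function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def extract_boxes(data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the list of detections from a parsed result dictionary."""
    # ASSUMPTION: detections live under "boxes", possibly nested under "res".
    payload = data.get("res", data)
    if not isinstance(payload, dict):
        return []
    return list(payload.get("boxes", []))


def filter_by_score(boxes: List[Dict[str, Any]], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Keep detections whose confidence is at least min_score."""
    # ASSUMPTION: each detection carries a numeric "score" field.
    return [b for b in boxes if float(b.get("score", 0.0)) >= min_score]


def load_boxes(json_path: str) -> List[Dict[str, Any]]:
    """Read a saved result file and return its detections."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return extract_boxes(data)


if __name__ == "__main__":
    path = Path("./output/res.json")
    if path.exists():
        for box in filter_by_score(load_boxes(str(path)), min_score=0.5):
            print(box.get("label"), box.get("score"))
    else:
        print("no result file found at", path)
```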
Visualization
The example results cover four conditions: Light Variation, Skewing, Screen-photo, and Curving.
Citation
If you find PP-DocLayoutV3 helpful, feel free to give us a star and cite our work.
@misc{cui2026paddleocrvl15multitask09bvlm,
  title={PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing},
  author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Yi Liu and Dianhai Yu and Yanjun Ma},
  year={2026},
  eprint={2601.21957},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.21957},
}