Layout Analysis Module of PaddleOCR-VL-1.5

🔥 Official Website | 📝 Technical Report

Introduction

These are the model weights for PP-DocLayoutV3 in ONNX format. The original PaddlePaddle weights are available in the PaddlePaddle/PP-DocLayoutV3 repository.

PP-DocLayoutV3 is specifically engineered to handle non-planar document images. Instead of standard two-point boxes, it directly predicts multi-point bounding boxes for layout elements and determines their logical reading order on skewed and curved surfaces. Both happen within a single forward pass, which significantly reduces cascading errors. The model is an essential component of PaddleOCR-VL-1.5, where it provides the layout analysis needed for high-precision parsing of real-world documents.
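To make the difference between multi-point and two-point boxes concrete, the following is a minimal sketch in plain Python. It does not call the model. The coordinates in `skewed_block` are made up for illustration, and the helper names (`polygon_area`, `enclosing_box`, `box_area`) are hypothetical, not part of PaddleOCR. It compares the area of a skewed quadrilateral with the axis-aligned box that would be needed to enclose it.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def enclosing_box(points: List[Point]) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned (two-point) box containing all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def box_area(box: Tuple[float, float, float, float]) -> float:
    """Area of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


# Hypothetical four-point outline of a skewed text block (illustrative only).
skewed_block = [(10.0, 20.0), (110.0, 40.0), (105.0, 90.0), (5.0, 70.0)]

box = enclosing_box(skewed_block)
# Fraction of the two-point box that the element actually occupies.
tightness = polygon_area(skewed_block) / box_area(box)
print(box, round(tightness, 3))
```

For this made-up quadrilateral the element covers only about 69% of its enclosing two-point box. The remainder is background that could overlap neighbouring elements on a skewed or curved page, which is the kind of slack a multi-point outline avoids.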

Model Architecture

Model Usage

Install Dependencies

pip install -U paddleocr
pip install -U onnxruntime-gpu
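
After installation, it can be useful to check which ONNX Runtime execution providers are visible in the current environment. The sketch below uses `onnxruntime.get_available_providers()`. The helper name `available_onnx_providers` is hypothetical, and whether PaddleOCR actually selects the GPU provider depends on its own configuration, which is not covered here.

```python
from typing import List


def available_onnx_providers() -> List[str]:
    """Return ONNX Runtime execution providers, or [] if it is not installed."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return list(ort.get_available_providers())


providers = available_onnx_providers()
if "CUDAExecutionProvider" in providers:
    print("GPU execution provider is available")
elif providers:
    print("No CUDA provider found; available providers:", providers)
else:
    print("onnxruntime is not installed in this environment")
```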

CLI Usage

paddleocr layout_detection -i ./demo.jpg --model_name PP-DocLayoutV3 --engine onnxruntime

Python API Usage

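from paddleocr import LayoutDetection

# Load PP-DocLayoutV3 and run it through the ONNX Runtime engine
model = LayoutDetection(
    model_name="PP-DocLayoutV3",
    engine="onnxruntime",
)
output = model.predict("./demo.jpg", batch_size=1)
for res in output:
    res.print()  # print the detection result
    res.save_to_img(save_path="./output/")  # save the visualized layout
    res.save_to_json(save_path="./output/res.json")  # save the result as JSON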
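
The structure of the saved JSON file is not described here, so the following sketch deliberately assumes no schema. It loads a JSON file and prints a compact summary of its keys and value types, which is a quick way to see what fields are available before writing any post-processing. The helper names `describe` and `describe_file` are hypothetical, and the `sample` dictionary is made up purely to demonstrate the output format.

```python
import json
from pathlib import Path
from typing import Any


def describe(value: Any, depth: int = 0, max_depth: int = 2) -> str:
    """Summarize the shape of a JSON value without assuming a schema."""
    if isinstance(value, dict):
        if depth >= max_depth:
            return "{...}"
        inner = ", ".join(
            f"{key}: {describe(item, depth + 1, max_depth)}"
            for key, item in value.items()
        )
        return "{" + inner + "}"
    if isinstance(value, list):
        if not value:
            return "[]"
        # Only the first element is inspected; lists are assumed homogeneous.
        return f"[{describe(value[0], depth + 1, max_depth)}] x{len(value)}"
    return type(value).__name__


def describe_file(path: str) -> str:
    """Load a JSON file and summarize its structure."""
    return describe(json.loads(Path(path).read_text(encoding="utf-8")))


# Made-up sample, NOT the real output schema.
sample = {"a": 1, "b": [1.0, 2.0]}
print(describe(sample))

# Inspect the real file only if it has been produced by the script above.
result_path = Path("./output/res.json")
if result_path.exists():
    print(describe_file(str(result_path)))
```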

Visualization

Light Variation

Skewing

Screen-photo

Curving

Citation

If you find PP-DocLayoutV3 helpful, feel free to give us a star and cite our work.

@misc{cui2026paddleocrvl15multitask09bvlm,
      title={PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing},
      author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Yi Liu and Dianhai Yu and Yanjun Ma},
      year={2026},
      eprint={2601.21957},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2601.21957},
}

