Hiro-Layout: Document Layout Analysis for Patent and Technical PDFs
English | 简体中文
Hiro-Layout is a document layout analysis model for patent and technical PDF pages. It detects and classifies page regions such as text, titles, headers, footers, tables, formulas, chemical structures, figures, captions, search reports, bibliographies, and other patent-specific layout elements.
Highlights
- Patent-focused layout understanding: covers common patent PDF regions and patent-specific structures.
- Technical document coverage: evaluated on both patent PDFs and NPD PDFs.
- Fine-grained taxonomy: 25 layout categories across figure, text, and complex document elements.
Model Overview
| Item | Details |
|---|---|
| Model name | Hiro-Layout |
| Current artifact | layout_model/RT-DETR_25.onnx |
| Task | Document layout analysis / page region detection |
| Input | Rendered PDF page image |
| Output | Layout regions with class labels |
| Domains | Patent PDFs, technical/NPD PDFs |
| License | Apache-2.0 |
Layout Taxonomy
| Group | Class | Abbr. | Chinese |
|---|---|---|---|
| figure | graph | graph | 图表 |
| figure | drawing | draw | 绘制图 |
| figure | structure diagram | struc | 结构图 |
| figure | photograph | photo | 照片 |
| figure | table | tab | 表格 |
| figure | math equation | eqn | 数学公式 |
| figure | chemical formula | chem | 化学式 |
| figure | noise | noise | 噪声 |
| text | text | text | 文本 |
| text | title | title | 标题 |
| text | section title | sec | 章节标题 |
| text | page header | head | 页眉 |
| text | page footer | foot | 页脚 |
| text | marginal note | mnote | 边注 |
| text | caption | cap | 说明 |
| text | figure number | figno | 编号 |
| text | line number | lineno | 行号 |
| text | column number | colno | 栏号 |
| text | sequence | seq | 序列表 |
| complex | figure complex | figcx | 图片组 |
| complex | chemical reaction | rxn | 反应式 |
| complex | bibliography | bib | 著录页 |
| complex | search report | srep | 搜索报告 |
| complex | Table of Contents | toc | 目录 |
| complex | reference | ref | 参考文献 |
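For code that needs to aggregate predictions by group, the taxonomy above can be written down as a small lookup. This is only a sketch transcribed from the table: the names `TAXONOMY` and `group_of` are illustrative and not part of the repository, and `labels.json` remains the authoritative mapping.

```python
# Taxonomy transcribed from the table above: group -> {abbreviation: class name}.
# TAXONOMY and group_of are illustrative names, not part of the Hiro-Layout repo.
TAXONOMY = {
    "figure": {
        "graph": "graph", "draw": "drawing", "struc": "structure diagram",
        "photo": "photograph", "tab": "table", "eqn": "math equation",
        "chem": "chemical formula", "noise": "noise",
    },
    "text": {
        "text": "text", "title": "title", "sec": "section title",
        "head": "page header", "foot": "page footer", "mnote": "marginal note",
        "cap": "caption", "figno": "figure number", "lineno": "line number",
        "colno": "column number", "seq": "sequence",
    },
    "complex": {
        "figcx": "figure complex", "rxn": "chemical reaction",
        "bib": "bibliography", "srep": "search report",
        "toc": "Table of Contents", "ref": "reference",
    },
}


def group_of(abbr: str) -> str:
    """Return the group ("figure", "text", or "complex") for an abbreviation."""
    for group, classes in TAXONOMY.items():
        if abbr in classes:
            return group
    raise KeyError(f"unknown abbreviation: {abbr}")


total = sum(len(classes) for classes in TAXONOMY.values())
print(total)             # 25
print(group_of("srep"))  # complex
```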
Benchmarks
Metrics are reported as Precision, Recall, and F1.
| Benchmark | Labels | Precision | Recall | F1 |
|---|---|---|---|---|
| Patent PDF | 33,054 | 0.8144 | 0.7711 | 0.7922 |
| NPD PDF | 17,769 | 0.7090 | 0.6983 | 0.7036 |
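The card does not spell out how F1 is computed. The sketch below assumes the usual harmonic mean of precision and recall, which the aggregate rows above are consistent with. The helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Aggregate rows from the table above: (precision, recall, reported F1).
reported = {
    "Patent PDF": (0.8144, 0.7711, 0.7922),
    "NPD PDF": (0.7090, 0.6983, 0.7036),
}

for name, (p, r, f1) in reported.items():
    # The inputs are rounded to four decimals, so allow a small tolerance.
    assert abs(f1_score(p, r) - f1) < 5e-4, name
    print(f"{name}: F1 = {f1_score(p, r):.4f}")
```

Per-class rows can differ from this recomputation in the last digit, because the published precision and recall are themselves rounded.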
Patent PDF
| # | Group | Abbr. | Class | Chinese | Labels | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|
| 1 | figure | graph | graph | 图表 | 215 | 0.7611 | 0.8000 | 0.7800 |
| 2 | figure | draw | drawing | 绘制图 | 420 | 0.8649 | 0.3048 | 0.4507 |
| 3 | figure | struc | structure diagram | 结构图 | 626 | 0.6579 | 0.8355 | 0.7361 |
| 4 | figure | photo | photograph | 照片 | 147 | 0.8378 | 0.8435 | 0.8407 |
| 5 | figure | tab | table | 表格 | 198 | 0.7759 | 0.9091 | 0.8372 |
| 6 | figure | eqn | math equation | 数学公式 | 399 | 0.7762 | 0.6692 | 0.7187 |
| 7 | figure | chem | chemical formula | 化学式 | 1,099 | 0.8792 | 0.8944 | 0.8868 |
| 8 | figure | noise | noise | 噪声 | 1,241 | 0.7025 | 0.7687 | 0.7341 |
| 9 | text | text | text | 文本 | 17,668 | 0.8182 | 0.8062 | 0.8122 |
| 10 | text | title | title | 标题 | 601 | 0.9117 | 0.8070 | 0.8561 |
| 11 | text | sec | section title | 章节标题 | 1,394 | 0.7968 | 0.7088 | 0.7502 |
| 12 | text | head | page header | 页眉 | 3,074 | 0.8187 | 0.7788 | 0.7983 |
| 13 | text | foot | page footer | 页脚 | 1,012 | 0.7432 | 0.6433 | 0.6896 |
| 14 | text | mnote | marginal note | 边注 | 421 | 0.7794 | 0.5202 | 0.6239 |
| 15 | text | cap | caption | 说明 | 80 | 0.6842 | 0.4875 | 0.5693 |
| 16 | text | figno | figure number | 编号 | 1,389 | 0.8955 | 0.7466 | 0.8143 |
| 17 | text | lineno | line number | 行号 | 341 | 0.7759 | 0.6598 | 0.7132 |
| 18 | text | colno | column number | 栏号 | 449 | 0.6964 | 0.4699 | 0.5612 |
| 19 | text | seq | sequence | 序列表 | 136 | 0.4430 | 0.2574 | 0.3256 |
| 20 | complex | figcx | figure complex | 图片组 | 1,416 | 0.8657 | 0.7373 | 0.7963 |
| 21 | complex | rxn | chemical reaction | 反应式 | 150 | 0.8898 | 0.7000 | 0.7836 |
| 22 | complex | bib | bibliography | 著录页 | 470 | 0.9615 | 0.7979 | 0.8721 |
| 23 | complex | srep | search report | 搜索报告 | 106 | 0.9052 | 0.9906 | 0.9459 |
| 24 | complex | toc | Table of Contents | 目录 | 0 | 0.0000 | 0.0000 | 0.0000 |
| 25 | complex | ref | reference | 参考文献 | 2 | 0.0000 | 0.0000 | 0.0000 |
| ALL | | | | | 33,054 | 0.8144 | 0.7711 | 0.7922 |
NPD PDF
| # | Group | Abbr. | Class | Chinese | Labels | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|
| 1 | figure | graph | graph | 图表 | 248 | 0.6838 | 0.6976 | 0.6906 |
| 2 | figure | draw | drawing | 绘制图 | 9 | 0.0000 | 0.0000 | 0.0000 |
| 3 | figure | struc | structure diagram | 结构图 | 341 | 0.7454 | 0.7126 | 0.7286 |
| 4 | figure | photo | photograph | 照片 | 82 | 0.6071 | 0.6220 | 0.6145 |
| 5 | figure | tab | table | 表格 | 209 | 0.7533 | 0.8182 | 0.7844 |
| 6 | figure | eqn | math equation | 数学公式 | 298 | 0.6789 | 0.5604 | 0.6140 |
| 7 | figure | chem | chemical formula | 化学式 | 388 | 0.7324 | 0.8325 | 0.7793 |
| 8 | figure | noise | noise | 噪声 | 695 | 0.4823 | 0.4302 | 0.4548 |
| 9 | text | text | text | 文本 | 9,119 | 0.6943 | 0.7625 | 0.7268 |
| 10 | text | title | title | 标题 | 304 | 0.7130 | 0.5395 | 0.6142 |
| 11 | text | sec | section title | 章节标题 | 1,539 | 0.7337 | 0.6160 | 0.6697 |
| 12 | text | head | page header | 页眉 | 1,246 | 0.7464 | 0.7111 | 0.7283 |
| 13 | text | foot | page footer | 页脚 | 1,339 | 0.7711 | 0.6468 | 0.7035 |
| 14 | text | mnote | marginal note | 边注 | 190 | 0.5714 | 0.2947 | 0.3889 |
| 15 | text | cap | caption | 说明 | 573 | 0.8711 | 0.5899 | 0.7034 |
| 16 | text | figno | figure number | 编号 | 149 | 0.6078 | 0.4161 | 0.4940 |
| 17 | text | lineno | line number | 行号 | 41 | 0.6667 | 0.9268 | 0.7755 |
| 18 | text | colno | column number | 栏号 | 0 | 0.0000 | 0.0000 | 0.0000 |
| 19 | text | seq | sequence | 序列表 | 18 | 0.7000 | 0.3889 | 0.5000 |
| 20 | complex | figcx | figure complex | 图片组 | 734 | 0.7657 | 0.7480 | 0.7567 |
| 21 | complex | rxn | chemical reaction | 反应式 | 36 | 0.8947 | 0.4722 | 0.6182 |
| 22 | complex | bib | bibliography | 著录页 | 0 | 0.0000 | 0.0000 | 0.0000 |
| 23 | complex | srep | search report | 搜索报告 | 3 | 0.4286 | 1.0000 | 0.6000 |
| 24 | complex | toc | Table of Contents | 目录 | 76 | 0.8475 | 0.6579 | 0.7407 |
| 25 | complex | ref | reference | 参考文献 | 132 | 0.8148 | 0.3333 | 0.4731 |
| ALL | | | | | 17,769 | 0.7090 | 0.6983 | 0.7036 |
Usage
The current model artifact is an ONNX export:
layout_model/RT-DETR_25.onnx
Download the repository from the Hugging Face Hub and load the ONNX model with ONNX Runtime:
```python
from pathlib import Path

import onnxruntime as ort
from huggingface_hub import snapshot_download

# Download (or reuse the cached copy of) the model repository.
repo_dir = snapshot_download("PatSnap/Hiro-Layout")
model_path = Path(repo_dir) / "layout_model" / "RT-DETR_25.onnx"

# Create an inference session and inspect the model's input and output names.
session = ort.InferenceSession(str(model_path))
print("inputs:", [i.name for i in session.get_inputs()])
print("outputs:", [o.name for o in session.get_outputs()])
```
Use labels.json for the 25-class label mapping.
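The schema of `labels.json` is not described in this card, so the loader below is a sketch under a stated assumption: the file is either a JSON list of class names (position = class id) or a JSON object whose keys are class ids. `load_labels` is a hypothetical helper, not part of the repository. Inspect the real file before relying on it.

```python
import json
from pathlib import Path


def load_labels(path: Path) -> dict[int, str]:
    """Load a class-id -> class-name mapping from a JSON file.

    Assumption: the file holds either a list of names (position = class id)
    or an object mapping class ids to names. Other layouts raise ValueError.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return {i: str(name) for i, name in enumerate(data)}
    if isinstance(data, dict):
        return {int(key): str(name) for key, name in data.items()}
    raise ValueError(f"unsupported labels format: {type(data).__name__}")
```

With the repository downloaded as shown earlier in this section, the call would look like `load_labels(Path(repo_dir) / "labels.json")`.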
Repository Files
| File | Purpose |
|---|---|
| README.md | Hugging Face model card in English |
| README_zh.md | Chinese model card |
| config.json | Model metadata used by Hugging Face Hub tooling and download statistics |
| EVALUATION.md | Detailed benchmark results derived from the workbook |
| labels.json | Machine-readable 25-class label mapping |
| layout_model/RT-DETR_25.onnx | ONNX model artifact |
| requirements.txt | Minimal dependencies for ONNX loading and image preprocessing |
| LICENSE | Apache-2.0 license |
| DISCLAIMER.md | Model limitations and responsible-use notes |
| NOTICE | Copyright and trademark notice |
| OPEN_SOURCE_CHECKLIST.md | Release checklist before public upload |
Limitations
- Layout predictions may be inaccurate on low-resolution scans, heavily rotated pages, handwritten documents, unusual patent formats, or unseen page templates.
- Small objects and sparse categories can have unstable metrics when the evaluation set has very few labels.
- The model should not be used as the sole source of truth for legal, compliance, filing, archival, or customer-facing workflows without human review.
- Users are responsible for ensuring they have the right to process and share any documents used with this model.
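To make the point about sparse categories concrete, the sketch below computes precision, recall, and F1 from raw counts for a class with only three ground-truth labels. The counts are hypothetical: `tp=3, fp=4` happens to reproduce the NPD "search report" row (0.4286 / 1.0000 / 0.6000), but the actual counts behind that row are not published here. The helper name `prf` is illustrative.

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts. Undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


# Hypothetical counts for a class with 3 ground-truth labels.
baseline = prf(tp=3, fp=4, fn=0)  # all 3 labels found, 4 spurious detections
one_miss = prf(tp=2, fp=4, fn=1)  # same, but one label is missed

print([round(x, 4) for x in baseline])  # [0.4286, 1.0, 0.6]
print([round(x, 4) for x in one_miss])  # [0.3333, 0.6667, 0.4444]
```

A single missed detection moves F1 from 0.60 to about 0.44, so per-class scores for categories with a handful of labels are best read as rough indications.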
License
This project is released under the Apache License 2.0. See LICENSE.
Copyright Notice
Copyright (c) 2026 Patsnap. All rights reserved except as expressly licensed under the applicable license terms.
Hiro-Layout, Hiro, Patsnap, and any associated names, logos, product names, service names, designs, and slogans are trademarks or registered trademarks of Patsnap or its affiliates. No trademark license is granted under the open source license or any model license unless expressly stated.