Historical Document Layout Detection Model
A fine-tuned Mask R-CNN model (via LayoutParser/Detectron2) for detecting layout elements in historical Swedish medical journal pages. This is the lighter variant; the more complex model is available as Swemper-layout.
This model was developed as part of the research project:
Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011
(Project ID: IN22-0017), funded by Riksbankens Jubileumsfond.
Project page:
Model Details
- Model type: Mask R-CNN (ResNet backbone)
- Framework: Detectron2 / LayoutParser
- Fine-tuned for: Historical document layout analysis
- Language of source documents: Swedish
Label Map
| ID | Label |
|---|---|
| 0 | Advertisement |
| 1 | Author |
| 2 | Header or Footer |
| 3 | Image |
| 4 | List |
| 5 | Page Number |
| 6 | Table |
| 7 | Text |
| 8 | Title |
Evaluation Metrics
The model achieves the following COCO-style average precision (AP) scores, reported on a scale of 0 to 100:
| AP | AP50 | AP75 | APs | APm | APl |
|---|---|---|---|---|---|
| 64.325 | 88.948 | 69.214 | 40.350 | 55.117 | 67.543 |
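The column names match the COCO detection convention: AP is averaged over IoU thresholds from 0.50 to 0.95, AP50 and AP75 use a single threshold of 0.50 or 0.75, and APs, APm and APl restrict the average to small, medium and large objects. The following is a minimal sketch of the IoU test behind those thresholds, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form. The model card does not say whether the scores were computed on boxes or masks, and the function name `box_iou` and the example coordinates are illustrative only.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical prediction and ground-truth boxes (pixel coordinates).
predicted = (100, 100, 300, 200)
ground_truth = (120, 100, 300, 240)

iou = box_iou(predicted, ground_truth)
print(f"IoU = {iou:.3f}")
print("Counts as a match at IoU 0.50:", iou >= 0.50)
print("Counts as a match at IoU 0.75:", iou >= 0.75)
```

A detection like this one contributes to AP50 but not to AP75, which is why AP50 is usually the highest of the reported values.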
Usage
Installation
Follow instructions at:
https://detectron2.readthedocs.io/en/latest/tutorials/install.html
Finetuning
Follow instructions at:
https://detectron2.readthedocs.io/en/latest/tutorials/training.html
Inference
import cv2
import layoutparser as lp
import matplotlib.pyplot as plt

# Configuration
model_config_path = "config_mask_rcnn_resized.yaml"
model_path = "SweMPer-layout-lite.pth"
label_map = {
    0: "advertisement",
    1: "author",
    2: "header_or_footer",
    3: "image",
    4: "list",
    5: "page_no",
    6: "table",
    7: "text",
    8: "title",
}

# Load model
model = lp.models.Detectron2LayoutModel(
    config_path=model_config_path,
    model_path=model_path,
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map=label_map,
)

# Load and process image
image = cv2.imread("<path_to_image>")
image = image[..., ::-1]  # BGR to RGB

# Detect layout
layout = model.detect(image)

# Print detected elements
for block in layout:
    print(f"Type: {block.type}, Score: {block.score:.3f}, Box: {block.coordinates}")

# Visualize results
viz = lp.draw_box(image, layout, box_width=3, show_element_type=True)
plt.figure(figsize=(12, 16))
plt.imshow(viz)
plt.axis("off")
plt.show()
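The detections printed above can also be filtered and saved for later processing. The sketch below is one possible way to do that, not part of the released code: it works on plain `(label, score, box)` tuples so that it runs without the model, and the helper name `to_records`, the `min_score` parameter and the sample detections are all hypothetical. With a real result, the tuples could be built as `[(b.type, b.score, b.coordinates) for b in layout]`.

```python
import json


def to_records(blocks, min_score=0.0):
    """Turn (label, score, (x1, y1, x2, y2)) tuples into JSON-ready dicts.

    Blocks below min_score are dropped. The rest are sorted top-to-bottom,
    then left-to-right, which is only a rough reading order and does not
    handle multi-column pages.
    """
    records = [
        {
            "label": label,
            "score": round(float(score), 3),
            "box": [float(v) for v in box],
        }
        for label, score, box in blocks
        if score >= min_score
    ]
    records.sort(key=lambda r: (r["box"][1], r["box"][0]))
    return records


# Hypothetical detections standing in for a real `layout` result.
detections = [
    ("text", 0.97, (80.0, 300.0, 900.0, 1200.0)),
    ("title", 0.97, (80.0, 120.0, 900.0, 220.0)),
    ("page_no", 0.85, (850.0, 40.0, 900.0, 70.0)),
]

records = to_records(detections, min_score=0.9)
print(json.dumps(records, indent=2))
```

Raising `min_score` here applies a stricter cut on top of the `SCORE_THRESH_TEST` value already set when loading the model.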
Acknowledgements
This work was carried out within the project:
Communicating Medicine (SweMPer): Digitalisation of Swedish Medical Periodicals, 1781–2011
(Project ID: IN22-0017), funded by Riksbankens Jubileumsfond.
We gratefully acknowledge the support of the funder and project collaborators.
This model builds upon the excellent work of LayoutParser and Detectron2.
We thank the contributors and maintainers of these projects for making their tools publicly available and supporting research.