---
license: apache-2.0
language:
- en
base_model:
- openmmlab/mask-rcnn
- microsoft/swin-base-patch4-window7-224-in22k
pipeline_tag: image-segmentation
---
# Model Card for ChartPointNet-InstanceSeg

ChartPointNet-InstanceSeg is a high-precision instance segmentation model for data points in scientific charts. Built on Mask R-CNN with a Swin Transformer backbone, it detects and segments individual data points, particularly in the dense, small-object settings common to scientific figures.

## Model Details

### Model Description

ChartPointNet-InstanceSeg performs pixel-precise instance segmentation of data points in scientific charts (e.g., scatter plots). It leverages Mask R-CNN with a Swin Transformer backbone, trained on an enhanced COCO-style dataset with instance masks for data points. The model is intended for extracting quantitative data from scientific figures and for downstream chart analysis.

- **Developed by:** Hansheng Zhu
- **Model type:** Instance segmentation
- **License:** Apache-2.0
- **Finetuned from model:** openmmlab/mask-rcnn

### Model Sources

- **Repository:** [https://github.com/hanszhu/ChartSense](https://github.com/hanszhu/ChartSense)
- **Paper:** [https://arxiv.org/abs/2106.01841](https://arxiv.org/abs/2106.01841)

## Uses

### Direct Use

- Instance segmentation of data points in scientific charts
- Automated extraction of quantitative data from figures
- Preprocessing for downstream chart understanding and data mining

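Quantitative data extraction ultimately requires mapping segmented data points from pixel space back to axis coordinates. As a minimal sketch (the calibration points below are hypothetical and would come from a separate axis-detection step, which this model does not provide), a linear two-point calibration per axis could look like:

```python
import numpy as np

def pixel_to_data(centroids, x_px, x_val, y_px, y_val):
    """Map pixel-space centroids to data coordinates via linear
    interpolation between two calibration points per axis.

    centroids: (N, 2) array of (x, y) pixel positions
    x_px, x_val: pixel and data values of two x-axis reference ticks
    y_px, y_val: pixel and data values of two y-axis reference ticks
    """
    c = np.asarray(centroids, dtype=float)
    xs = x_val[0] + (c[:, 0] - x_px[0]) * (x_val[1] - x_val[0]) / (x_px[1] - x_px[0])
    ys = y_val[0] + (c[:, 1] - y_px[0]) * (y_val[1] - y_val[0]) / (y_px[1] - y_px[0])
    return np.stack([xs, ys], axis=1)

# A point halfway along both axes maps to the midpoint of each data range
# (image y coordinates grow downward, hence the flipped y_px pair)
coords = pixel_to_data([[50, 50]], x_px=(0, 100), x_val=(0.0, 10.0),
                       y_px=(100, 0), y_val=(0.0, 1.0))
# coords ≈ [[5.0, 0.5]]
```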
### Downstream Use

- As a preprocessing step for chart structure parsing or data extraction
- Integration into document parsing, digital library, or accessibility systems

### Out-of-Scope Use

- Segmentation of non-data-point elements (e.g., axes, legends, or text)
- Use on figures outside the supported chart types
- Medical or legal decision making

## Bias, Risks, and Limitations

- The model is limited to data point segmentation in scientific charts.
- It may not generalize to figures with highly unusual styles or poor image quality.
- Potential dataset bias: the training data is sourced from scientific literature, so chart styles uncommon in that domain may be underrepresented.

### Recommendations

Users should verify predictions on out-of-domain data and be aware of the model's limitations regarding chart style and domain.

## How to Get Started with the Model

```python
from mmdet.apis import inference_detector, init_detector

# Paths to the model config and trained weights
config_file = 'legend_match_swin/mask_rcnn_swin_datapoint.py'
checkpoint_file = 'chart_datapoint.pth'

# Build the model and load weights (use device='cpu' if no GPU is available)
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Run inference on a chart image
result = inference_detector(model, 'example_chart.png')
# result contains per-class detected boxes, scores, and segmentation masks
```

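For mask models, MMDetection 2.x returns `result` as a `(bbox_result, segm_result)` pair of per-class lists. A small helper to flatten that structure into scored detections might look like the following (a sketch against the 2.x result layout; the mock arrays stand in for real model output):

```python
import numpy as np

def collect_detections(bbox_result, segm_result, score_thr=0.5):
    """Flatten MMDetection 2.x per-class results into
    (class_id, bbox, score, mask) tuples above a confidence threshold."""
    detections = []
    for class_id, (bboxes, masks) in enumerate(zip(bbox_result, segm_result)):
        for bbox, mask in zip(bboxes, masks):
            score = float(bbox[4])  # each bbox row is [x1, y1, x2, y2, score]
            if score >= score_thr:
                detections.append((class_id, bbox[:4], score, mask))
    return detections

# Mock single-class output: two detections, one below the threshold
bbox_result = [np.array([[10.0, 10.0, 20.0, 20.0, 0.92],
                         [30.0, 30.0, 35.0, 35.0, 0.20]])]
segm_result = [[np.zeros((64, 64), bool), np.zeros((64, 64), bool)]]
kept = collect_detections(bbox_result, segm_result)  # keeps only the 0.92 detection
```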
## Training Details

### Training Data

- **Dataset:** Enhanced COCO-style scientific chart dataset with instance masks
- Data point class annotated with pixel-precise segmentation masks
- Images and annotations filtered and preprocessed for optimal Swin Transformer performance

### Training Procedure

- Images resized to 1120x672
- Mask R-CNN with Swin Transformer backbone
- **Training regime:** fp32
- **Optimizer:** AdamW
- **Batch size:** 8
- **Epochs:** 36
- **Learning rate:** 1e-4

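In MMDetection 2.x config terms, the hyperparameters above correspond roughly to the following fields (a sketch using standard MMDetection option names; the actual `legend_match_swin/mask_rcnn_swin_datapoint.py` may organize these differently):

```python
# Sketch of MMDetection 2.x config fields matching the stated hyperparameters
optimizer = dict(type='AdamW', lr=1e-4)
runner = dict(type='EpochBasedRunner', max_epochs=36)   # 36 epochs
data = dict(samples_per_gpu=8)                          # batch size 8
img_scale = (1120, 672)                                 # training resize target
fp16 = None                                             # fp32 training (no fp16 hook)
```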
## Evaluation

### Testing Data, Factors & Metrics

- **Testing Data:** Held-out split from the enhanced COCO-style dataset
- **Factors:** Data point density, image quality
- **Metrics:** mAP (mean Average Precision), AP50, AP75, per-class AP

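These metrics score a predicted instance as correct when its mask overlaps a ground-truth mask beyond an IoU threshold (0.50 for AP50, 0.75 for AP75). The underlying mask IoU computation is simply:

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-union between two boolean instance masks."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum()) / float(union)

# Half-overlapping masks give IoU = 0.5
pred = np.ones((2, 2), bool)
gt = np.array([[True, True], [False, False]])
iou = mask_iou(pred, gt)  # → 0.5
```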
### Results

| Category   | mAP   | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
|------------|-------|--------|--------|-------|-------|-------|
| data-point | 0.485 | 0.687  | 0.581  | 0.487 | 0.05  | nan   |

#### Summary

The model reaches 0.485 mAP (0.687 AP50) on data point segmentation, with its strongest performance on the small, densely packed points typical of scientific figures (mAP_s = 0.487). Performance on medium-sized objects is low (mAP_m = 0.05), and no large objects appear in the evaluation set (mAP_l = nan).

## Environmental Impact

- **Hardware Type:** NVIDIA V100 GPU
- **Hours used:** 10
- **Cloud Provider:** Google Cloud
- **Compute Region:** us-central1
- **Carbon Emitted:** ~15 kg CO2eq (estimated)

## Technical Specifications

### Model Architecture and Objective

- Mask R-CNN with Swin Transformer backbone
- Instance segmentation head for the data point class

### Compute Infrastructure

- **Hardware:** NVIDIA V100 GPU
- **Software:** PyTorch 1.13, MMDetection 2.x, Python 3.9

## Citation

**BibTeX:**

```bibtex
@article{DocFigure2021,
  title={DocFigure: A Dataset for Scientific Figure Classification},
  author={Afzal, S. and others},
  journal={arXiv preprint arXiv:2106.01841},
  year={2021}
}
```

**APA:**

Afzal, S., et al. (2021). DocFigure: A Dataset for Scientific Figure Classification. arXiv preprint arXiv:2106.01841.

## Glossary

- **Data Point:** An individual visual marker representing a value in a scientific chart (e.g., a dot in a scatter plot)

## More Information

- [DocFigure Paper](https://arxiv.org/abs/2106.01841)

## Model Card Authors

Hansheng Zhu

## Model Card Contact

hanszhu05@gmail.com