How to use from the Transformers library
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("image-to-text", model="PCS/Extract_Matic")
# Load model directly
from transformers import AutoTokenizer, AutoModelForImageTextToText

tokenizer = AutoTokenizer.from_pretrained("PCS/Extract_Matic")
model = AutoModelForImageTextToText.from_pretrained("PCS/Extract_Matic")

Sparrow - Data extraction from documents with ML

This model is the Donut base model fine-tuned on invoice data. It aims to test how well Donut performs on enterprise documents.

Mean accuracy on test set: 0.96

Inference:

Inference Results
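A minimal inference sketch follows, based on the usual Donut pattern in Transformers (`DonutProcessor` plus `VisionEncoderDecoderModel`). The task prompt `<s_cord-v2>` is an assumption: this card does not state which start token the model expects, so adjust it if the output looks wrong. The function name `extract_fields` and the helper `strip_task_token` are illustrative, not part of Sparrow.

```python
import re

MODEL_ID = "PCS/Extract_Matic"
# Assumption: the task start token used by Donut's CORD-v2 checkpoints.
# This card does not state which prompt the model was fine-tuned with.
TASK_PROMPT = "<s_cord-v2>"


def strip_task_token(sequence: str) -> str:
    """Remove the first tag in the decoded sequence (the task start token)."""
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def extract_fields(image_path: str, model_id: str = MODEL_ID,
                   task_prompt: str = TASK_PROMPT) -> dict:
    """Run Donut on one document image and return the parsed fields."""
    # Heavy imports stay local so the helper above works without them.
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # Donut expects an RGB image; the processor resizes and normalizes it.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The decoder is primed with the task prompt instead of a text question.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=model.decoder.config.max_position_embeddings,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
        use_cache=True,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
        return_dict_in_generate=True,
    )

    # Decode, drop padding/end tokens, then convert the tag sequence to a dict.
    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "")
    sequence = sequence.replace(processor.tokenizer.pad_token, "")
    return processor.token2json(strip_task_token(sequence))
```

Calling `extract_fields("invoice.jpg")` (a hypothetical local file) should return a nested dictionary of extracted fields, provided the prompt assumption holds for this checkpoint.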

Training loss:

Training Loss

Sparrow on GitHub

Sample invoice documents for inference (documents up to 500 were used for fine-tuning; use documents from 500 onward for inference)
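To avoid evaluating on documents the model saw during fine-tuning, the split above can be applied by filtering on the document number. The sketch below assumes the number appears in the file name (for example `invoice_512.jpg`, a hypothetical naming scheme); the note does not say on which side document 500 itself falls, so this version skips it to be safe.

```python
import re
from pathlib import Path


def doc_index(name: str):
    """Return the last run of digits in the file stem as an int, or None."""
    numbers = re.findall(r"\d+", Path(name).stem)
    return int(numbers[-1]) if numbers else None


def held_out_docs(names, boundary: int = 500):
    """Keep only documents numbered strictly above the fine-tuning boundary."""
    selected = []
    for name in names:
        idx = doc_index(name)
        # Files without a number, or at/below the boundary, are excluded.
        if idx is not None and idx > boundary:
            selected.append(name)
    return sorted(selected, key=doc_index)
```

If document 500 is known to be unused in fine-tuning, pass `boundary=499` to include it.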

Our website KatanaML

On Twitter

