Can I use this model to extract text from an entire document?
Hey there, I am working on a PDF parsing project.
Is there a way to use this model to extract an entire page?
Or are there any other models capable of extracting text from images like these? (Don't mind the red rectangle.)
I tried other Python libraries and the results were bad.
P.S. Yes, I am using another model to detect tables and remove them in order to improve the parsing.
P.P.S. Yes, the image above is taken from "Attention Is All You Need", lol.
Maybe you can try LayoutLMv3, which can analyze document layout and help detect tables, titles, text, etc.
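For what it's worth, TrOCR is meant to read one cropped text line at a time, so a full page usually needs a two-stage setup: something finds the text lines, then TrOCR reads each crop. Below is a rough sketch of the second stage only, not a tested pipeline. The helper names `make_trocr_recognizer` and `ocr_page` are made up for illustration, the line boxes are assumed to come from whatever layout or text detector you already use, and the top-to-bottom ordering assumes a single-column page.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def make_trocr_recognizer(model_name: str = "microsoft/trocr-base-printed") -> Callable:
    """Return a function that reads ONE cropped text-line image with TrOCR."""
    # Imported lazily so the rest of this sketch works without these packages.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_name)
    model = VisionEncoderDecoderModel.from_pretrained(model_name)

    def recognize_line(line_image) -> str:
        # line_image is assumed to be a PIL image containing a single text line.
        inputs = processor(images=line_image.convert("RGB"), return_tensors="pt")
        generated_ids = model.generate(inputs.pixel_values)
        return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return recognize_line


def ocr_page(page_image, line_boxes: Iterable[Box], recognize_line: Callable) -> str:
    """Crop each detected line, read it, and join the results top to bottom."""
    # Naive reading order: sort by top edge, then left edge (single column assumed).
    ordered = sorted(line_boxes, key=lambda box: (box[1], box[0]))
    lines: List[str] = [recognize_line(page_image.crop(box)) for box in ordered]
    return "\n".join(lines)
```

To try it, you would open the page with Pillow (`Image.open(...)`), get the boxes from your detector, and call `ocr_page(page, boxes, make_trocr_recognizer())`. Multi-column pages would need a smarter ordering step than the simple sort used here.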
In the end, did you find a way to extract an entire page?
I'm trying.
