Add `tokenizer_class` to `config.4.13.0.json`
#2
by SaulLu - opened
Hi!
I recently noticed that:
from transformers import LayoutXLMProcessor
processor = LayoutXLMProcessor.from_pretrained("microsoft/layoutxlm-base")
was logging the following message:
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LayoutLMv2Tokenizer'.
The class this function is called from is 'LayoutXLMTokenizerFast'.
and
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/layoutxlm-base")
print(type(tokenizer))
is printing:
transformers.models.layoutlmv2.tokenization_layoutlmv2_fast.LayoutLMv2TokenizerFast
I think this is because the tokenizer class is not specified in the configuration file, so the default tokenizer class is inferred from the model type, i.e. LayoutLMv2.
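To make the proposed change concrete, here is a minimal sketch of adding a `tokenizer_class` entry to a config file's JSON. It uses only the standard library. The helper name `add_tokenizer_class`, the trimmed-down config contents, and the value `"LayoutXLMTokenizer"` are assumptions for illustration, not the exact contents of this pull request.

```python
import json


def add_tokenizer_class(config_text: str, tokenizer_class: str) -> str:
    """Return the config JSON text with `tokenizer_class` set (hypothetical helper)."""
    config = json.loads(config_text)
    config["tokenizer_class"] = tokenizer_class
    return json.dumps(config, indent=2, sort_keys=True)


# Hypothetical, heavily trimmed stand-in for the checkpoint's config file.
original = '{"model_type": "layoutlmv2"}'

# Assumed value: the name of the tokenizer class the checkpoint should use.
patched = add_tokenizer_class(original, "LayoutXLMTokenizer")
print(patched)
```

With an explicit `tokenizer_class` in the config, the tokenizer class no longer has to be guessed from the model type, which is the behavior described above.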
What do you think?
nielsr changed pull request status to merged
Thanks for fixing this!