# GOT-OCR2 ONNX Export (stepfun-ai/GOT-OCR2_0)
This directory contains ONNX exports produced from stepfun-ai/GOT-OCR2_0 using BaofengZan/GOT-OCRv2-onnx (llm-export).
## Contents
- `got_ocr2_vision_encoder.onnx` (+ `.onnx.data`): Vision encoder. Input: images; output: visual features. Ready for use with TranslateBlue's GOT-OCR2 ONNX integration.
- `got_ocr2_decoder.onnx` (optional): Single decoder ONNX. When present, full OCR runs: image → vision encoder → encoder_hidden_states → decoder (BOS then autoregressive) → logits → text. Produce it with `scripts/export_got_ocr2_decoder_onnx.py`.
- `tokenizer.json`, `tokenizer_config.json`: Tokenizer for decoding. Generated from the Hugging Face model vocab.
- Split decoder (for reference; the app does not load these as a single decoder):
  - `embedding.onnx`, `block_0.onnx` .. `block_23.onnx`, `lm.onnx`, `norm.onnx`, `mm_projector_vary.onnx` (each with `.onnx.data` where applicable).
## App compatibility
TranslateBlue's GOTOCR2OCRService expects two ONNX files:
- `got_ocr2_vision_encoder.onnx` — present; you can run the vision encoder in the app.
- `got_ocr2_decoder.onnx` — a single decoder ONNX that takes `decoder_input_ids` and `encoder_hidden_states` and outputs `logits`. When this file is present (e.g. from `scripts/export_got_ocr2_decoder_onnx.py`), full OCR runs: vision encoder → encoder_hidden_states → decoder (with BOS then autoregressive tokens) → logits → decoded text.
See Docs/GOT_OCR2_ONNX_Export.md for I/O names and export options.
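The decoder loop described above (feed BOS, then extend autoregressively from the last-step logits) can be sketched as follows. This is a minimal illustration, not the app's implementation: `stub_decoder`, the vocabulary size, and the token IDs are hypothetical stand-ins for the real ONNX decoder session and tokenizer.

```python
import numpy as np

def greedy_decode(decoder, encoder_hidden_states, bos_id, eos_id, max_tokens=32):
    """Greedy autoregressive loop: start from BOS, append the argmax token
    of the final logits step each iteration, stop at EOS or the length cap."""
    ids = [bos_id]
    for _ in range(max_tokens):
        # decoder stands in for an ONNX session call taking
        # decoder_input_ids and encoder_hidden_states, returning logits
        logits = decoder(np.array([ids], dtype=np.int64), encoder_hidden_states)
        next_id = int(np.argmax(logits[0, -1]))
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids

def stub_decoder(input_ids, enc):
    """Toy decoder: emits token 7 once, then EOS (id 2)."""
    vocab = 10
    logits = np.zeros((1, input_ids.shape[1], vocab))
    logits[0, -1, 7 if 7 not in input_ids[0] else 2] = 1.0
    return logits

print(greedy_decode(stub_decoder, None, bos_id=1, eos_id=2))  # → [1, 7, 2]
```

Swapping `stub_decoder` for a real `onnxruntime` session call (and mapping the returned IDs through the tokenizer) gives the full text-decoding step of the pipeline.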