# OPUS-MT ONNX Model Hub
Mirror of VaishalBusiness/opus (VGT ONNX Model Hub): 1,000+ OPUS-MT translation models in ONNX format for fast inference.
## Highlights
- 1,000+ ONNX models from Helsinki-NLP / MarianMT
- Optimized for inference with ONNX Runtime
- Compatible with Hugging Face tokenizers
- Same layout and usage as the original VGT hub
## Repository structure
Each model is in its own folder, for example:
```
Helsinki-NLP-opus-mt-tc-base-bat-zle/
├── config.json
├── decoder_model.onnx
├── decoder_model_merged.onnx
├── decoder_with_past_model.onnx
├── encoder_model.onnx
├── generation_config.json
├── source.spm
├── target.spm
├── special_tokens_map.json
├── tokenizer_config.json
└── vocab.json
```
- `encoder_model.onnx` → encoder
- `decoder_with_past_model.onnx` → decoder with KV cache
- `decoder_model_merged.onnx` → merged decoder, both cached and uncached paths in one file (recommended for speed)
- `decoder_model.onnx` → base decoder, no KV cache
- `source.spm` / `target.spm`, `vocab.json` → tokenizer files
## Usage
### Dependencies

```bash
pip install huggingface_hub onnxruntime transformers sentencepiece
```
### Load and run a model

Replace `Helsinki-NLP-opus-mt-tc-base-bat-zle` with any model folder name from the repo.
```python
from huggingface_hub import snapshot_download
import onnxruntime as ort
from transformers import MarianTokenizer
import numpy as np

model_name = "Helsinki-NLP-opus-mt-tc-base-bat-zle"

# Download just this model's folder (use this repo id after renaming to opus-mt-onnx)
repo_id = "aoiandroid/opus-mt-onnx"  # or aoiandroid/opus if not renamed yet
model_dir = snapshot_download(
    repo_id=repo_id,
    allow_patterns=f"{model_name}/*",
)

# Load the tokenizer (uses source.spm / target.spm / vocab.json)
tokenizer = MarianTokenizer.from_pretrained(f"{model_dir}/{model_name}")

# Encode the input sentence
inputs = tokenizer("Hello, how are you?", return_tensors="np")

# Run the encoder
enc = ort.InferenceSession(f"{model_dir}/{model_name}/encoder_model.onnx")
enc_out = enc.run(
    None,
    {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"],
    },
)

# Run one step of the base decoder. Marian decoding starts from the pad token.
# Input names can vary between exports; check dec.get_inputs() if a name is
# rejected. Note that decoder_model_merged.onnx typically also expects past
# key/values and a use_cache_branch flag, so the plain decoder is the simplest
# one to call directly.
dec = ort.InferenceSession(f"{model_dir}/{model_name}/decoder_model.onnx")
decoder_input_ids = np.array([[tokenizer.pad_token_id]], dtype=np.int64)
out = dec.run(
    None,
    {
        "input_ids": decoder_input_ids,
        "encoder_attention_mask": inputs["attention_mask"],
        "encoder_hidden_states": enc_out[0],
    },
)
# out[0] holds next-token logits of shape (batch, sequence, vocab)
```
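A single decoder step only yields logits for the next token; producing a full translation means repeating that step with the growing prefix. The loop below is a minimal greedy-decoding sketch, written model-free so the logic is easy to follow: `step_fn` is a hypothetical stand-in for a call that runs the decoder on the current prefix and returns logits of shape `(1, seq, vocab)`; Marian decoding starts from the pad token and stops at EOS.

```python
import numpy as np


def greedy_decode(step_fn, start_id: int, eos_id: int, max_len: int = 64):
    """Greedy decoding loop without KV cache: re-feed the full prefix each
    step and append the argmax token until EOS or max_len is reached."""
    ids = [start_id]
    for _ in range(max_len):
        # step_fn runs the decoder on the prefix; logits: (1, seq, vocab)
        logits = step_fn(np.array([ids], dtype=np.int64))
        next_id = int(logits[0, -1].argmax())
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids[1:]  # drop the start token
```

With the sessions above, `step_fn` would wrap the `dec.run(...)` call on the growing `decoder_input_ids`, and the result would be detokenized with `tokenizer.decode(...)`. Re-running the whole prefix each step is the slow path; `decoder_with_past_model.onnx` (or the merged decoder) avoids it by caching past key/values.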
## Attribution
- Original ONNX hub: VaishalBusiness/opus (VGT ONNX Model Hub)
- Underlying models: Helsinki-NLP OPUS-MT / MarianMT
- ONNX conversions and hosting follow the original project's intent; the license of each model is unchanged.
## License
Each model keeps its original license. When using a model, cite the original authors and comply with their license and attribution requirements. This repository only provides a mirror of ONNX conversions.