How to use from the Transformers library
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="HeNLP/HeRo")
```

```python
# Load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("HeNLP/HeRo")
model = AutoModelForMaskedLM.from_pretrained("HeNLP/HeRo")
```
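As a minimal sketch of the pipeline in use (assuming the `transformers` package is installed and the model weights can be downloaded; the Hebrew prompt is illustrative), the fill-mask helper predicts candidates for a masked token:

```python
from transformers import pipeline

# High-level fill-mask helper; downloads HeNLP/HeRo on first use
pipe = pipeline("fill-mask", model="HeNLP/HeRo")

# Mask one token in a Hebrew sentence; the mask string comes from
# the model's own tokenizer (for RoBERTa-style models, "<mask>")
predictions = pipe(f"שלום {pipe.tokenizer.mask_token}")

# Each prediction carries the filled token and a confidence score
for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```

The pipeline returns the top candidates sorted by score, so the first entry is the model's best guess for the masked position.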

Hebrew Language Model

State-of-the-art RoBERTa language model for Hebrew.

How to use

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('HeNLP/HeRo')
model = AutoModelForMaskedLM.from_pretrained('HeNLP/HeRo')

# Tokenization example
tokenized_string = tokenizer('שלום לכולם')

# Decoding back to text
decoded_string = tokenizer.decode(tokenized_string['input_ids'], skip_special_tokens=True)
```
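For lower-level control than the pipeline offers, the same masked-token prediction can be done by hand from the model's logits. A minimal sketch, assuming `torch` is installed and the weights can be downloaded (the Hebrew prompt is illustrative):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('HeNLP/HeRo')
model = AutoModelForMaskedLM.from_pretrained('HeNLP/HeRo')
model.eval()

# Build a prompt containing the tokenizer's mask token
inputs = tokenizer(f"שלום {tokenizer.mask_token}", return_tensors="pt")

# Forward pass without gradient tracking
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and take the highest-scoring vocabulary id
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(top_id))
```

Taking `topk` instead of `argmax` over the mask-position logits would recover the ranked candidate list that the fill-mask pipeline returns.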

Citing

If you use HeRo in your research, please cite HeRo: RoBERTa and Longformer Hebrew Language Models.

```bibtex
@article{shalumov2023hero,
      title={HeRo: RoBERTa and Longformer Hebrew Language Models},
      author={Vitaly Shalumov and Harel Haskey},
      year={2023},
      journal={arXiv:2304.11077},
}
```
Model size: 0.1B parameters (Safetensors; tensor types I64 and F32).