roberta-urdu-small
Overview
- Language model: roberta-urdu-small
- Model size: 125M
- Language: Urdu
- Training data: News data from Urdu news resources in Pakistan
About roberta-urdu-small
roberta-urdu-small is a masked language model for the Urdu language. It can be used through the Transformers fill-mask pipeline:
from transformers import pipeline
fill_mask = pipeline("fill-mask", model="urduhack/roberta-urdu-small", tokenizer="urduhack/roberta-urdu-small")
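Once the pipeline is built, a typical next step is to ask it for the most likely fillers of a masked position. The sketch below is one possible way to wrap that call; the helper names `load_fill_mask` and `top_predictions` are hypothetical and not part of the model card, and the helper assumes the input text contains exactly one mask token.

```python
MODEL_ID = "urduhack/roberta-urdu-small"


def load_fill_mask():
    """Build the fill-mask pipeline (downloads the model on first use)."""
    # Imported lazily so that top_predictions can be used without transformers.
    from transformers import pipeline

    return pipeline("fill-mask", model=MODEL_ID, tokenizer=MODEL_ID)


def top_predictions(fill_mask, text, k=3):
    """Return (token, score) pairs for the single masked position in `text`.

    `fill_mask` is any callable that behaves like a fill-mask pipeline:
    it accepts `top_k` and returns a list of dicts with "token_str" and "score".
    """
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), round(float(r["score"]), 4)) for r in results]
```

To try it, build the input from the tokenizer's own mask token rather than hard-coding it, for example `text = f"پاکستان کا دارالحکومت {fill_mask.tokenizer.mask_token} ہے۔"` with `fill_mask = load_fill_mask()`, and then call `top_predictions(fill_mask, text)`. The Urdu sentence is only an illustration; the predictions depend on the model and are not reproduced here.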
Training procedure
roberta-urdu-small was trained on an Urdu news corpus. The training data was normalized with the normalization module from urduhack to eliminate characters from other languages, such as Arabic.
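To give a sense of what this kind of normalization involves, here is a minimal, dependency-free sketch that replaces two Arabic code points with their Urdu counterparts. It is a hand-rolled illustration and not urduhack's implementation: the function name `replace_arabic_variants` and the two-entry mapping table are assumptions chosen for the example, and a real normalizer covers many more characters.

```python
# Hypothetical, deliberately tiny mapping: Arabic code point -> Urdu code point.
ARABIC_TO_URDU = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH (Urdu ye)
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH (Urdu kaf)
}

# str.translate expects a table keyed by code point integers.
_TABLE = str.maketrans(ARABIC_TO_URDU)


def replace_arabic_variants(text):
    """Replace the Arabic letters in ARABIC_TO_URDU with their Urdu forms."""
    return text.translate(_TABLE)
```

Applying a step like this to every training sentence before tokenization means the tokenizer sees a single code point per letter instead of several visually similar ones. Text passed to the model at inference time would presumably need the same treatment to match the training distribution.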
About Urduhack
Urduhack is a Natural Language Processing (NLP) library for the Urdu language. GitHub: https://github.com/urduhack/urduhack