nuha-multiclass
Model Summary
nuha-multiclass is an Arabic text classifier that categorises Jordanian social media comments into three classes based on the NUHA methodology for online gender-based violence (OGBV). It fine-tunes nuha-mlm — a domain-adapted Arabic BERT — and outputs one of:
| Label | Meaning |
|---|---|
| `Not Online Violence` | Comments that are not hate speech |
| `Offensive Language` | Hate speech characterised by irony or sarcasm |
| `Gender Based Violence` | Direct hate speech targeting gender (the primary focus of NUHA) |
This model was developed as part of a pilot proof-of-concept for the NUHA project by the Jordan Open Source Association (JOSA). It is the production model behind the NUHA analysis platform.
A lightweight, ONNX-optimised 4-layer classifier trained on the same task is available at thejosango/nuha.
For a simpler binary classifier (hate speech / non-hate speech), see nuha-binary.
Uses
Direct Use
Classifying Arabic social media comments for online gender-based violence, particularly for Jordanian Arabic content from Facebook and X (Twitter).
from transformers import pipeline
classifier = pipeline(
    "text-classification",
    model="thejosango/nuha-multiclass",
    tokenizer="thejosango/nuha-multiclass",
)
result = classifier("اخرسي يا غبية")
print(result)
# [{'label': 'Gender Based Violence', 'score': ...}]
For batch inference:
comments = ["يعطيكم العافية", "أنتِ ساحرة", "اخرسي يا غبية"]
results = classifier(comments)
for comment, result in zip(comments, results):
    print(f"{result['label']} ({result['score']:.2f}): {comment}")
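When classifying a batch, it can help to summarise how the predicted labels are distributed. The sketch below assumes `results` has the list-of-dicts shape shown above. `label_shares` is a hypothetical helper name, and the sample scores are made up for illustration, not real model output.

```python
from collections import Counter


def label_shares(results):
    """Return each predicted label's share of the batch (hypothetical helper)."""
    counts = Counter(r["label"] for r in results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Made-up predictions in the pipeline's output shape, for illustration only.
sample_results = [
    {"label": "Not Online Violence", "score": 0.91},
    {"label": "Gender Based Violence", "score": 0.67},
    {"label": "Not Online Violence", "score": 0.58},
    {"label": "Gender Based Violence", "score": 0.88},
]
print(label_shares(sample_results))
```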
Using the ONNX Version
For faster CPU inference, use the ONNX export:
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline
model = ORTModelForSequenceClassification.from_pretrained("thejosango/nuha")
tokenizer = AutoTokenizer.from_pretrained("thejosango/nuha")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
Out-of-Scope Use
- Other Arabic dialects: The model was trained primarily on Jordanian Arabic. Performance on Egyptian, Gulf, or Modern Standard Arabic is not validated.
- Other hate speech targets: NUHA is calibrated for online gender-based violence. It is not designed to detect hate speech targeting race, religion, or other demographics.
- High-stakes automated decisions: Given the moderate performance (F1 ≈ 0.54) and pilot nature of this work, the model should not be used as the sole decision-maker in content moderation systems without human review.
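One way to keep a human in the loop is to route predictions to a review queue instead of acting on them directly. The following is a minimal sketch under stated assumptions: `triage` and `REVIEW_THRESHOLD` are hypothetical names, and the 0.80 threshold is an arbitrary placeholder that has not been calibrated on NUHA data.

```python
NON_VIOLENT = "Not Online Violence"
REVIEW_THRESHOLD = 0.80  # hypothetical value, not calibrated on NUHA data


def triage(comments, results, threshold=REVIEW_THRESHOLD):
    """Split predictions into a human-review queue and an auto-cleared list.

    Only confident non-violent predictions are cleared automatically;
    every flagged or low-confidence comment goes to a human reviewer.
    """
    needs_review, auto_cleared = [], []
    for comment, result in zip(comments, results):
        confident = result["score"] >= threshold
        if result["label"] == NON_VIOLENT and confident:
            auto_cleared.append(comment)
        else:
            needs_review.append((comment, result["label"], result["score"]))
    return needs_review, auto_cleared
```

Because the reported recall is low, even confidently cleared comments may deserve periodic spot checks.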
Bias, Risks, and Limitations
- Pilot annotation quality: Training labels were produced in an exploratory annotation effort with variable inter-annotator agreement. The model inherits noise from that process, which is reflected in the moderate F1 score.
- Three-class difficulty: Distinguishing `Offensive Language` from `Gender Based Violence` is a genuinely difficult subtask. The `Offensive Language` class is small (≈2% of training data) and the model may struggle with it.
- Colloquial Arabic only: The aggressive text cleaning (Arabic-only filtering) means the model has never seen URLs, numbers, punctuation, or Latin-script text.
- Imbalanced classes: The training data is dominated by `Not Online Violence` (≈59%), with `Offensive Language` being very sparse (≈2%). Data augmentation was applied but class imbalance remains a factor.
Training Details
Training Data
Fine-tuned on the methodology configuration of thejosango/nuha-dataset, which applies the three-class NUHA categorisation scheme to the original annotations.
Preprocessing
At training and inference time, the following normalisation is applied to input text (in addition to the dataset-level Arabic-only filtering):
- URLs replaced with the `[رابط]` ("link") token
- @mentions replaced with the `[مستخدم]` ("user") token
- Email addresses replaced with the `[بريد]` ("mail") token
- Numbers removed
- Punctuation removed
- Arabic diacritics (harakat) removed
- Whitespace normalised
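The steps above can be sketched with regular expressions. This is not the project's actual preprocessing code: the patterns, the order of operations, the choice to replace punctuation with a space before collapsing whitespace, and the `normalise` name are all assumptions made for illustration.

```python
import re

URL_TOKEN, USER_TOKEN, EMAIL_TOKEN = "[رابط]", "[مستخدم]", "[بريد]"
TOKENS = (URL_TOKEN, USER_TOKEN, EMAIL_TOKEN)

# Assumed patterns; the real pipeline may match these differently.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"@\w+")
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0670]")  # harakat and dagger alif
DIGITS_RE = re.compile(r"\d+")  # \d also matches Arabic-Indic digits
PUNCT_RE = re.compile(r"[^\w\s]|_")
TOKEN_SPLIT_RE = re.compile("(" + "|".join(re.escape(t) for t in TOKENS) + ")")


def normalise(text: str) -> str:
    """Apply the listed normalisation steps (illustrative sketch only)."""
    # Emails first, so their "@domain" part is not mistaken for a mention.
    text = EMAIL_RE.sub(f" {EMAIL_TOKEN} ", text)
    text = URL_RE.sub(f" {URL_TOKEN} ", text)
    text = MENTION_RE.sub(f" {USER_TOKEN} ", text)

    # Clean everything except the placeholder tokens, whose brackets
    # would otherwise be stripped as punctuation.
    cleaned = []
    for part in TOKEN_SPLIT_RE.split(text):
        if part not in TOKENS:
            part = DIACRITICS_RE.sub("", part)
            part = DIGITS_RE.sub("", part)
            part = PUNCT_RE.sub(" ", part)
        cleaned.append(part)

    # Collapse runs of whitespace and trim the ends.
    return " ".join("".join(cleaned).split())
```

Splitting on the placeholder tokens before stripping punctuation is a design choice of this sketch: it keeps the bracketed tokens intact without needing a separate sentinel value.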
Hyperparameters
| Parameter | Value |
|---|---|
| Base model | thejosango/nuha-mlm |
| Hidden layers | 12 (full depth) |
| Learning rate | 5e-5 |
| LR schedule | Constant |
| Batch size | 64 |
| Epochs | 5 |
| Weight decay | 0.0 |
| Label smoothing | 0.1 |
| Weighted loss | No |
| Data augmentation | Yes (contextual word substitution, ratio 0.75) |
| Framework | Transformers 4.32.1, PyTorch 2.0.1 |
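To illustrate what the label smoothing value of 0.1 does, the sketch below computes a smoothed cross-entropy loss for a single example. It assumes the common formulation in which the one-hot target is mixed with a uniform distribution over the classes; the source does not state which formulation was used in training, and `label_smoothed_ce` is a hypothetical name.

```python
import math


def label_smoothed_ce(logits, target, epsilon=0.1):
    """Cross-entropy against a target smoothed towards the uniform distribution.

    loss = (1 - epsilon) * NLL(true class) + epsilon * mean over classes of -log p
    """
    # Numerically stable log-softmax.
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    log_probs = [x - log_z for x in logits]

    nll = -log_probs[target]
    uniform_term = -sum(log_probs) / len(log_probs)
    return (1 - epsilon) * nll + epsilon * uniform_term


# A very confident, correct prediction over three classes (made-up logits).
confident_logits = [10.0, 0.0, 0.0]
plain = label_smoothed_ce(confident_logits, target=0, epsilon=0.0)
smoothed = label_smoothed_ce(confident_logits, target=0, epsilon=0.1)
print(f"plain: {plain:.4f}, smoothed: {smoothed:.4f}")
```

With smoothing, an extremely confident prediction is still penalised, which discourages over-confident outputs on noisy labels.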
Evaluation Results
Evaluated on the validation split of thejosango/nuha-dataset (methodology configuration):
| Metric | Value |
|---|---|
| F1 (macro) | 0.5363 |
| Precision | 0.6660 |
| Recall | 0.5188 |
| Loss | 0.7126 |
The lower recall relative to precision suggests the model is conservative — it tends to under-predict Gender Based Violence rather than over-predict it. This reflects both the difficulty of the three-class task and the limited size of the pilot training corpus.
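The following sketch shows how macro-averaged precision, recall, and F1 behave on an imbalanced three-class problem. The labels in `y_true` and `y_pred` are invented toy data, not drawn from the NUHA dataset, and the sketch assumes the reported precision and recall are macro-averaged like F1, which the source does not state.

```python
LABELS = ("Not Online Violence", "Offensive Language", "Gender Based Violence")


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} computed one-vs-rest."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of each metric, so every class counts equally."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


N, O, G = LABELS
# Toy data: a model that over-predicts the majority class.
y_true = [N, N, N, N, N, N, O, G, G, G]
y_pred = [N, N, N, N, N, N, N, G, N, N]

precision, recall, f1 = macro_average(per_class_scores(y_true, y_pred))
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

In this toy case accuracy looks reasonable while macro recall and F1 are pulled down by the missed minority classes, the same pattern of precision exceeding recall seen in the table above.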
This model was developed as part of an initial pilot study. Performance metrics reflect the complexity of the task and the proof-of-concept nature of this system.