AraGPT2: Pre-Trained Transformer for Arabic Language Generation
Paper • 2012.15520 • Published
How to use aubmindlab/aragpt2-mega-detector-long with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="aubmindlab/aragpt2-mega-detector-long")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("aubmindlab/aragpt2-mega-detector-long")
model = AutoModelForSequenceClassification.from_pretrained("aubmindlab/aragpt2-mega-detector-long")
Machine-generated text detector model from the AraGPT2: Pre-Trained Transformer for Arabic Language Generation paper.
This model was trained on long text passages and achieves a 99.4% F1 score.
from transformers import pipeline
from arabert.preprocess import ArabertPreprocessor

# Preprocess the input text before classification
processor = ArabertPreprocessor(model="aubmindlab/araelectra-base-discriminator")
# "sentiment-analysis" is an alias of the "text-classification" pipeline task
pipe = pipeline("sentiment-analysis", model="aubmindlab/aragpt2-mega-detector-long")

text = " "  # replace with the Arabic text to classify
text_prep = processor.preprocess(text)
result = pipe(text_prep)
# Example output:
# [{'label': 'machine-generated', 'score': 0.9977743625640869}]
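The pipeline returns a list of dictionaries, each with a `label` and a `score`. The following is a minimal sketch of how that result could be turned into a yes/no decision. The helper name `is_machine_generated` and the `threshold` default are assumptions made for illustration and are not part of the model or the library. Only the `machine-generated` label string is taken from the example output above.

```python
from typing import Dict, List

# Label string as it appears in the example output above.
MACHINE_LABEL = "machine-generated"


def is_machine_generated(result: List[Dict], threshold: float = 0.9) -> bool:
    """Return True if the top prediction is the machine-generated label
    with a score at or above `threshold`.

    `result` is the list of dicts returned by the pipeline for one text.
    `threshold` is a hypothetical cut-off; tune it for your own data.
    """
    if not result:
        return False
    top = result[0]
    return top["label"] == MACHINE_LABEL and top["score"] >= threshold


# Result copied from the example output above.
sample = [{"label": "machine-generated", "score": 0.9977743625640869}]
print(is_machine_generated(sample))
```

A stricter threshold trades recall for precision: with `threshold=0.999` the same sample would no longer be flagged.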
@misc{antoun2020aragpt2,
title={AraGPT2: Pre-Trained Transformer for Arabic Language Generation},
author={Wissam Antoun and Fady Baly and Hazem Hajj},
year={2020},
eprint={2012.15520},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Wissam Antoun: LinkedIn | Twitter | GitHub | wfa07@mail.aub.edu | wissam.antoun@gmail.com
Fady Baly: LinkedIn | Twitter | GitHub | fgb06@mail.aub.edu | baly.fady@gmail.com