---
language:
- fa
metrics:
- f1
- accuracy
- precision
- recall
base_model:
- sbunlp/fabert
pipeline_tag: text-classification
tags:
- code
---
# **Fine-Tuned FaBERT Model for Formality Classification**

This repository contains a fine-tuned version of **FaBERT**, a pre-trained Persian language model, adapted for **formality classification**. The model classifies text as **formal** or **informal**, making it well suited to content moderation, social media monitoring, and customer support automation.

## **Model Overview**
- **Architecture:** Built on **FaBERT** (`sbunlp/fabert`), a BERT-based transformer pre-trained on Persian text.
- **Task:** **Formality Classification** – distinguishing between formal and informal language in text.
- **Fine-Tuning:** The model has been fine-tuned on a custom dataset containing a variety of formal and informal text.

## **Key Features**
- **Persian Language Support:** Built on FaBERT, a model pre-trained on Persian text, so it is tailored to formal/informal distinctions in Persian.
- **High Performance:** Fine-tuned to provide accurate predictions for formal vs. informal text classification.
- **Efficient for Deployment:** Suitable for real-time use in environments like social media platforms, content moderation tools, and communication systems.

|
## **How to Use the Model**

You can use this model in your Python code with the Hugging Face `transformers` library and PyTorch. The following snippet demonstrates how to tokenize text, run inference, and classify whether the text is formal or informal.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("faimlab/fabert_formality_classifier")
model = AutoModelForSequenceClassification.from_pretrained("faimlab/fabert_formality_classifier")

# Run the model on GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example input text (Persian: "Please review the attached report.")
input_text = "لطفاً گزارش پیوست را بررسی فرمایید."

# Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Move the input tensors to the same device as the model
inputs = {key: value.to(device) for key, value in inputs.items()}

# Make predictions
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Get the predicted class id (see model.config.id2label for the label names)
predicted_label = logits.argmax(dim=1).item()
print(f"Predicted Label: {predicted_label}")
```
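
If you want a confidence score alongside the predicted class, apply a softmax over the logits. A minimal sketch follows; the logit values here are illustrative stand-ins, not actual model output, and in a real run they would come from `outputs.logits` as in the snippet above:

```python
import torch
import torch.nn.functional as F

# Illustrative logits for a single input (replace with outputs.logits in practice)
logits = torch.tensor([[2.0, -1.0]])

# Softmax converts the raw logits into class probabilities that sum to 1
probs = F.softmax(logits, dim=1)

predicted_label = probs.argmax(dim=1).item()
confidence = probs[0, predicted_label].item()
print(predicted_label, round(confidence, 4))  # → 0 0.9526
```

Thresholding on `confidence` lets downstream systems (e.g. a moderation pipeline) route low-confidence predictions to a human reviewer instead of acting on them automatically.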