---
language: fa
pipeline_tag: token-classification
library_name: transformers
---

# QomSSLab/Verdict_Splitter

This repository hosts an XLM-RoBERTa model with a trained token-classification head for Persian (`fa`) text. It assigns each token one of the labels listed in the Labels section.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the tokenizer and the trained model from the Hugging Face Hub.
model_id = "QomSSLab/Verdict_Splitter"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" groups adjacent tokens that share a label into one span.
tagger = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# The sample string means "An example of a Persian input".
text = "مثال از یک ورودی فارسی"
for entity in tagger(text):
    print(entity)
```
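
With an aggregation strategy set, each item returned by the pipeline is a dict with `entity_group`, `score`, `word`, `start`, and `end` keys. As a minimal sketch of how those spans could be turned into labelled sections, the helper below merges consecutive spans that carry the same label and slices the original text by character offsets. The name `split_into_sections` and the sample data are hypothetical and not part of this repository. Note that the pipeline skips tokens tagged `O` by default, so gaps between sections are possible.

```python
from typing import Dict, List


def split_into_sections(text: str, entities: List[dict]) -> List[Dict[str, str]]:
    """Merge consecutive same-label spans and return the text of each section."""
    sections: List[dict] = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        label = ent["entity_group"]
        if sections and sections[-1]["label"] == label:
            # Same label as the previous span: extend it instead of starting a new one.
            sections[-1]["end"] = ent["end"]
        else:
            sections.append({"label": label, "start": ent["start"], "end": ent["end"]})
    return [{"label": s["label"], "text": text[s["start"]:s["end"]]} for s in sections]


# Hand-written spans standing in for real `tagger(text)` output (hypothetical data).
sample_text = "aaa bbb ccc ddd"
sample_entities = [
    {"entity_group": "مقدمه", "score": 0.99, "word": "aaa", "start": 0, "end": 3},
    {"entity_group": "مقدمه", "score": 0.98, "word": "bbb", "start": 4, "end": 7},
    {"entity_group": "تصمیم", "score": 0.97, "word": "ccc ddd", "start": 8, "end": 15},
]
sections = split_into_sections(sample_text, sample_entities)
for section in sections:
    print(section["label"], "->", section["text"])
```

In real use, `split_into_sections(text, tagger(text))` would take the place of the hand-written sample.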

## Labels

- `O`
- `استدلال`
- `تصمیم`
- `خارج`
- `خلع`
- `مقدمه`
- `پایانی`

## Metrics

### Validation Metrics

- Precision: 0.7430
- Recall: 0.8457
- F1: 0.7910
- Accuracy: 0.9545
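
As a quick sanity check on the figures above, F1 can be recomputed from the reported precision and recall. The snippet assumes the usual definition of F1 as the harmonic mean of the two; the card does not state how the overall scores were computed.

```python
precision = 0.7430
recall = 0.8457

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.4f}")  # prints 0.7910
```

The result agrees with the reported F1 to four decimal places.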

### Per-label Breakdown

| Label | Precision | Recall | F1 | Support |
| --- | --- | --- | --- | --- |
| O | 0.8468 | 0.7995 | 0.8225 | 394 |
| استدلال | 0.9754 | 0.8776 | 0.9239 | 6635 |
| تصمیم | 0.9917 | 0.9608 | 0.9760 | 5361 |
| خارج | 1.0000 | 1.0000 | 1.0000 | 0 |
| خلع | 1.0000 | 1.0000 | 1.0000 | 0 |
| مقدمه | 0.9279 | 0.9982 | 0.9618 | 10871 |
| پایانی | 0.9728 | 0.9902 | 0.9814 | 1732 |

The labels `خارج` and `خلع` have a support of 0, meaning they do not occur in the validation set, so their scores of 1.0000 are not informative.
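
The per-label table can be cross-checked against the overall accuracy. If the rows are token-level scores (an assumption, since the card does not say how they were computed), then the support-weighted mean of per-label recall should equal accuracy. The sketch below copies recall and support from the table and leaves out the two zero-support rows.

```python
# (label, recall, support) copied from the per-label table; zero-support rows are omitted.
rows = [
    ("O", 0.7995, 394),
    ("استدلال", 0.8776, 6635),
    ("تصمیم", 0.9608, 5361),
    ("مقدمه", 0.9982, 10871),
    ("پایانی", 0.9902, 1732),
]

total_support = sum(support for _, _, support in rows)
# Recall times support approximates the number of correctly labelled items per label.
weighted_recall = sum(recall * support for _, recall, support in rows) / total_support

print(total_support)             # prints 24993
print(f"{weighted_recall:.4f}")  # prints 0.9545
```

The weighted recall matches the reported accuracy of 0.9545, which is consistent with the token-level reading but does not prove it.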