multilingual-e5-small Document Type V1 Classifier

A fine-tuned version of multilingual-e5-small (a BERT-architecture model, loaded as BertForSequenceClassification) for text classification: it assigns one of 17 document-type labels to an input text.

  • Model type: bert
  • Problem Type: single_label_classification
  • Number of Labels: 17
  • Vocabulary Size: 250037
  • License: MIT
  • Model Size: ~0.1B parameters (Safetensors, F32)

Use

To get started with this model in Python using the Hugging Face Transformers library, run the following code:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "agentlans/multilingual-e5-small-doc-type-v1-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Replace this with your input text."
# Truncate long inputs to the model's maximum sequence length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map its ID to a label name.
predicted_class_id = logits.argmax(dim=-1).item()
predicted_class_name = model.config.id2label[predicted_class_id]

print(f"Predicted Class ID: {predicted_class_id}")
print(f"Predicted Class Name: {predicted_class_name}")

Intended Uses & Limitations

Intended Use

This model classifies text into one of 17 document types. The class labels and their IDs are listed below; a scoring sketch follows the table.

Label ID   Label Name
0          Academic/Research
1          Adult
2          Code
3          E-Commerce
4          Government
5          Legal
6          Literary
7          Machine-Generated
8          Media
9          News/Editorial
10         Other
11         Personal
12         Promotional
13         Reference
14         Reviews
15         Search
16         Social
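
To score an input against all 17 classes instead of taking only the argmax, apply a softmax over the logits. A minimal sketch, reusing the model, tokenizer, and inputs from the quick-start code above:

import torch

with torch.no_grad():
    logits = model(**inputs).logits

# Softmax turns the raw logits into a probability over the 17 labels.
probs = torch.softmax(logits, dim=-1).squeeze(0)

# Report the three highest-scoring document types.
top = torch.topk(probs, k=3)
for score, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{model.config.id2label[idx]}: {score:.4f}")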

Training Details

Hyperparameters

The following hyperparameters were used during fine-tuning:

  • Learning Rate: 5e-05
  • Train Batch Size: 8
  • Eval Batch Size: 8
  • Optimizer: AdamW (torch fused)
  • Number of Epochs: 3
  • Mixed Precision: BF16

Optimization & Regularization

  • Gradient Accumulation Steps: 1
  • Learning Rate Scheduler: Linear
  • Warmup: none (0 warmup steps)
  • Weight Decay: 0.0
  • Max Gradient Norm: 1.0

Hardware & Reproducibility

  • Number of GPUs: 1
  • Seed: 42
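
For reference, the configuration above maps roughly onto the following Hugging Face TrainingArguments. This is a reconstruction from the listed values, not the original training script; the output directory is an assumption:

from transformers import TrainingArguments

# Approximate reconstruction of the fine-tuning configuration listed above.
training_args = TrainingArguments(
    output_dir="doc-type-v1-classifier",  # assumed; not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3.0,
    optim="adamw_torch_fused",
    bf16=True,
    gradient_accumulation_steps=1,
    lr_scheduler_type="linear",
    warmup_steps=0,
    weight_decay=0.0,
    max_grad_norm=1.0,
    seed=42,
)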

Training Results & Evaluation

During fine-tuning, the model achieved the following results on the evaluation set:

Metric                Value
Train Loss            0.3726
Validation Loss       0.6096
Validation F1 Score   0.8794
Total FLOPs           7.9063e+15
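
Note that validation loss rises from 0.4387 after epoch 1 to 0.6096 after epoch 3 even as validation F1 keeps improving (0.8656 → 0.8794; see the training logs below). The card does not state which F1 averaging was used. Below is a sketch of how a comparable score could be computed with scikit-learn, assuming weighted averaging; texts and labels are placeholder variables:

from sklearn.metrics import f1_score
import torch

# Placeholder evaluation data: texts and their integer label IDs (0-16).
texts = ["example document one", "example document two"]
labels = [0, 9]

enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    preds = model(**enc).logits.argmax(dim=-1).tolist()

# Weighted averaging is an assumption; the card only says "Validation F1 Score".
print(f1_score(labels, preds, average="weighted"))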

Speed Performance

  • Training Runtime: 1624.1756 seconds
  • Train Samples per Second: 295.512
  • Evaluation Runtime: 10.6093 seconds
  • Eval Samples per Second: 1886.082
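
These throughput figures are consistent with the training-log table below: 1624.18 s × 295.512 samples/s ≈ 4.8 × 10^5 training samples, which matches 59,997 steps × batch size 8 ≈ 480,000 examples, i.e. roughly 160,000 training examples per epoch over 3 epochs. Likewise, 10.61 s × 1886.08 samples/s ≈ 20,000 evaluation examples.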

Training Logs History

Step Epoch Learning Rate Training Loss Validation Loss Validation F1
500 0.025 4.9584e-05 1.2379 N/A N/A
1000 0.05 4.9167e-05 0.8651 N/A N/A
1500 0.075 4.8751e-05 0.7379 N/A N/A
2000 0.1 4.8334e-05 0.7292 N/A N/A
2500 0.125 4.7917e-05 0.696 N/A N/A
3000 0.15 4.7501e-05 0.711 N/A N/A
3500 0.175 4.7084e-05 0.6598 N/A N/A
4000 0.2 4.6667e-05 0.6057 N/A N/A
4500 0.225 4.6251e-05 0.585 N/A N/A
5000 0.25 4.5834e-05 0.5894 N/A N/A
5500 0.275 4.5417e-05 0.5759 N/A N/A
6000 0.3 4.5001e-05 0.5605 N/A N/A
6500 0.325 4.4584e-05 0.5548 N/A N/A
7000 0.35 4.4167e-05 0.5508 N/A N/A
7500 0.375 4.3751e-05 0.5182 N/A N/A
8000 0.4 4.3334e-05 0.5597 N/A N/A
8500 0.425 4.2917e-05 0.5342 N/A N/A
9000 0.45 4.2500e-05 0.5154 N/A N/A
9500 0.475 4.2084e-05 0.5101 N/A N/A
10000 0.5 4.1667e-05 0.5153 N/A N/A
10500 0.525 4.1250e-05 0.4962 N/A N/A
11000 0.55 4.0834e-05 0.5055 N/A N/A
11500 0.575 4.0417e-05 0.5289 N/A N/A
12000 0.6 4.0000e-05 0.5024 N/A N/A
12500 0.625 3.9584e-05 0.481 N/A N/A
13000 0.65 3.9167e-05 0.4843 N/A N/A
13500 0.675 3.8750e-05 0.4519 N/A N/A
14000 0.7 3.8334e-05 0.4829 N/A N/A
14500 0.725 3.7917e-05 0.4746 N/A N/A
15000 0.75 3.7500e-05 0.5123 N/A N/A
15500 0.775 3.7084e-05 0.5058 N/A N/A
16000 0.8 3.6667e-05 0.453 N/A N/A
16500 0.825 3.6250e-05 0.4604 N/A N/A
17000 0.85 3.5833e-05 0.4689 N/A N/A
17500 0.875 3.5417e-05 0.4689 N/A N/A
18000 0.9 3.5000e-05 0.4704 N/A N/A
18500 0.925 3.4583e-05 0.4367 N/A N/A
19000 0.95 3.4167e-05 0.451 N/A N/A
19500 0.975 3.3750e-05 0.4538 N/A N/A
19999 1.0 N/A N/A 0.4387 0.8656
20000 1.0 3.3333e-05 0.4367 N/A N/A
20500 1.025 3.2917e-05 0.3614 N/A N/A
21000 1.05 3.2500e-05 0.3757 N/A N/A
21500 1.075 3.2083e-05 0.3197 N/A N/A
22000 1.1 3.1667e-05 0.3649 N/A N/A
22500 1.125 3.1250e-05 0.3736 N/A N/A
23000 1.15 3.0833e-05 0.3325 N/A N/A
23500 1.175 3.0417e-05 0.3472 N/A N/A
24000 1.2 3.0000e-05 0.3513 N/A N/A
24500 1.225 2.9583e-05 0.3699 N/A N/A
25000 1.25 2.9166e-05 0.3847 N/A N/A
25500 1.275 2.8750e-05 0.3252 N/A N/A
26000 1.3 2.8333e-05 0.3573 N/A N/A
26500 1.325 2.7916e-05 0.3704 N/A N/A
27000 1.35 2.7500e-05 0.3269 N/A N/A
27500 1.375 2.7083e-05 0.3637 N/A N/A
28000 1.4 2.6666e-05 0.3503 N/A N/A
28500 1.425 2.6250e-05 0.3503 N/A N/A
29000 1.45 2.5833e-05 0.3246 N/A N/A
29500 1.475 2.5416e-05 0.3507 N/A N/A
30000 1.5 2.5000e-05 0.3274 N/A N/A
30500 1.525 2.4583e-05 0.3926 N/A N/A
31000 1.55 2.4166e-05 0.3445 N/A N/A
31500 1.575 2.3750e-05 0.3397 N/A N/A
32000 1.6 2.3333e-05 0.3337 N/A N/A
32500 1.625 2.2916e-05 0.3398 N/A N/A
33000 1.65 2.2499e-05 0.3457 N/A N/A
33500 1.675 2.2083e-05 0.3252 N/A N/A
34000 1.7 2.1666e-05 0.3691 N/A N/A
34500 1.725 2.1249e-05 0.3334 N/A N/A
35000 1.75 2.0833e-05 0.3363 N/A N/A
35500 1.775 2.0416e-05 0.3454 N/A N/A
36000 1.8 1.9999e-05 0.3189 N/A N/A
36500 1.825 1.9583e-05 0.3422 N/A N/A
37000 1.85 1.9166e-05 0.3355 N/A N/A
37500 1.875 1.8749e-05 0.3195 N/A N/A
38000 1.9 1.8333e-05 0.2937 N/A N/A
38500 1.925 1.7916e-05 0.3382 N/A N/A
39000 1.95 1.7499e-05 0.3509 N/A N/A
39500 1.975 1.7083e-05 0.3244 N/A N/A
39998 2.0 N/A N/A 0.515 0.8739
40000 2.0 1.6666e-05 0.3325 N/A N/A
40500 2.025 1.6249e-05 0.2202 N/A N/A
41000 2.05 1.5832e-05 0.2126 N/A N/A
41500 2.075 1.5416e-05 0.1978 N/A N/A
42000 2.1 1.4999e-05 0.2235 N/A N/A
42500 2.125 1.4582e-05 0.2285 N/A N/A
43000 2.15 1.4166e-05 0.2114 N/A N/A
43500 2.175 1.3749e-05 0.2401 N/A N/A
44000 2.2 1.3332e-05 0.2316 N/A N/A
44500 2.225 1.2916e-05 0.2356 N/A N/A
45000 2.25 1.2499e-05 0.2265 N/A N/A
45500 2.275 1.2082e-05 0.2156 N/A N/A
46000 2.3 1.1666e-05 0.1985 N/A N/A
46500 2.325 1.1249e-05 0.2341 N/A N/A
47000 2.35 1.0832e-05 0.2253 N/A N/A
47500 2.375 1.0416e-05 0.2155 N/A N/A
48000 2.4 9.9988e-06 0.1964 N/A N/A
48500 2.425 9.5821e-06 0.2406 N/A N/A
49000 2.45 9.1655e-06 0.2345 N/A N/A
49500 2.475 8.7488e-06 0.2179 N/A N/A
50000 2.5 8.3321e-06 0.2076 N/A N/A
50500 2.525 7.9154e-06 0.2387 N/A N/A
51000 2.55 7.4987e-06 0.2114 N/A N/A
51500 2.575 7.0820e-06 0.1916 N/A N/A
52000 2.6 6.6653e-06 0.2074 N/A N/A
52500 2.625 6.2486e-06 0.2133 N/A N/A
53000 2.65 5.8320e-06 0.2301 N/A N/A
53500 2.675 5.4153e-06 0.2216 N/A N/A
54000 2.7 4.9986e-06 0.2313 N/A N/A
54500 2.725 4.5819e-06 0.1916 N/A N/A
55000 2.75 4.1652e-06 0.2055 N/A N/A
55500 2.775 3.7485e-06 0.2059 N/A N/A
56000 2.8 3.3318e-06 0.2021 N/A N/A
56500 2.825 2.9151e-06 0.2075 N/A N/A
57000 2.85 2.4985e-06 0.1644 N/A N/A
57500 2.875 2.0818e-06 0.2023 N/A N/A
58000 2.9 1.6651e-06 0.2175 N/A N/A
58500 2.925 1.2484e-06 0.2073 N/A N/A
59000 2.95 8.3171e-07 0.2154 N/A N/A
59500 2.975 4.1502e-07 0.2132 N/A N/A
59997 3.0 N/A N/A 0.6096 0.8794

Framework Versions

  • Transformers: 5.0.0.dev0
  • PyTorch: 2.9.1+cu128