calculator_model_test

This model is a fine-tuned version of msanocki/calculator_model_test on an unspecified dataset (the auto-generated config lists it as None). It achieves the following results on the evaluation set:

  • Loss: 0.0039

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
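The hyperparameters above can be mirrored as a plain configuration dict. This is only a sketch: the field names follow the Hugging Face Trainer convention, but the original training script is not published, so the exact arguments used are an assumption.

```python
# Sketch of the training configuration reported on this card.
# Field names mirror Hugging Face TrainingArguments, but this is a plain
# dict for illustration -- the original training script is unknown.
training_config = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
}

# The results table below reports 600 optimizer steps over 100 epochs,
# i.e. 6 steps per epoch.
steps_per_epoch = 600 // training_config["num_train_epochs"]

# With a batch size of 512, that suggests roughly 6 * 512 = 3072 training
# examples (an upper bound; the last batch of an epoch may be partial).
approx_train_examples = steps_per_epoch * training_config["per_device_train_batch_size"]
print(steps_per_epoch, approx_train_examples)
```

The steps-per-epoch arithmetic is the one piece that can be checked directly against the results table: every epoch advances the step counter by exactly 6.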

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7263 | 1.0 | 6 | 0.5579 |
| 0.4040 | 2.0 | 12 | 0.3037 |
| 0.2737 | 3.0 | 18 | 0.1986 |
| 0.1990 | 4.0 | 24 | 0.1674 |
| 0.1645 | 5.0 | 30 | 0.1309 |
| 0.1471 | 6.0 | 36 | 0.1474 |
| 0.1606 | 7.0 | 42 | 0.1082 |
| 0.1205 | 8.0 | 48 | 0.1006 |
| 0.1211 | 9.0 | 54 | 0.1031 |
| 0.1126 | 10.0 | 60 | 0.0732 |
| 0.0970 | 11.0 | 66 | 0.1148 |
| 0.1128 | 12.0 | 72 | 0.0826 |
| 0.0897 | 13.0 | 78 | 0.0940 |
| 0.0881 | 14.0 | 84 | 0.0908 |
| 0.0800 | 15.0 | 90 | 0.0572 |
| 0.0738 | 16.0 | 96 | 0.0461 |
| 0.0721 | 17.0 | 102 | 0.0405 |
| 0.0568 | 18.0 | 108 | 0.0459 |
| 0.0583 | 19.0 | 114 | 0.0406 |
| 0.0520 | 20.0 | 120 | 0.0542 |
| 0.0617 | 21.0 | 126 | 0.0396 |
| 0.0564 | 22.0 | 132 | 0.0374 |
| 0.0592 | 23.0 | 138 | 0.0438 |
| 0.0570 | 24.0 | 144 | 0.0459 |
| 0.0549 | 25.0 | 150 | 0.0510 |
| 0.0570 | 26.0 | 156 | 0.0470 |
| 0.0498 | 27.0 | 162 | 0.0304 |
| 0.0429 | 28.0 | 168 | 0.0294 |
| 0.0391 | 29.0 | 174 | 0.0266 |
| 0.0339 | 30.0 | 180 | 0.0219 |
| 0.0354 | 31.0 | 186 | 0.0190 |
| 0.0293 | 32.0 | 192 | 0.0234 |
| 0.0276 | 33.0 | 198 | 0.0167 |
| 0.0295 | 34.0 | 204 | 0.0182 |
| 0.0228 | 35.0 | 210 | 0.0175 |
| 0.0245 | 36.0 | 216 | 0.0189 |
| 0.0335 | 37.0 | 222 | 0.0195 |
| 0.0377 | 38.0 | 228 | 0.0312 |
| 0.0387 | 39.0 | 234 | 0.0355 |
| 0.0414 | 40.0 | 240 | 0.0597 |
| 0.0527 | 41.0 | 246 | 0.0405 |
| 0.0417 | 42.0 | 252 | 0.0187 |
| 0.0341 | 43.0 | 258 | 0.0204 |
| 0.0339 | 44.0 | 264 | 0.0157 |
| 0.0288 | 45.0 | 270 | 0.0178 |
| 0.0249 | 46.0 | 276 | 0.0128 |
| 0.0237 | 47.0 | 282 | 0.0170 |
| 0.0246 | 48.0 | 288 | 0.0141 |
| 0.0202 | 49.0 | 294 | 0.0182 |
| 0.0262 | 50.0 | 300 | 0.0127 |
| 0.0234 | 51.0 | 306 | 0.0141 |
| 0.0213 | 52.0 | 312 | 0.0164 |
| 0.0184 | 53.0 | 318 | 0.0111 |
| 0.0130 | 54.0 | 324 | 0.0112 |
| 0.0140 | 55.0 | 330 | 0.0086 |
| 0.0144 | 56.0 | 336 | 0.0080 |
| 0.0110 | 57.0 | 342 | 0.0087 |
| 0.0100 | 58.0 | 348 | 0.0081 |
| 0.0095 | 59.0 | 354 | 0.0079 |
| 0.0119 | 60.0 | 360 | 0.0075 |
| 0.0099 | 61.0 | 366 | 0.0071 |
| 0.0081 | 62.0 | 372 | 0.0069 |
| 0.0081 | 63.0 | 378 | 0.0061 |
| 0.0112 | 64.0 | 384 | 0.0064 |
| 0.0087 | 65.0 | 390 | 0.0068 |
| 0.0099 | 66.0 | 396 | 0.0069 |
| 0.0078 | 67.0 | 402 | 0.0071 |
| 0.0096 | 68.0 | 408 | 0.0068 |
| 0.0084 | 69.0 | 414 | 0.0064 |
| 0.0080 | 70.0 | 420 | 0.0066 |
| 0.0068 | 71.0 | 426 | 0.0081 |
| 0.0065 | 72.0 | 432 | 0.0056 |
| 0.0058 | 73.0 | 438 | 0.0052 |
| 0.0049 | 74.0 | 444 | 0.0048 |
| 0.0047 | 75.0 | 450 | 0.0048 |
| 0.0069 | 76.0 | 456 | 0.0048 |
| 0.0109 | 77.0 | 462 | 0.0066 |
| 0.0068 | 78.0 | 468 | 0.0062 |
| 0.0071 | 79.0 | 474 | 0.0059 |
| 0.0059 | 80.0 | 480 | 0.0055 |
| 0.0058 | 81.0 | 486 | 0.0053 |
| 0.0054 | 82.0 | 492 | 0.0052 |
| 0.0049 | 83.0 | 498 | 0.0047 |
| 0.0041 | 84.0 | 504 | 0.0048 |
| 0.0043 | 85.0 | 510 | 0.0045 |
| 0.0042 | 86.0 | 516 | 0.0044 |
| 0.0038 | 87.0 | 522 | 0.0044 |
| 0.0036 | 88.0 | 528 | 0.0043 |
| 0.0039 | 89.0 | 534 | 0.0042 |
| 0.0039 | 90.0 | 540 | 0.0044 |
| 0.0043 | 91.0 | 546 | 0.0041 |
| 0.0038 | 92.0 | 552 | 0.0042 |
| 0.0038 | 93.0 | 558 | 0.0042 |
| 0.0038 | 94.0 | 564 | 0.0042 |
| 0.0034 | 95.0 | 570 | 0.0041 |
| 0.0040 | 96.0 | 576 | 0.0041 |
| 0.0042 | 97.0 | 582 | 0.0040 |
| 0.0038 | 98.0 | 588 | 0.0039 |
| 0.0050 | 99.0 | 594 | 0.0039 |
| 0.0037 | 100.0 | 600 | 0.0039 |
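A quick way to read the tail of this table is to scan for the earliest epoch that reaches the lowest validation loss. The snippet below is a sketch over a hand-transcribed subset of the rows above, not part of the original training code.

```python
# A few (epoch, validation_loss) rows transcribed from the table above.
history = [
    (95, 0.0041),
    (96, 0.0041),
    (97, 0.0040),
    (98, 0.0039),
    (99, 0.0039),
    (100, 0.0039),
]

# Earliest epoch achieving the minimum validation loss: sort by loss first,
# then by epoch, and take the smallest.
best_epoch, best_loss = min(history, key=lambda row: (row[1], row[0]))
print(best_epoch, best_loss)  # -> 98 0.0039
```

The minimum here, 0.0039, matches the evaluation loss reported at the top of the card; the curve effectively plateaus from epoch 98 onward.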

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model stats

  • Downloads last month: 165
  • Model size: 7.82M params
  • Tensor type: F32 (Safetensors)

Model tree for msanocki/calculator_model_test

The model tree cannot be built because the listed base model points back to this model itself.