calculator_model_test

This model is a fine-tuned version of msanocki/calculator_model_test on an unspecified dataset (the auto-generated config lists it as None). It achieves the following results on the evaluation set:

  • Loss: 0.0039

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
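The hyperparameters above can be mirrored as a plain configuration dict. This is only a sketch: the field names follow the Hugging Face Trainer convention, but the original training script is not published, so the exact arguments used are an assumption.

```python
# Sketch of the training configuration reported on this card.
# Field names mirror Hugging Face TrainingArguments, but this is a plain
# dict for illustration -- the original training script is unknown.
training_config = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
}

# The results table below reports 600 optimizer steps over 100 epochs,
# i.e. 6 steps per epoch.
steps_per_epoch = 600 // training_config["num_train_epochs"]

# With a batch size of 512, that suggests roughly 6 * 512 = 3072 training
# examples (an upper bound; the last batch of an epoch may be partial).
approx_train_examples = steps_per_epoch * training_config["per_device_train_batch_size"]
print(steps_per_epoch, approx_train_examples)
```

The steps-per-epoch arithmetic is the one piece that can be checked directly against the results table: every epoch advances the step counter by exactly 6.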

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7263 | 1.0 | 6 | 0.5579 |
| 0.4040 | 2.0 | 12 | 0.3037 |
| 0.2737 | 3.0 | 18 | 0.1986 |
| 0.1990 | 4.0 | 24 | 0.1674 |
| 0.1645 | 5.0 | 30 | 0.1309 |
| 0.1471 | 6.0 | 36 | 0.1474 |
| 0.1606 | 7.0 | 42 | 0.1082 |
| 0.1205 | 8.0 | 48 | 0.1006 |
| 0.1211 | 9.0 | 54 | 0.1031 |
| 0.1126 | 10.0 | 60 | 0.0732 |
| 0.0970 | 11.0 | 66 | 0.1148 |
| 0.1128 | 12.0 | 72 | 0.0826 |
| 0.0897 | 13.0 | 78 | 0.0940 |
| 0.0881 | 14.0 | 84 | 0.0908 |
| 0.0800 | 15.0 | 90 | 0.0572 |
| 0.0738 | 16.0 | 96 | 0.0461 |
| 0.0721 | 17.0 | 102 | 0.0405 |
| 0.0568 | 18.0 | 108 | 0.0459 |
| 0.0583 | 19.0 | 114 | 0.0406 |
| 0.0520 | 20.0 | 120 | 0.0542 |
| 0.0617 | 21.0 | 126 | 0.0396 |
| 0.0564 | 22.0 | 132 | 0.0374 |
| 0.0592 | 23.0 | 138 | 0.0438 |
| 0.0570 | 24.0 | 144 | 0.0459 |
| 0.0549 | 25.0 | 150 | 0.0510 |
| 0.0570 | 26.0 | 156 | 0.0470 |
| 0.0498 | 27.0 | 162 | 0.0304 |
| 0.0429 | 28.0 | 168 | 0.0294 |
| 0.0391 | 29.0 | 174 | 0.0266 |
| 0.0339 | 30.0 | 180 | 0.0219 |
| 0.0354 | 31.0 | 186 | 0.0190 |
| 0.0293 | 32.0 | 192 | 0.0234 |
| 0.0276 | 33.0 | 198 | 0.0167 |
| 0.0295 | 34.0 | 204 | 0.0182 |
| 0.0228 | 35.0 | 210 | 0.0175 |
| 0.0245 | 36.0 | 216 | 0.0189 |
| 0.0335 | 37.0 | 222 | 0.0195 |
| 0.0377 | 38.0 | 228 | 0.0312 |
| 0.0387 | 39.0 | 234 | 0.0355 |
| 0.0414 | 40.0 | 240 | 0.0597 |
| 0.0527 | 41.0 | 246 | 0.0405 |
| 0.0417 | 42.0 | 252 | 0.0187 |
| 0.0341 | 43.0 | 258 | 0.0204 |
| 0.0339 | 44.0 | 264 | 0.0157 |
| 0.0288 | 45.0 | 270 | 0.0178 |
| 0.0249 | 46.0 | 276 | 0.0128 |
| 0.0237 | 47.0 | 282 | 0.0170 |
| 0.0246 | 48.0 | 288 | 0.0141 |
| 0.0202 | 49.0 | 294 | 0.0182 |
| 0.0262 | 50.0 | 300 | 0.0127 |
| 0.0234 | 51.0 | 306 | 0.0141 |
| 0.0213 | 52.0 | 312 | 0.0164 |
| 0.0184 | 53.0 | 318 | 0.0111 |
| 0.0130 | 54.0 | 324 | 0.0112 |
| 0.0140 | 55.0 | 330 | 0.0086 |
| 0.0144 | 56.0 | 336 | 0.0080 |
| 0.0110 | 57.0 | 342 | 0.0087 |
| 0.0100 | 58.0 | 348 | 0.0081 |
| 0.0095 | 59.0 | 354 | 0.0079 |
| 0.0119 | 60.0 | 360 | 0.0075 |
| 0.0099 | 61.0 | 366 | 0.0071 |
| 0.0081 | 62.0 | 372 | 0.0069 |
| 0.0081 | 63.0 | 378 | 0.0061 |
| 0.0112 | 64.0 | 384 | 0.0064 |
| 0.0087 | 65.0 | 390 | 0.0068 |
| 0.0099 | 66.0 | 396 | 0.0069 |
| 0.0078 | 67.0 | 402 | 0.0071 |
| 0.0096 | 68.0 | 408 | 0.0068 |
| 0.0084 | 69.0 | 414 | 0.0064 |
| 0.0080 | 70.0 | 420 | 0.0066 |
| 0.0068 | 71.0 | 426 | 0.0081 |
| 0.0065 | 72.0 | 432 | 0.0056 |
| 0.0058 | 73.0 | 438 | 0.0052 |
| 0.0049 | 74.0 | 444 | 0.0048 |
| 0.0047 | 75.0 | 450 | 0.0048 |
| 0.0069 | 76.0 | 456 | 0.0048 |
| 0.0109 | 77.0 | 462 | 0.0066 |
| 0.0068 | 78.0 | 468 | 0.0062 |
| 0.0071 | 79.0 | 474 | 0.0059 |
| 0.0059 | 80.0 | 480 | 0.0055 |
| 0.0058 | 81.0 | 486 | 0.0053 |
| 0.0054 | 82.0 | 492 | 0.0052 |
| 0.0049 | 83.0 | 498 | 0.0047 |
| 0.0041 | 84.0 | 504 | 0.0048 |
| 0.0043 | 85.0 | 510 | 0.0045 |
| 0.0042 | 86.0 | 516 | 0.0044 |
| 0.0038 | 87.0 | 522 | 0.0044 |
| 0.0036 | 88.0 | 528 | 0.0043 |
| 0.0039 | 89.0 | 534 | 0.0042 |
| 0.0039 | 90.0 | 540 | 0.0044 |
| 0.0043 | 91.0 | 546 | 0.0041 |
| 0.0038 | 92.0 | 552 | 0.0042 |
| 0.0038 | 93.0 | 558 | 0.0042 |
| 0.0038 | 94.0 | 564 | 0.0042 |
| 0.0034 | 95.0 | 570 | 0.0041 |
| 0.0040 | 96.0 | 576 | 0.0041 |
| 0.0042 | 97.0 | 582 | 0.0040 |
| 0.0038 | 98.0 | 588 | 0.0039 |
| 0.0050 | 99.0 | 594 | 0.0039 |
| 0.0037 | 100.0 | 600 | 0.0039 |
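A quick way to read the tail of this table is to scan for the earliest epoch that reaches the lowest validation loss. The snippet below is a sketch over a hand-transcribed subset of the rows above, not part of the original training code.

```python
# A few (epoch, validation_loss) rows transcribed from the table above.
history = [
    (95, 0.0041),
    (96, 0.0041),
    (97, 0.0040),
    (98, 0.0039),
    (99, 0.0039),
    (100, 0.0039),
]

# Earliest epoch achieving the minimum validation loss: sort by loss first,
# then by epoch, and take the smallest.
best_epoch, best_loss = min(history, key=lambda row: (row[1], row[0]))
print(best_epoch, best_loss)  # -> 98 0.0039
```

The minimum here, 0.0039, matches the evaluation loss reported at the top of the card; the curve effectively plateaus from epoch 98 onward.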

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model stats

  • Downloads last month: 165
  • Model size: 7.82M params
  • Tensor type: F32 (Safetensors)

Model tree for msanocki/calculator_model_test

The model tree cannot be built because the listed base model points back to this model itself.