calculator_model_test

This model is a fine-tuned version of pt430187/calculator_model_test on an unspecified dataset (the card records the dataset name as None). The final validation loss logged on the evaluation set is 0.6437 (epoch 60, step 360).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 60
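
To make the linear lr_scheduler_type concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists none) and takes the total of 360 optimizer steps from the last row of the training log. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 1e-05,
              total_steps: int = 360,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear decay.

    Assumption: warmup_steps defaults to 0 because the card does not
    report any warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start, midpoint (epoch 30), and end of the 60-epoch run.
    for step in (0, 180, 360):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 1e-05, is halved by step 180, and reaches zero at step 360.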

Training Loss	Epoch	Step	Validation Loss
0.7873	1.0	6	0.7304
0.7842	2.0	12	0.7200
0.7695	3.0	18	0.7135
0.7704	4.0	24	0.7096
0.7592	5.0	30	0.7038
0.7514	6.0	36	0.7007
0.7428	7.0	42	0.6986
0.7328	8.0	48	0.6965
0.7498	9.0	54	0.6941
0.7548	10.0	60	0.6917
0.7523	11.0	66	0.6896
0.7469	12.0	72	0.6866
0.7519	13.0	78	0.6841
0.7429	14.0	84	0.6830
0.7311	15.0	90	0.6804
0.7241	16.0	96	0.6775
0.7400	17.0	102	0.6757
0.7224	18.0	108	0.6745
0.7311	19.0	114	0.6741
0.7377	20.0	120	0.6726
0.7249	21.0	126	0.6703
0.7326	22.0	132	0.6688
0.7181	23.0	138	0.6687
0.7384	24.0	144	0.6673
0.7146	25.0	150	0.6649
0.7232	26.0	156	0.6637
0.7190	27.0	162	0.6619
0.7250	28.0	168	0.6599
0.7236	29.0	174	0.6593
0.7261	30.0	180	0.6607
0.7203	31.0	186	0.6592
0.7278	32.0	192	0.6568
0.7066	33.0	198	0.6555
0.7183	34.0	204	0.6544
0.7074	35.0	210	0.6536
0.7265	36.0	216	0.6534
0.7120	37.0	222	0.6529
0.7215	38.0	228	0.6519
0.7147	39.0	234	0.6518
0.7211	40.0	240	0.6516
0.7143	41.0	246	0.6501
0.7069	42.0	252	0.6486
0.7063	43.0	258	0.6479
0.7090	44.0	264	0.6475
0.7055	45.0	270	0.6470
0.7021	46.0	276	0.6468
0.7142	47.0	282	0.6463
0.7211	48.0	288	0.6456
0.7098	49.0	294	0.6453
0.7150	50.0	300	0.6452
0.7147	51.0	306	0.6452
0.7114	52.0	312	0.6451
0.7076	53.0	318	0.6450
0.7286	54.0	324	0.6447
0.7008	55.0	330	0.6445
0.7004	56.0	336	0.6442
0.7087	57.0	342	0.6441
0.6944	58.0	348	0.6439
0.7045	59.0	354	0.6437
0.7096	60.0	360	0.6437
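
A few figures can be derived from the log above with simple arithmetic. The sketch below computes the relative drop in validation loss between epoch 1 and epoch 60, and a rough range for the training set size. The size estimate rests on assumptions the card does not confirm: a single device, no gradient accumulation, and the last partial batch being kept.

```python
# Values copied from the first and last rows of the training log.
first_val_loss = 0.7304   # epoch 1
last_val_loss = 0.6437    # epoch 60
total_steps = 360
num_epochs = 60
train_batch_size = 512

# Relative reduction in validation loss over the whole run.
relative_drop = (first_val_loss - last_val_loss) / first_val_loss

# Optimizer steps per epoch, as implied by the Step column.
steps_per_epoch = total_steps // num_epochs

# Assumption: one device, no gradient accumulation, partial last batch kept.
# Then 6 steps per epoch means 5 full batches plus at least one example,
# and at most 6 full batches.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

if __name__ == "__main__":
    print(f"relative drop in validation loss: {relative_drop:.1%}")
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"implied training examples: {min_examples} to {max_examples}")
```

The drop is modest (roughly 12 percent), and the validation loss changes very little over the last ten epochs, which is consistent with the learning rate approaching zero under the linear schedule.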

Safetensors: model size 7.8M params, tensor type F32
