GliteTech/DisamBertCrossEncoder-base / README.md

Latest commit: 8ae4e06 by PeteBleackley, "End of training"

	---
	library_name: transformers
	language:
	- en
	license: apache-2.0
	base_model: answerdotai/ModernBERT-base
	tags:
	- generated_from_trainer
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	- matthews_correlation
	model-index:
	- name: DisamBertCrossEncoder-base
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# DisamBertCrossEncoder-base

	This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base). The training dataset is not named in this card.
	After the final epoch (epoch 10), it achieves the following results on the evaluation set:
	- Loss: 0.9841
	- Precision: 0.6896
	- Recall: 0.6396
	- F1: 0.6636
	- Accuracy: 0.9412
	- Matthews Correlation: 0.6320
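
The reported figures can be cross-checked against one another. The sketch below assumes a binary task in which precision and recall refer to the positive class, which the card does not state. Under that assumption it recovers the F1 score from precision and recall, estimates the share of positive examples in the evaluation set, and recomputes the Matthews correlation. The function names are illustrative.

```python
import math

# Final-epoch values reported in this card.
PRECISION = 0.6896
RECALL = 0.6396
ACCURACY = 0.9412


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def confusion_rates(precision: float, recall: float, accuracy: float) -> dict:
    """Recover confusion-matrix cells as fractions of the evaluation set.

    Assumes binary labels with metrics computed on the positive class.
    With positive share p: TP = R*p, FP = TP*(1/P - 1), FN = p - TP,
    and accuracy = 1 - FP - FN, which gives p = (1 - A) / (1 - 2R + R/P).
    """
    positive_share = (1 - accuracy) / (1 - 2 * recall + recall / precision)
    tp = recall * positive_share
    fp = tp * (1 / precision - 1)
    fn = positive_share - tp
    tn = 1 - tp - fp - fn
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn, "positive_share": positive_share}


def matthews_corrcoef(tp: float, fp: float, fn: float, tn: float) -> float:
    """Matthews correlation coefficient from confusion-matrix cells."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator


rates = confusion_rates(PRECISION, RECALL, ACCURACY)
f1 = f1_score(PRECISION, RECALL)
mcc = matthews_corrcoef(rates["tp"], rates["fp"], rates["fn"], rates["tn"])
print(f"F1 = {f1:.4f}, positive share = {rates['positive_share']:.4f}, MCC = {mcc:.4f}")
```

Under these assumptions the recomputed F1 and Matthews correlation agree with the reported values to about three decimal places, and roughly 9% of evaluation examples are positive. A classifier that always predicts the negative class would then already reach about 91% accuracy, so F1 and Matthews correlation are the more informative figures here.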

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- gradient_accumulation_steps: 5
	- total_train_batch_size: 320
	- optimizer: fused AdamW (`OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: cosine
	- num_epochs: 10
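
As a rough sketch, the settings above map onto `transformers.TrainingArguments` keyword arguments as follows. Several things here are assumptions: the `output_dir` value is a placeholder, a single training device is assumed (which is what makes 64 × 5 equal the listed total of 320), and the card lists no warmup, so the schedule function assumes none. The total step count of 125510 is the final step in the training results table.

```python
import math

# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS = {
    "output_dir": "disambert-cross-encoder",  # hypothetical path, not from the card
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "gradient_accumulation_steps": 5,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 10,
}

NUM_DEVICES = 1  # assumption: the card does not state the device count
TOTAL_STEPS = 125510  # final step in the training results table


def effective_batch_size(kwargs: dict, num_devices: int) -> int:
    """Examples consumed per optimizer update."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def cosine_learning_rate(step: int, total_steps: int, peak_lr: float) -> float:
    """Half-cosine decay from peak_lr to zero, assuming no warmup."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


def build_training_arguments():
    """Create TrainingArguments; requires the transformers package."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


print("effective batch size:", effective_batch_size(TRAINING_KWARGS, NUM_DEVICES))
print("learning rate at the midpoint:",
      cosine_learning_rate(TOTAL_STEPS // 2, TOTAL_STEPS, TRAINING_KWARGS["learning_rate"]))
```

With zero warmup, the learning rate falls to half its peak value at the midpoint of training (step 62755, the end of epoch 5) and reaches zero at the final step.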

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Matthews Correlation |
	|:-------------:|:-----:|:------:|:---------------:|:---------:|:------:|:------:|:--------:|:--------------------:|
	| No log | 0 | 0 | 430.2531 | 0.0905 | 0.9978 | 0.1660 | 0.0911 | -0.0157 |
	| 0.0672 | 1.0 | 12551 | 0.1555 | 0.6786 | 0.5846 | 0.6281 | 0.9372 | 0.5960 |
	| 0.0550 | 2.0 | 25102 | 0.1447 | 0.7176 | 0.6813 | 0.6990 | 0.9468 | 0.6701 |
	| 0.0427 | 3.0 | 37653 | 0.1498 | 0.7690 | 0.6440 | 0.7010 | 0.9502 | 0.6772 |
	| 0.0309 | 4.0 | 50204 | 0.1779 | 0.6773 | 0.7011 | 0.6890 | 0.9426 | 0.6575 |
	| 0.0179 | 5.0 | 62755 | 0.2554 | 0.7021 | 0.6681 | 0.6847 | 0.9442 | 0.6543 |
	| 0.0092 | 6.0 | 75306 | 0.3257 | 0.6927 | 0.6637 | 0.6779 | 0.9428 | 0.6467 |
	| 0.0047 | 7.0 | 87857 | 0.4757 | 0.6674 | 0.6791 | 0.6732 | 0.9402 | 0.6403 |
	| 0.0022 | 8.0 | 100408 | 0.6664 | 0.6943 | 0.6440 | 0.6682 | 0.9420 | 0.6370 |
	| 0.0011 | 9.0 | 112959 | 0.8230 | 0.6872 | 0.6374 | 0.6613 | 0.9408 | 0.6295 |
	| 0.0009 | 10.0 | 125510 | 0.9841 | 0.6896 | 0.6396 | 0.6636 | 0.9412 | 0.6320 |
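
In the table, validation loss bottoms out at epoch 2 and rises steadily afterwards while training loss keeps falling, a pattern that usually indicates overfitting. The evaluation results quoted at the top of this card come from the final epoch, not the best one. A small sketch for picking the best epoch per metric from the logged values follows; the tuple layout and function name are illustrative, and the epoch 0 baseline row is left out.

```python
# (epoch, validation_loss, precision, recall, f1, accuracy, mcc), copied from the table.
COLUMNS = ("epoch", "validation_loss", "precision", "recall", "f1", "accuracy", "mcc")
ROWS = [
    (1, 0.1555, 0.6786, 0.5846, 0.6281, 0.9372, 0.5960),
    (2, 0.1447, 0.7176, 0.6813, 0.6990, 0.9468, 0.6701),
    (3, 0.1498, 0.7690, 0.6440, 0.7010, 0.9502, 0.6772),
    (4, 0.1779, 0.6773, 0.7011, 0.6890, 0.9426, 0.6575),
    (5, 0.2554, 0.7021, 0.6681, 0.6847, 0.9442, 0.6543),
    (6, 0.3257, 0.6927, 0.6637, 0.6779, 0.9428, 0.6467),
    (7, 0.4757, 0.6674, 0.6791, 0.6732, 0.9402, 0.6403),
    (8, 0.6664, 0.6943, 0.6440, 0.6682, 0.9420, 0.6370),
    (9, 0.8230, 0.6872, 0.6374, 0.6613, 0.9408, 0.6295),
    (10, 0.9841, 0.6896, 0.6396, 0.6636, 0.9412, 0.6320),
]


def best_epoch(metric: str, lower_is_better: bool = False) -> int:
    """Return the epoch with the best logged value for the given metric."""
    index = COLUMNS.index(metric)
    chooser = min if lower_is_better else max
    return chooser(ROWS, key=lambda row: row[index])[0]


for name in ("f1", "mcc", "accuracy"):
    print(f"best {name}: epoch {best_epoch(name)}")
print("lowest validation loss: epoch", best_epoch("validation_loss", lower_is_better=True))
```

With these numbers, epoch 3 has the best F1 (0.7010) and Matthews correlation (0.6772), compared with 0.6636 and 0.6320 at epoch 10. In a `Trainer` run, setting `load_best_model_at_end=True` together with `metric_for_best_model` is one way to keep such a checkpoint.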


	### Framework versions

	- Transformers 5.3.0
	- PyTorch 2.10.0+cu128
	- Datasets 4.5.0
	- Tokenizers 0.22.2