This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).
This model was merged using the [Linear](https://arxiv.org/abs/2203.05482) merge method (model soups: averaging the weights of multiple fine-tuned models), with google/gemma-4-E2B-it as the base model.
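In a linear merge, each tensor of the output model is a weighted average of the corresponding tensors from the input checkpoints; with `normalize: true`, the weights are first rescaled to sum to 1. A minimal sketch of the rule in PyTorch (illustrative only, not mergekit's actual implementation):

```python
import torch

def linear_merge(state_dicts, weights, normalize=True):
    """Weighted average of matching tensors across checkpoints (model soups)."""
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]  # rescale weights to sum to 1
    merged = {}
    for name in state_dicts[0]:
        acc = sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
        merged[name] = acc.to(torch.bfloat16)  # matches `dtype: bfloat16` below
    return merged

# With the configuration below, every parameter p of the merged model is
#   p = 0.32 * p_base + 0.68 * p_abliterated
# (the weights already sum to 1.0, so normalization changes nothing here).
```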
The following models were included in the merge:

* WWTCyberLab/gemma-4-E2B-it-abliterated
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: google/gemma-4-E2B-it                    # base instruction-tuned model
    parameters:
      density: 0.5    # not used by the linear method; only weight matters here
      weight: 0.32
  - model: WWTCyberLab/gemma-4-E2B-it-abliterated   # abliterated variant
    parameters:
      density: 0.5
      weight: 0.68
merge_method: linear
base_model: google/gemma-4-E2B-it
parameters:
  normalize: true          # rescale the weights to sum to 1 before averaging
dtype: bfloat16            # output tensors are stored in bfloat16
tokenizer_source: base     # reuse the base model's tokenizer
```
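Assuming the configuration above is saved as `config.yaml` (a placeholder name), a merge like this one can be reproduced with mergekit's CLI, e.g. `mergekit-yaml config.yaml ./output-model-directory`; the output directory is likewise a placeholder.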