---
base_model:
- appvoid/arco
- h2oai/h2o-danube3-500m-base
library_name: transformers
tags:
- mergekit
- merge
---
# arco+

This is an untrained passthrough merge of arco and danube. It is a first step toward training a reasoning language model that is small yet generalizes across all kinds of reasoning tasks.

#### Benchmarks

| Parameters | Model     | MMLU      | ARC       | HellaSwag | PIQA      | Winogrande | Average   |
| ---------- | --------- | --------- | --------- | --------- | --------- | ---------- | --------- |
| 488m       | arco-lite | **23.22** | 33.45     | 56.55     | 69.70     | 59.19      | 48.46     |
| 773m       | arco-plus | 23.06     | **36.43** | **60.09** | **72.36** | **60.46**  | **50.48** |

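As a quick sanity check on the table, the sketch below recomputes the Average column. It assumes the average is the unweighted mean of the five benchmark scores, which the card does not state explicitly; `SCORES` and `mean_score` are names introduced here for illustration. Under that assumption the result matches the listed 50.48 for arco-plus and comes out at 48.42 for arco-lite, slightly below the listed 48.46.

```python
# Scores copied from the benchmark table, in column order:
# MMLU, ARC, HellaSwag, PIQA, Winogrande.
SCORES = {
    "arco-lite": [23.22, 33.45, 56.55, 69.70, 59.19],
    "arco-plus": [23.06, 36.43, 60.09, 72.36, 60.46],
}


def mean_score(values):
    """Unweighted mean rounded to two decimals (assumed averaging rule)."""
    return round(sum(values) / len(values), 2)


averages = {name: mean_score(vals) for name, vals in SCORES.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```
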
#### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
    - model: appvoid/arco
      layer_range: [0, 14]
  - sources:
    - model: h2oai/h2o-danube3-500m-base
      layer_range: [4, 16]

merge_method: passthrough
dtype: float16
```
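
To make the layer layout of this configuration concrete, here is a minimal sketch that lists which source layer ends up at each position of the merged stack. It assumes `layer_range` is half-open (start included, end excluded), so the merge takes 14 layers from arco and 12 from danube. `SLICES` and `stacked_layers` are hypothetical names used only for this illustration and are not part of mergekit.

```python
# Slice definitions copied from the YAML configuration.
SLICES = [
    {"model": "appvoid/arco", "layer_range": (0, 14)},
    {"model": "h2oai/h2o-danube3-500m-base", "layer_range": (4, 16)},
]


def stacked_layers(slices):
    """Return (source model, source layer index) for each merged layer.

    Assumption: layer_range is half-open, i.e. [start, end).
    """
    layout = []
    for s in slices:
        start, end = s["layer_range"]
        layout.extend((s["model"], i) for i in range(start, end))
    return layout


layout = stacked_layers(SLICES)
print(len(layout))  # total number of stacked layers: 26
print(layout[13])   # last layer taken from arco
print(layout[14])   # first layer taken from danube (source layer 4)
```

A passthrough merge copies these layers unchanged rather than averaging weights, which is consistent with the merged model being larger (773m) than the 488m arco-lite row in the table.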