---
tags:
- model-merge
- hermite-interpolation
- deepseek
base_model:
- deepseek-ai/deepseek-math-7b-instruct
- deepseek-ai/deepseek-coder-7b-instruct-v1.5
---

# deepseek-7b-math-code-lambda075

A merged model produced by linear interpolation of the parameters of two models.

## Merge Configuration

| Parameter | Value |
|-----------|-------|
| Model A | `deepseek-ai/deepseek-math-7b-instruct` |
| Model B | `deepseek-ai/deepseek-coder-7b-instruct-v1.5` |
| λ_a | 0.75 |
| λ_b | 0.25 |
| Formula | θ* = 0.75 × θ_a + 0.25 × θ_b |
| dtype | torch.float16 |
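
The exact merge script is not published; the following is a minimal sketch of the configuration above using `transformers` and `torch`. It assumes both checkpoints expose identically shaped tensors (the embedding rows for the union vocabulary are handled separately, see the Tokenizer section below), and the output path is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM

LAMBDA_A = 0.75  # λ_a from the table above
LAMBDA_B = 0.25  # λ_b from the table above

model_a = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-math-7b-instruct", torch_dtype=torch.float16
)
model_b = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-7b-instruct-v1.5", torch_dtype=torch.float16
)

state_a = model_a.state_dict()
state_b = model_b.state_dict()

# θ* = 0.75 × θ_a + 0.25 × θ_b, computed per tensor.
# Interpolate in float32 for precision, then cast back to float16.
# Assumes matching tensor shapes; vocabulary-dependent tensors
# (embeddings, lm_head) need the union-vocab handling described below.
merged = {
    name: (LAMBDA_A * state_a[name].float() + LAMBDA_B * state_b[name].float()).to(torch.float16)
    for name in state_a
}

model_a.load_state_dict(merged)
model_a.save_pretrained("deepseek-7b-math-code-lambda075")
```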

## Tokenizer

Union tokenizer (mergekit-style): vocabularies of both models are merged.
- Union vocab size: 100016
- Tokens added from Model B: 14
- Tokens only in Model A: 0

For tokens missing from one model's vocabulary, the other model's embedding row is
used as a fallback, so the linear interpolation reduces to a direct copy for those tokens.
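
A sketch of this fallback rule, assuming token-to-row-id mappings for both vocabularies and matching embedding dimensions; the function and argument names are illustrative, not the card author's code.

```python
import torch

def merge_embeddings(emb_a, emb_b, union_vocab, vocab_a, vocab_b,
                     lambda_a=0.75, lambda_b=0.25):
    """Interpolate embedding rows over the union vocabulary.

    emb_a / emb_b: (vocab_size, dim) tensors.
    union_vocab / vocab_a / vocab_b: dicts mapping token -> row id.
    """
    dim = emb_a.shape[1]
    merged = torch.empty(len(union_vocab), dim, dtype=torch.float32)
    for token, row in union_vocab.items():
        a = emb_a[vocab_a[token]].float() if token in vocab_a else None
        b = emb_b[vocab_b[token]].float() if token in vocab_b else None
        if a is None:
            a = b  # fallback: token exists only in Model B
        if b is None:
            b = a  # fallback: token exists only in Model A
        # For fallback tokens a == b, so this reduces to a copy.
        merged[row] = lambda_a * a + lambda_b * b
    return merged.to(torch.float16)
```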

## Description

This model was created by linearly interpolating the parameters of two models:
- **Model A** (`deepseek-ai/deepseek-math-7b-instruct`): weight = 0.75
- **Model B** (`deepseek-ai/deepseek-coder-7b-instruct-v1.5`): weight = 0.25
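
Assuming the merged weights and union tokenizer are published under the model name above (the repo id here is illustrative), a standard `transformers` load would look like:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Substitute the actual hub path of this model.
repo = "deepseek-7b-math-code-lambda075"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16)

prompt = "Write a Python function that computes the nth Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```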