---
tags:
- model-merge
- linear-interpolation
- deepseek
base_model:
- deepseek-ai/deepseek-math-7b-instruct
- deepseek-ai/deepseek-coder-7b-instruct-v1.5
---

# deepseek-7b-math-code-lambda025

A merge of two models produced by linear interpolation of their parameters.

## Merge Configuration

| Parameter | Value |
|-----------|-------|
| Model A | `deepseek-ai/deepseek-math-7b-instruct` |
| Model B | `deepseek-ai/deepseek-coder-7b-instruct-v1.5` |
| λ_a | 0.25 |
| λ_b | 0.75 |
| Formula | θ* = 0.25 × θ_a + 0.75 × θ_b |
| dtype | torch.float16 |

## Tokenizer

Union tokenizer (mergekit-style): the vocabularies of both models are merged.

- Union vocab size: 100016
- Tokens added from Model B: 14
- Tokens only in Model A: 0

For tokens missing from one model, the other model's embedding row is used as a fallback before the linear interpolation is applied (see the alignment sketch at the end of this card).

## Description

This model was created by linearly interpolating the parameters of two models:

- **Model A** (`deepseek-ai/deepseek-math-7b-instruct`): weight = 0.25
- **Model B** (`deepseek-ai/deepseek-coder-7b-instruct-v1.5`): weight = 0.75

A minimal reference sketch of the merge procedure follows below.
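
## Merge Sketch (Illustrative)

The snippet below is a minimal sketch of the λ-weighted linear interpolation described above, not the exact script used to produce this checkpoint. It assumes both models already share the same parameter names and shapes (i.e., the embedding matrices have already been aligned to the union vocabulary as described in the Tokenizer section); the function and directory names are hypothetical.

```python
# Minimal sketch: θ* = λ_a · θ_a + λ_b · θ_b over all shared parameters.
# Assumes both checkpoints have identical parameter names and shapes.
import torch
from transformers import AutoModelForCausalLM

LAMBDA_A = 0.25  # deepseek-ai/deepseek-math-7b-instruct
LAMBDA_B = 0.75  # deepseek-ai/deepseek-coder-7b-instruct-v1.5


def merge_linear(model_a_id: str, model_b_id: str, out_dir: str) -> None:
    model_a = AutoModelForCausalLM.from_pretrained(model_a_id, torch_dtype=torch.float16)
    model_b = AutoModelForCausalLM.from_pretrained(model_b_id, torch_dtype=torch.float16)

    state_a = model_a.state_dict()
    state_b = model_b.state_dict()

    merged = {}
    for name, theta_a in state_a.items():
        theta_b = state_b[name]
        # Interpolate in float32 to limit rounding error, then cast back to
        # float16 to match the dtype listed in the configuration table.
        merged[name] = (LAMBDA_A * theta_a.float() + LAMBDA_B * theta_b.float()).to(torch.float16)

    model_b.load_state_dict(merged)
    model_b.save_pretrained(out_dir)


merge_linear(
    "deepseek-ai/deepseek-math-7b-instruct",
    "deepseek-ai/deepseek-coder-7b-instruct-v1.5",
    "deepseek-7b-math-code-lambda025",
)
```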
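
## Tokenizer Alignment Sketch (Illustrative)

This sketch illustrates the fallback rule stated in the Tokenizer section: for a token present in only one vocabulary, that model's embedding row is copied into both matrices, so the subsequent λ-interpolation simply reproduces the available embedding. mergekit's actual union-tokenizer implementation differs in detail (for example, it preserves the base model's token ordering rather than sorting), and the helper name here is hypothetical.

```python
# Illustrative alignment of two embedding matrices to a union vocabulary.
import torch


def align_embeddings(emb_a, emb_b, vocab_a, vocab_b):
    """Return (A, B) embedding matrices indexed by the union vocabulary.

    emb_a, emb_b: [vocab_size, hidden] tensors
    vocab_a, vocab_b: dicts mapping token string -> row index in each model
    """
    # Sorted order is used only to make this sketch deterministic; a real
    # union tokenizer keeps the base ordering and appends the new tokens.
    union_tokens = sorted(set(vocab_a) | set(vocab_b))
    hidden = emb_a.shape[1]
    out_a = torch.empty(len(union_tokens), hidden, dtype=emb_a.dtype)
    out_b = torch.empty(len(union_tokens), hidden, dtype=emb_b.dtype)

    for row, tok in enumerate(union_tokens):
        in_a, in_b = tok in vocab_a, tok in vocab_b
        # Fallback rule: a token missing from one model borrows the other
        # model's embedding row before interpolation.
        a_row = emb_a[vocab_a[tok]] if in_a else emb_b[vocab_b[tok]]
        b_row = emb_b[vocab_b[tok]] if in_b else emb_a[vocab_a[tok]]
        out_a[row], out_b[row] = a_row, b_row

    return out_a, out_b
```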