---
tags:
  - model-merge
  - hermite-interpolation
  - deepseek
base_model:
  - deepseek-ai/deepseek-math-7b-instruct
  - deepseek-ai/deepseek-coder-7b-instruct-v1.5
---

# deepseek-7b-math-code-lambda000

A merged model created by linear interpolation of two models' parameters.

## Merge Configuration

| Parameter | Value |
| --- | --- |
| Model A | deepseek-ai/deepseek-math-7b-instruct |
| Model B | deepseek-ai/deepseek-coder-7b-instruct-v1.5 |
| λ_a | 0.00 |
| λ_b | 1.00 |
| Formula | θ* = 0.00 × θ_a + 1.00 × θ_b |
| dtype | torch.float16 |
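
A minimal sketch of this kind of linear interpolation in PyTorch, using the λ values and model ids from the table above. This is an illustration under stated assumptions, not the exact script used to build this repository.

```python
import torch
from transformers import AutoModelForCausalLM

LAMBDA_A, LAMBDA_B = 0.00, 1.00  # weights from the table above

model_a = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-math-7b-instruct", torch_dtype=torch.float16
)
model_b = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-7b-instruct-v1.5", torch_dtype=torch.float16
)

state_a = model_a.state_dict()
merged = model_b.state_dict()
for name, theta_b in merged.items():
    # θ* = λ_a × θ_a + λ_b × θ_b, applied tensor by tensor.
    # Caveat: the embedding and LM-head matrices differ in vocab size
    # between the two models, so they need the union-vocabulary alignment
    # described in the Tokenizer section; this loop assumes same-shaped
    # tensors.
    merged[name] = LAMBDA_A * state_a[name] + LAMBDA_B * theta_b

model_b.load_state_dict(merged)
model_b.save_pretrained("deepseek-7b-math-code-lambda000")
```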

## Tokenizer

Union tokenizer (mergekit-style): the vocabularies of both models are merged.

- Union vocab size: 100016
- Tokens added from Model B: 14
- Tokens only in Model A: 0

For tokens missing from one model's vocabulary, the other model's embedding row is used as a fallback before the linear interpolation is applied, as sketched below.
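
A sketch of that fallback rule for a single embedding matrix. The function name and the token-to-index mappings (`vocab_a`, `vocab_b`, `union_vocab`) are illustrative assumptions, not code from this repository.

```python
import torch

def union_embeddings(emb_a, emb_b, vocab_a, vocab_b, union_vocab,
                     lambda_a=0.00, lambda_b=1.00):
    """Build an embedding matrix over the union vocabulary.

    emb_a / emb_b: (vocab_size, hidden) tensors from each model.
    vocab_a / vocab_b: dicts mapping token string -> row index per model.
    union_vocab: dict mapping token string -> row index in the merged matrix.
    """
    hidden = emb_a.shape[1]
    out = torch.empty(len(union_vocab), hidden, dtype=emb_a.dtype)
    for token, idx in union_vocab.items():
        row_a = emb_a[vocab_a[token]] if token in vocab_a else None
        row_b = emb_b[vocab_b[token]] if token in vocab_b else None
        # Fallback: a token missing from one model borrows the other
        # model's row, so the interpolation below is always well defined.
        if row_a is None:
            row_a = row_b
        if row_b is None:
            row_b = row_a
        out[idx] = lambda_a * row_a + lambda_b * row_b
    return out
```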

## Description

This model was created by linearly interpolating the parameters of two models:

- Model A (deepseek-ai/deepseek-math-7b-instruct): weight = 0.00
- Model B (deepseek-ai/deepseek-coder-7b-instruct-v1.5): weight = 1.00
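
The merged model can be loaded like any other causal LM with transformers. The repo id below is assumed from the model name above and may differ from the actual one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "lejelly/deepseek-7b-math-code-lambda000"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16)

prompt = "Write a Python function that returns the nth Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```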