k-l-lambda committed on
Commit 1458b6c · verified · 1 Parent(s): e3638a0

Updated README

Files changed (1)
  1. README.md +0 -11
README.md CHANGED
@@ -21,17 +21,11 @@ K2.7-Code data. Pairs with the Kimi-K2.7-Code verifier under vLLM speculative de
  - **Algorithm:** EAGLE-3 with MLA (multi-head latent attention), single draft decoder layer.
  - **Verifier:** `Kimi-K2.7-Code` (DeepSeek-V3-class architecture; arch is identical across
  K2.5 / K2.6 / K2.7). The draft reuses the verifier's frozen embedding / lm_head / norm.
- - **Init:** lightseek K2.6 Eagle3-MLA export, then fine-tuned on K2.7-native data.
  - **Training data:** real K2.7-Code serving traffic (agentic / coding / tool, oversampled 5x)
  mixed with kimi-mtp prompts re-answered by K2.7-Code.
  - **Recipe:** ttt_steps=4, ttt_step_loss_decay=1.0, off-policy tokens, l2sp_lambda=1e-4,
  cosine LR 2e-5, seq_length 8192, max_steps 120000.
 
- ## Why K2.7-native
-
- A K2.6-teacher draft over-fit the K2.6 distribution and lost to the lightseek init on real
- K2.7-Code traffic. Training on K2.7-native data reverses that.
-
  ## Evaluation
 
  Final checkpoint, speculative-decoding eval against the Kimi-K2.7-Code verifier
@@ -40,11 +34,6 @@ Final checkpoint, speculative-decoding eval against the Kimi-K2.7-Code verifier
  | Draft | Real K2.7-Code traffic | K2.6-distribution held-out |
  |---|---|---|
  | **This model (final)** | **2.345** | 2.246 |
- | lightseek K2.6 init | 2.332 | 2.297 |
-
- On **real K2.7-Code traffic** this draft beats the lightseek init (2.345 vs 2.332, ~1.36x
- end-to-end speedup over no-spec). On the K2.6 distribution the lightseek init still leads,
- as expected — this draft is tuned for K2.7.
 
  ## Usage (vLLM)
 
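
Note on the Recipe line kept by this commit: the two scalar settings `cosine LR 2e-5` / `max_steps 120000` and `l2sp_lambda=1e-4` can be read as in the sketch below. This is only an illustration, not the training code. It assumes the cosine schedule has no warmup and decays to zero, and that the L2-SP term is `lambda * sum((w - w_ref)^2)` with no 1/2 factor; the README states neither. The function names `cosine_lr` and `l2sp_penalty` are hypothetical, and the reference weights are whatever checkpoint training started from, which the post-commit README no longer names.

```python
import math

# Values quoted from the Recipe line in the diff above.
PEAK_LR = 2e-5
MAX_STEPS = 120_000
L2SP_LAMBDA = 1e-4


def cosine_lr(step, peak_lr=PEAK_LR, max_steps=MAX_STEPS, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr.

    Assumption: no warmup phase and a floor of 0.0; the README does not say.
    """
    # Clamp the step so the schedule is flat outside [0, max_steps].
    progress = min(max(step, 0), max_steps) / max_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def l2sp_penalty(params, ref_params, lam=L2SP_LAMBDA):
    """L2-SP style penalty: lam * sum of squared distances to reference weights.

    Assumption: plain lam scaling (no 1/2 factor); params are flat lists of floats.
    """
    return lam * sum((w - w0) ** 2 for w, w0 in zip(params, ref_params))


if __name__ == "__main__":
    for step in (0, 30_000, 60_000, 90_000, 120_000):
        print(step, cosine_lr(step))
    print(l2sp_penalty([1.0, 2.0], [1.0, 1.0]))
```

Under these assumptions the learning rate is at its peak at step 0, half the peak at the midpoint, and zero at `max_steps`, while the penalty is zero whenever the weights equal the reference weights.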