datasysdev committed
Commit f43f277 · verified · 1 Parent(s): c1c9ee7

Update logs/compare_all32_step1000.log

Files changed (1)
  1. logs/compare_all32_step1000.log +21 -0
logs/compare_all32_step1000.log ADDED
@@ -0,0 +1,21 @@
+ Loading Qwen/Qwen3-4B-Instruct-2507 ...
+
+ Loaded ckpt step 1000 for layers [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
+ batch 1/2 done
+
+ ========================================================================
+ mass@K — fraction of teacher attention captured by retrieval set
+ raw_qk : exact top-K over head-mean-aggregated post-RoPE Q,K
+ learned: exact top-K over trained search projections (d=128)
+ ========================================================================
+
+   K method    L03   L04   L05   L06   L07   L08   L09   L10   L11   L12   L13   L14   L15   L16   L17   L18   L19   L20   L21   L22   L23   L24   L25   L26   L27   L28   L29   L30   L31   L32   L33   L34   avg
+ 128 raw_qk  0.939 0.944 0.964 0.956 0.982 0.971 0.959 0.974 0.976 0.961 0.971 0.973 0.968 0.956 0.959 0.965 0.961 0.959 0.966 0.963 0.979 0.971 0.986 0.978 0.978 0.979 0.982 0.988 0.984 0.979 0.977 0.976 0.969
+ 128 learned 0.924 0.937 0.948 0.939 0.983 0.971 0.977 0.971 0.976 0.970 0.971 0.973 0.973 0.961 0.967 0.972 0.969 0.976 0.980 0.970 0.985 0.979 0.989 0.986 0.983 0.985 0.987 0.987 0.983 0.980 0.967 0.960 0.971
+
+ 256 raw_qk  0.986 0.986 0.993 0.990 0.996 0.994 0.992 0.995 0.996 0.991 0.995 0.996 0.995 0.992 0.993 0.995 0.993 0.993 0.994 0.993 0.996 0.994 0.997 0.996 0.995 0.995 0.997 0.998 0.997 0.995 0.995 0.995 0.994
+ 256 learned 0.977 0.982 0.986 0.981 0.996 0.993 0.995 0.992 0.995 0.993 0.993 0.994 0.995 0.991 0.994 0.995 0.994 0.996 0.997 0.994 0.997 0.996 0.998 0.997 0.997 0.997 0.997 0.997 0.996 0.995 0.991 0.990 0.993
+
+ Learned vs raw mass@K=128: 0.971 / 0.969 = 1.00×
+
+ Wrote /tmp/checkpoints_all32_d128_block_reserve_0_1_2_35/search_step_1000.compare_retrieval.json
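
The log defines mass@K only as the "fraction of teacher attention captured by retrieval set" and does not show how it is computed. The sketch below is one plausible reading for a single query, not the script that produced this log: rank the keys by some retrieval score, keep the exact top K, and sum the teacher's attention weights over the kept keys. The function names (`softmax`, `top_k_indices`, `mass_at_k`) and the toy scores are hypothetical, and any averaging over heads, queries, or batches is left out because the log does not describe it. The second half recomputes the K=128 summary line from the per-layer values printed in the log.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_indices(scores, k):
    """Exact top-k: indices of the k largest scores, ties broken by lower index."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]

def mass_at_k(teacher_attn, retrieval_scores, k):
    """Share of teacher attention mass on the k keys ranked highest by retrieval_scores."""
    kept = top_k_indices(retrieval_scores, k)
    return sum(teacher_attn[i] for i in kept) / sum(teacher_attn)

# Toy single-query example; every number here is made up for illustration.
teacher_logits = [4.0, 0.5, 3.0, -1.0, 2.0, 0.0]
teacher_attn = softmax(teacher_logits)
# Ranking by the teacher's own logits gives the best achievable mass for this K.
oracle_mass = mass_at_k(teacher_attn, teacher_logits, k=2)
# A hypothetical cheaper scorer that ranks the keys differently.
proxy_scores = [0.1, 3.0, 2.5, 0.0, 0.2, 0.3]
proxy_mass = mass_at_k(teacher_attn, proxy_scores, k=2)

# Per-layer mass@128 values copied from the log (layers L03..L34).
RAW_QK_128 = [
    0.939, 0.944, 0.964, 0.956, 0.982, 0.971, 0.959, 0.974,
    0.976, 0.961, 0.971, 0.973, 0.968, 0.956, 0.959, 0.965,
    0.961, 0.959, 0.966, 0.963, 0.979, 0.971, 0.986, 0.978,
    0.978, 0.979, 0.982, 0.988, 0.984, 0.979, 0.977, 0.976,
]
LEARNED_128 = [
    0.924, 0.937, 0.948, 0.939, 0.983, 0.971, 0.977, 0.971,
    0.976, 0.970, 0.971, 0.973, 0.973, 0.961, 0.967, 0.972,
    0.969, 0.976, 0.980, 0.970, 0.985, 0.979, 0.989, 0.986,
    0.983, 0.985, 0.987, 0.987, 0.983, 0.980, 0.967, 0.960,
]

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

raw_avg = mean(RAW_QK_128)
learned_avg = mean(LEARNED_128)
ratio = learned_avg / raw_avg  # the quantity reported as "Learned vs raw mass@K=128"

print(f"oracle={oracle_mass:.3f} proxy={proxy_mass:.3f}")
print(f"raw_avg={raw_avg:.4f} learned_avg={learned_avg:.4f} ratio={ratio:.2f}x")
```

Because the per-layer values in the log are already rounded to three decimals, averages recomputed this way can differ from the `avg` column in the last digit. The ratio still agrees with the reported 1.00× at two decimals.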