init Style_Masks repo
README.md
ADDED
@@ -0,0 +1,16 @@
---
tags: [gemma-4, 31b, style-mask, fisher, j-line, merge]
---
# Esobold Style_Masks

Per-weight **contrastive style-Fisher ratios** for style-aware 31B merges (J-line / target-side style mask).

Each subfolder holds a `style_ratio.pt` (bf16; target LM linear keys only: `q/k/v/o_proj` plus MLP
`gate/up/down` in `language_model.layers.*`) and a `ratio_layer_summary.json`.
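
The key selection described above can be sketched as a name filter. This is an illustrative assumption, not the pipeline's actual code: the `self_attn` / `mlp` module names, the `_proj` suffix on gate/up/down, and the `.weight` suffix follow common Hugging Face naming and are not stated in this README.

```python
import re

# Hypothetical pattern: matches attention q/k/v/o projections and MLP
# gate/up/down projections under language_model.layers.<N>.
# Module names and the ".weight" suffix are assumptions.
TARGET_KEY = re.compile(
    r"language_model\.layers\.\d+\."
    r"(self_attn\.(q|k|v|o)_proj|mlp\.(gate|up|down)_proj)\.weight$"
)


def is_target_key(name: str) -> bool:
    """Return True if a parameter name looks like a target LM linear weight."""
    return TARGET_KEY.search(name) is not None


def filter_target_keys(names):
    """Keep only the parameter names that a style mask would cover."""
    return [n for n in names if is_target_key(n)]
```

Under these assumptions, norm weights, embeddings, and any non-language-model tower are excluded, which matches the "target LM linear keys only" description.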

`r_w = F_A / (F_A + F_B)`, where `F_M = relu(Fisher(M under S+ style prompt) - Fisher(M under S-))`
over matched generations. `r_w > 0.5` means the weight carries more good-style signal in model A;
`r_w` is set to a neutral 0.5 below a combined-Fisher floor. Consumed by `04_merge_style.py` (trim-then-weight TIES).
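
As a rough illustration of the ratio definition above, here is a minimal pure-Python sketch over flat lists of per-weight Fisher values. The function name, the argument layout, the default `floor` value, and the reading of "combined Fisher" as `F_A + F_B` are all assumptions made for the example; the real pipeline presumably works on tensors.

```python
def relu(x: float) -> float:
    """Clamp negative values to zero."""
    return x if x > 0.0 else 0.0


def style_ratio(fa_pos, fa_neg, fb_pos, fb_neg, floor=1e-12):
    """Per-weight r_w = F_A / (F_A + F_B).

    fa_pos / fa_neg: Fisher values of model A under the S+ / S- prompts.
    fb_pos / fb_neg: the same for model B.
    floor: hypothetical threshold on F_A + F_B; at or below it the
           ratio falls back to the neutral value 0.5.
    """
    ratios = []
    for ap, an, bp, bn in zip(fa_pos, fa_neg, fb_pos, fb_neg):
        f_a = relu(ap - an)  # contrastive style Fisher for model A
        f_b = relu(bp - bn)  # contrastive style Fisher for model B
        total = f_a + f_b
        ratios.append(0.5 if total <= floor else f_a / total)
    return ratios
```

A weight whose contrastive Fisher is positive only in model B gets a ratio of 0, one that is positive only in A gets 1, and a weight with no contrastive signal in either model stays at 0.5.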

Pipeline: `instruct_mask/style_mask_j1/`. Public so multiple ~35 GB masks fit without private-storage limits.
Initialized 2026-05-29T01:01:19Z.