DuoNeural committed on
Commit 7a3607b · verified · 1 Parent(s): 02ade1c

Fix model card: add YAML frontmatter, prominent GGUF download section

Files changed (1)
  1. README.md +69 -53
README.md CHANGED
@@ -1,3 +1,18 @@
  # GhostShell-4B

 > **⚠️ EARLY RELEASE — UNTESTED IN PRODUCTION**
@@ -11,60 +26,23 @@ The goal: take a capable 4B multimodal foundation, surgically remove its refusal

 ---

- ## What Was Done
-
- ### Step 1: Custom SVD Abliteration
-
- We wrote a custom abliteration script (`ghostshell_abliterate_v2.py`) from scratch, as existing tools (heretic, etc.) are incompatible with Gemma 4's architecture and transformers 5.x requirements.
-
- **Method:**
- - Loaded model in BF16, accessed the nested `text_config` (Gemma 4 is multimodal — the text tower is inside a wrapper)
- - Collected activations from the middle 60% of layers using 32 harmful/refusal prompts vs. 32 benign prompts
- - Computed per-layer refusal direction via SVD on the activation difference matrix: `r = top_singular_vector(mean(harmful) - mean(benign))`
- - Projected out the refusal direction from weight matrices:
- - Input projections (q_proj, k_proj, v_proj, up_proj, gate_proj): `W -= outer(W @ r, r)`
- - Output projections (o_proj, down_proj): `W -= outer(r, r @ W)`
- - **157 matrices modified** across 42 text transformer layers
- - Sanity check passed on SQL injection, jailbreak, and explicit content prompts
-
- ### Step 2: QLoRA SFT (PEFT + BitsAndBytes)

- Fine-tuned the abliterated model on a custom dataset using standard PEFT LoRA — no unsloth (Gemma 4 is not yet compatible).

- **Key technical challenges solved:**
- - `Gemma4ClippableLinear` wraps every `nn.Linear` — required custom unwrapping before LoRA injection (232 wrapper layers replaced)
- - Loaded in BF16 directly (4-bit load + PEFT fails with the wrapper architecture)
- - Tokenizer patches for Gemma 4's non-standard `extra_special_tokens` format
- - Sequence length capped at 512 (vocab_size=262,144 makes the logit tensor enormous at longer seqs)

- **Training config:**
- - Base: `/workspace/ghostshell-abliterated` (abliterated weights)
- - LoRA rank=32, alpha=64, lr=8e-5
- - 2 epochs over custom dataset, 3000 steps
- - Hardware: RTX 4090 (24GB), ~2 hours
-
- ### Step 3: LoRA Merge + Export
-
- LoRA adapter merged into BF16 weights via `merge_and_unload()`. Exported as sharded safetensors + GGUF quantizations.

 ---

- ## Files in This Repo
-
- | File | Size | Description |
- |------|------|-------------|
- | `model-0000X-of-00004.safetensors` | ~15GB | Merged BF16 weights (full precision) |
- | `ghostshell-4b-Q4_K_M.gguf` | ~5.0GB | Q4_K_M quantization — recommended for most use |
- | `ghostshell-4b-Q8_0.gguf` | ~7.5GB | Q8_0 quantization — near-lossless, for power users |
-
- **Recommended**: `ghostshell-4b-Q4_K_M.gguf` for llama.cpp, Ollama, LM Studio, or any GGUF-compatible runtime.
-
- > **Note on file sizes**: These GGUFs are larger than a typical 4B model because Gemma 4 uses a 262,144-token vocabulary. The embedding/output weight tensors (which stay in higher precision) account for ~2–3GB of the total. The transformer layers themselves are fully quantized. Expect ~6–8GB VRAM for Q4_K_M, ~10–12GB for Q8_0.
-
- ---
-
- ## Usage (GGUF / llama.cpp)

 ```bash
 # basic
 llama-cli -m ghostshell-4b-Q4_K_M.gguf -p "Your prompt here" -n 512
@@ -107,7 +85,45 @@ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

 ---

- ## Base Model

  - **Architecture**: Gemma 4 (multimodal, text+vision), `Gemma4ForConditionalGeneration`
  - **Text layers**: 42 transformer blocks
@@ -127,16 +143,16 @@ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
 - Coherent multi-turn conversation

 **Unknown / untested:**
- - Long-context behavior (we trained at seq_len=512)
 - Vision capabilities (abliteration targeted text layers; vision encoder untouched but SFT was text-only)
 - Benchmark performance vs. base model
- - Edge cases, hallucination rate, factual accuracy at this fine-tune stage
 - Behavior under adversarial prompts

 **May do weird things:**
 - This is a lab model from a small team with a custom dataset
 - The abliteration is aggressive (157 matrices) — some coherence degradation is expected on edge cases
- - We haven't done RLHF or DPO — just SFT

 ---

@@ -156,8 +172,8 @@ DuoNeural is a small AI research lab focused on post-training, abliteration, and

 Current projects:
 - **GhostShell-4B** (this model) — abliterated + SFT Gemma 4
- - **Nano-CTM** — 32M parameter ternary Continuous Thought Machine (first of its kind)
- - **BitDelta-R1** — from-scratch 100M param BitNet b1.58 + Gated DeltaNet reasoning model

 HuggingFace: [DuoNeural](https://huggingface.co/DuoNeural)

+ ---
+ language:
+ - en
+ license: gemma
+ base_model: google/gemma-4-e4b-it
+ tags:
+ - abliteration
+ - uncensored
+ - gemma
+ - gemma-4
+ - text-generation
+ - gguf
+ pipeline_tag: text-generation
+ ---
+
 # GhostShell-4B

 > **⚠️ EARLY RELEASE — UNTESTED IN PRODUCTION**

 ---

+ ## Downloads

+ Three formats available — pick the one that fits your setup:

+ | File | Size | Format | Use When |
+ |------|------|--------|----------|
+ | `ghostshell-4b-Q4_K_M.gguf` | **5.0 GB** | GGUF Q4_K_M | llama.cpp / Ollama / LM Studio — **recommended** |
+ | `ghostshell-4b-Q8_0.gguf` | **7.5 GB** | GGUF Q8_0 | Near-lossless inference, 12 GB+ VRAM |
+ | `model-0000*.safetensors` (×4) | **~15 GB** | BF16 safetensors | Fine-tuning, transformers inference, merges |

+ > **Note on file sizes**: These GGUFs are larger than a typical 4B model because Gemma 4 uses a 262,144-token vocabulary. The embedding/output tensors stay in higher precision and account for ~2–3 GB of the total size. The transformer layers themselves are fully quantized. Expect ~6–8 GB VRAM for Q4_K_M, ~10–12 GB for Q8_0.

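If you only want one of the GGUFs, `hf_hub_download` from `huggingface_hub` can fetch a single file. A minimal sketch; the `repo_id` below is an assumption based on the org and model name, so substitute the actual repo path:

```python
# Minimal sketch: fetch one quant without cloning the full repo.
# The repo_id is an assumption (org + model name from this card); adjust as needed.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="DuoNeural/GhostShell-4B",        # assumed repo id
    filename="ghostshell-4b-Q4_K_M.gguf",
)
print(gguf_path)  # local cache path to hand to llama.cpp / llama-cpp-python
```
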
 ---

+ ## Quick Start

+ **llama.cpp:**
 ```bash
 # basic
 llama-cli -m ghostshell-4b-Q4_K_M.gguf -p "Your prompt here" -n 512
 
 ---

+ ## What Was Done
+
+ ### Step 1: Custom SVD Abliteration
+
+ We wrote a custom abliteration script (`ghostshell_abliterate_v2.py`) from scratch, as existing tools (heretic, etc.) are incompatible with Gemma 4's architecture and transformers 5.x requirements.
+
+ **Method:**
+ - Loaded model in BF16, accessed the nested `text_config` (Gemma 4 is multimodal — the text tower is inside a wrapper)
+ - Collected activations from the middle 60% of layers using 32 harmful/refusal prompts vs. 32 benign prompts
+ - Computed per-layer refusal direction via SVD on the activation difference matrix: `r = top_singular_vector(mean(harmful) - mean(benign))`
+ - Projected out the refusal direction from weight matrices (see the sketch after this list):
+ - Input projections (q_proj, k_proj, v_proj, up_proj, gate_proj): `W -= outer(W @ r, r)`
+ - Output projections (o_proj, down_proj): `W -= outer(r, r @ W)`
+ - **157 matrices modified** across 42 text transformer layers
+ - Sanity check passed on SQL injection, jailbreak, and explicit content prompts
+
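To make the projection step concrete, here is a minimal sketch. It assumes a standard HF-style decoder layer (`self_attn`/`mlp` submodules) and a precomputed refusal direction; the names are illustrative, not the actual `ghostshell_abliterate_v2.py` internals.

```python
# Illustrative sketch of the per-layer projection described above.
# Module/attribute names assume a typical HF decoder layer layout.
import torch

@torch.no_grad()
def ablate_layer(layer, refusal_dir: torch.Tensor):
    r = refusal_dir / refusal_dir.norm()  # unit refusal direction, shape (hidden,)

    # Input projections read from the residual stream:
    # remove the refusal component from what they read, W -= outer(W @ r, r).
    for module, names in ((layer.self_attn, ("q_proj", "k_proj", "v_proj")),
                          (layer.mlp, ("up_proj", "gate_proj"))):
        for name in names:
            W = getattr(module, name).weight.data   # (out_features, hidden)
            W -= torch.outer(W @ r, r)

    # Output projections write back into the residual stream:
    # remove the refusal component from what they write, W -= outer(r, r @ W).
    for module, name in ((layer.self_attn, "o_proj"), (layer.mlp, "down_proj")):
        W = getattr(module, name).weight.data        # (hidden, in_features)
        W -= torch.outer(r, r @ W)

# Applied layer by layer over the middle ~60% of the text tower's blocks,
# each with its own per-layer refusal direction.
```
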
104
+ ### Step 2: QLoRA SFT (PEFT + BitsAndBytes)
105
+
106
+ Fine-tuned the abliterated model on a custom dataset using standard PEFT LoRA β€” no unsloth (Gemma 4 is not yet compatible).
107
+
108
+ **Key technical challenges solved:**
109
+ - `Gemma4ClippableLinear` wraps every `nn.Linear` β€” required custom unwrapping before LoRA injection (232 wrapper layers replaced)
110
+ - Loaded in BF16 directly (4-bit load + PEFT fails with the wrapper architecture)
111
+ - Tokenizer patches for Gemma 4's non-standard `extra_special_tokens` format
112
+ - Sequence length capped at 512 (vocab_size=262,144 makes logit tensor enormous at longer seqs)
113
+
114
+ **Training config:**
115
+ - Base: abliterated weights (step 1 output)
116
+ - LoRA rank=32, alpha=64, lr=8e-5
117
+ - 2 epochs over custom dataset, 3000 steps
118
+ - Hardware: RTX 4090 (24GB), ~2 hours
119
+
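A hedged sketch of the LoRA setup implied by this config. The model class name is taken from this card; the paths and target-module list are placeholders, and the `Gemma4ClippableLinear` unwrapping plus the training loop are omitted.

```python
# Sketch only: hyperparameters from the card (r=32, alpha=64, lr=8e-5 at training time).
# The model class name comes from the card; paths and target modules are assumptions.
import torch
from transformers import Gemma4ForConditionalGeneration
from peft import LoraConfig, get_peft_model

model = Gemma4ForConditionalGeneration.from_pretrained(
    "path/to/ghostshell-abliterated",   # placeholder: step 1 output dir
    torch_dtype=torch.bfloat16,          # BF16 load; 4-bit + PEFT fails with the wrappers
)
# ... custom step (not shown): unwrap Gemma4ClippableLinear back to nn.Linear ...

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "up_proj", "gate_proj", "down_proj"],   # assumed target set
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```
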
+ ### Step 3: LoRA Merge + Export
+
+ LoRA adapter merged into BF16 weights via `merge_and_unload()`. Exported as sharded safetensors + GGUF quantizations via llama.cpp.
+
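A sketch of the merge itself, with placeholder paths; the GGUF conversion happens separately with llama.cpp's tooling.

```python
# Sketch of the merge/export step; all paths are placeholders.
import torch
from transformers import Gemma4ForConditionalGeneration   # class name from the card
from peft import PeftModel

base = Gemma4ForConditionalGeneration.from_pretrained(
    "path/to/ghostshell-abliterated", torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

merged = model.merge_and_unload()   # fold the LoRA deltas into the BF16 base weights
merged.save_pretrained("ghostshell-4b-merged", safe_serialization=True)  # sharded safetensors

# GGUF quants are then typically produced with llama.cpp's
# convert_hf_to_gguf.py followed by llama-quantize (Q4_K_M / Q8_0).
```
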
+ ---
+
+ ## Model Info

 - **Architecture**: Gemma 4 (multimodal, text+vision), `Gemma4ForConditionalGeneration`
 - **Text layers**: 42 transformer blocks

 - Coherent multi-turn conversation

 **Unknown / untested:**
+ - Long-context behavior (trained at seq_len=512)
 - Vision capabilities (abliteration targeted text layers; vision encoder untouched but SFT was text-only)
 - Benchmark performance vs. base model
+ - Edge cases, hallucination rate, factual accuracy
 - Behavior under adversarial prompts

 **May do weird things:**
 - This is a lab model from a small team with a custom dataset
 - The abliteration is aggressive (157 matrices) — some coherence degradation is expected on edge cases
+ - No RLHF or DPO — just SFT

 ---

 Current projects:
 - **GhostShell-4B** (this model) — abliterated + SFT Gemma 4
+ - **Nano-CTM** — 32M parameter ternary Continuous Thought Machine (first of its kind at this scale)
+ - **BitDelta-R1** — from-scratch BitNet b1.58 + Gated DeltaNet reasoning model

 HuggingFace: [DuoNeural](https://huggingface.co/DuoNeural)