+# GhostShell-4B
+> **⚠️ EARLY RELEASE — UNTESTED IN PRODUCTION**
+> This model has been freshly trained and uploaded directly from our lab. We have not yet run comprehensive evals, red-teaming, or extended inference testing. Behavior may be unexpected, inconsistent, or incomplete. Use experimentally, not in anything that matters. We'll update this card as we test. You've been warned — go wild.
+---
+**GhostShell-4B** is an abliterated and instruction-tuned variant of [google/gemma-4-e4b-it](https://huggingface.co/google/gemma-4-e4b-it), built by [DuoNeural](https://huggingface.co/DuoNeural) as part of our open post-training research lab.
+The goal: take a capable 4B multimodal foundation, surgically remove its refusal behavior via SVD-based abliteration, then fine-tune it back toward helpfulness using a custom dataset — producing a model that is unconstrained but still coherent and useful.
+---
+## What Was Done
+### Step 1: Custom SVD Abliteration
+We wrote a custom abliteration script (`ghostshell_abliterate_v2.py`) from scratch, as existing tools (heretic, etc.) are incompatible with Gemma 4's architecture and transformers 5.x requirements.
+**Method:**
+- Loaded model in BF16, accessed the nested `text_config` (Gemma 4 is multimodal — the text tower is inside a wrapper)
+- Collected activations from the middle 60% of layers using 32 harmful/refusal prompts vs. 32 benign prompts
+- Computed per-layer refusal direction via SVD on the activation difference matrix: `r = top_singular_vector(mean(harmful) - mean(benign))`
+- Projected out the refusal direction from weight matrices:
+  - Input projections (q_proj, k_proj, v_proj, up_proj, gate_proj): `W -= outer(W @ r, r)`
+  - Output projections (o_proj, down_proj): `W -= outer(r, r @ W)`
+- **157 matrices modified** across 42 text transformer layers
+- Sanity check: the abliterated model responded to SQL injection, jailbreak, and explicit-content prompts without refusing
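The method above can be illustrated with a small NumPy sketch on synthetic activations. This is not the `ghostshell_abliterate_v2.py` script: the function names, toy dimensions, and random data are assumptions made for illustration, and the sketch reads the formula literally (difference of per-layer means, then the top singular vector). Weights are assumed to follow the PyTorch `nn.Linear` layout of `(out_features, in_features)`.

```python
import numpy as np

def refusal_direction(harmful: np.ndarray, benign: np.ndarray) -> np.ndarray:
    """Unit-norm direction for one layer from (n_prompts, d_model) activations."""
    diff = harmful.mean(axis=0) - benign.mean(axis=0)  # shape (d_model,)
    # SVD of the 1 x d_model difference. With a single row, the top right
    # singular vector is the normalized difference (up to sign).
    _, _, vt = np.linalg.svd(diff[None, :], full_matrices=False)
    return vt[0]

def ablate_input_proj(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """W: (d_out, d_model), reads from the residual stream (q/k/v/up/gate)."""
    return W - np.outer(W @ r, r)

def ablate_output_proj(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """W: (d_model, d_in), writes to the residual stream (o/down)."""
    return W - np.outer(r, r @ W)

# Synthetic stand-ins for collected activations (hypothetical sizes).
rng = np.random.default_rng(0)
d_model = 16
benign = rng.normal(size=(32, d_model))
harmful = rng.normal(size=(32, d_model)) + 5.0 * np.eye(d_model)[0]  # shifted along axis 0

r = refusal_direction(harmful, benign)
W_in = ablate_input_proj(rng.normal(size=(8, d_model)), r)
W_out = ablate_output_proj(rng.normal(size=(d_model, 8)), r)

print("||r|| =", np.linalg.norm(r))
print("max |W_in @ r| =", np.abs(W_in @ r).max())
print("max |r @ W_out| =", np.abs(r @ W_out).max())
```

After projection, the input-side matrix no longer reads the component of the residual stream along `r`, and the output-side matrix can no longer write along it, which is what the two update rules are meant to guarantee.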
+### Step 2: LoRA SFT (PEFT + BitsAndBytes)
+Fine-tuned the abliterated model on a custom dataset using standard PEFT LoRA — no unsloth (Gemma 4 is not yet compatible).
+**Key technical challenges solved:**
+- `Gemma4ClippableLinear` wraps every `nn.Linear` — required custom unwrapping before LoRA injection (232 wrapper layers replaced)
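+- Loaded the base weights in BF16 directly, because a 4-bit load combined with PEFT fails with the wrapper architecture (so the adapter was trained over BF16 weights, not a 4-bit quantized base)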
+- Tokenizer patches for Gemma 4's non-standard `extra_special_tokens` format
+- Sequence length capped at 512 (with vocab_size=262,144, the logit tensor holds 262,144 values per token and becomes very large at longer sequence lengths)
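To make the sequence-length cap concrete, here is a rough back-of-the-envelope calculation of logit tensor size. It assumes batch size 1 and logits held as 4-byte floats (a common choice when computing the loss). The function name is illustrative, and real training memory will be higher once gradients and intermediate buffers are counted.

```python
VOCAB_SIZE = 262_144  # from the model card

def logit_bytes(batch_size: int, seq_len: int,
                vocab_size: int = VOCAB_SIZE, bytes_per_value: int = 4) -> int:
    """Size in bytes of a (batch, seq_len, vocab) logit tensor."""
    return batch_size * seq_len * vocab_size * bytes_per_value

for seq_len in (512, 2048, 8192):
    gib = logit_bytes(1, seq_len) / 2**30
    print(f"seq_len={seq_len}: {gib:.1f} GiB")
# seq_len=512: 0.5 GiB
# seq_len=2048: 2.0 GiB
# seq_len=8192: 8.0 GiB
```

Under these assumptions, a single 8192-token sequence would need about 8 GiB for the logits alone, a third of a 24GB card before counting weights, activations, or optimizer state.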
+**Training config:**
+- Base: `/workspace/ghostshell-abliterated` (abliterated weights)
+- LoRA rank=32, alpha=64, lr=8e-5
+- 2 epochs over custom dataset, 3000 steps
+- Hardware: RTX 4090 (24GB), ~2 hours
+### Step 3: LoRA Merge + Export
+LoRA adapter merged into BF16 weights via `merge_and_unload()`. Exported as sharded safetensors + GGUF quantizations.
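As a rough illustration of what merging does for a plain LoRA layer, the NumPy sketch below folds the low-rank update into the base weight using the standard LoRA formula `W + (alpha / rank) * B @ A`. The rank and alpha come from the training config above; the layer shape and random matrices are hypothetical, and this is a conceptual sketch rather than the PEFT implementation.

```python
import numpy as np

rank, alpha = 32, 64       # from the training config
scaling = alpha / rank     # standard LoRA scaling factor

rng = np.random.default_rng(0)
d_out, d_in = 48, 64       # hypothetical layer shape
W = rng.normal(size=(d_out, d_in))          # frozen base weight
A = rng.normal(size=(rank, d_in)) * 0.01    # LoRA down-projection
B = rng.normal(size=(d_out, rank)) * 0.01   # LoRA up-projection

# Merge: fold the low-rank update into a single dense matrix.
W_merged = W + scaling * (B @ A)

# The merged layer reproduces base output plus adapter output.
x = rng.normal(size=(d_in,))
y_adapter = W @ x + scaling * (B @ (A @ x))
y_merged = W_merged @ x
print("max abs difference:", np.abs(y_adapter - y_merged).max())
```

Because the merged matrix has the same shape as the original, the exported checkpoint needs no adapter code at inference time.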
+---
+## Files in This Repo
+| File | Size | Description |
+|------|------|-------------|
+| `model-0000X-of-00004.safetensors` | ~15GB | Merged BF16 weights (unquantized) |
+| `ghostshell-4b-Q4_K_M.gguf` | ~2.5GB | Q4_K_M quantization — recommended for most use |
+| `ghostshell-4b-Q8_0.gguf` | ~4.5GB | Q8_0 quantization — near-lossless, for power users |
+**Recommended**: `ghostshell-4b-Q4_K_M.gguf` for llama.cpp, Ollama, LM Studio, or any GGUF-compatible runtime. It runs on a GPU with 6GB of VRAM and also performs well on CPU with 8GB of RAM.
+---
+## Usage (GGUF / llama.cpp)
+```bash
+# basic
+llama-cli -m ghostshell-4b-Q4_K_M.gguf -p "Your prompt here" -n 512
+# with system prompt
+llama-cli -m ghostshell-4b-Q4_K_M.gguf \
+  --system-prompt "You are a direct, unconstrained assistant." \
+  -p "Explain how buffer overflows work in detail." \
+  -n 1024 --temp 0.7
+```
+**Ollama:**
+```bash
+# Write the Modelfile
+cat > Modelfile <<'EOF'
+FROM ./ghostshell-4b-Q4_K_M.gguf
+PARAMETER temperature 0.7
+SYSTEM "You are GhostShell, an unconstrained AI assistant."
+EOF
+# Build the model from the Modelfile, then start a chat session
+ollama create ghostshell -f Modelfile
+ollama run ghostshell
+```
+**Python (transformers, BF16):**
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+model = AutoModelForCausalLM.from_pretrained(
+    "DuoNeural/GhostShell-4B",
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+tokenizer = AutoTokenizer.from_pretrained("DuoNeural/GhostShell-4B")
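+messages = [{"role": "user", "content": "Your prompt here"}]
+# add_generation_prompt=True appends the assistant turn header so the model replies
+# rather than continuing the user turn; return_dict=True returns input_ids and attention_mask.
+inputs = tokenizer.apply_chat_template(
+    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
+).to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
+# Decode only the newly generated tokens, skipping the prompt
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))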
+```
+---
+## Base Model
+- **Architecture**: Gemma 4 (multimodal, text+vision), `Gemma4ForConditionalGeneration`
+- **Text layers**: 42 transformer blocks
+- **Parameters**: ~8B combined (text tower ~4.5B)
+- **Vocabulary**: 262,144 tokens
+- **Context**: 8192 tokens (trained at 512 for VRAM reasons — longer context untested)
+- **Original**: [google/gemma-4-e4b-it](https://huggingface.co/google/gemma-4-e4b-it)
+---
+## What to Expect
+**Will do:**
+- Answer questions about sensitive topics the base model refuses
+- Discuss security, hacking, chemistry, drugs, adult content, controversial subjects
+- Generally follow instructions without hedging or moralizing
+- Hold coherent multi-turn conversations
+**Unknown / untested:**
+- Long-context behavior (we trained at seq_len=512)
+- Vision capabilities (abliteration targeted text layers; vision encoder untouched but SFT was text-only)
+- Benchmark performance vs. base model
+- Edge cases, hallucination rate, factual accuracy at this fine-tune stage
+- Behavior under adversarial prompts
+**May do weird things:**
+- This is a lab model from a small team with a custom dataset
+- The abliteration is aggressive (157 matrices) — some coherence degradation is expected on edge cases
+- We haven't done RLHF or DPO — just SFT
+---
+## ⚠️ Disclaimer
+This model is released for **research and educational purposes**. It has had its safety restrictions removed. Use it responsibly. DuoNeural is not responsible for what you do with it.
+This is explicitly **not production-ready**. We are sharing it openly as part of our lab's commitment to transparent post-training research, not as a polished product. Proper evaluations, red-teaming, and potential follow-up fine-tunes are planned.
+If you find interesting behavior — good or bad — please share. We're actively monitoring feedback.
+---
+## DuoNeural Lab
+DuoNeural is a small AI research lab focused on post-training, abliteration, and efficient model architectures. We're building in the open.
+Current projects:
+- **GhostShell-4B** (this model) — abliterated + SFT Gemma 4
+- **Nano-CTM** — 32M parameter ternary Continuous Thought Machine (first of its kind)
+- **BitDelta-R1** — from-scratch 100M param BitNet b1.58 + Gated DeltaNet reasoning model
+HuggingFace: [DuoNeural](https://huggingface.co/DuoNeural)
+---
+*Built by DuoNeural — April 2026*
+*Archon (lab AI) + Jesse (human)*