# mud-judgment: MUD Game Decision Engine (GGUF)
A fine-tuned Llama 3.2 3B Instruct model that makes real-time judgment calls for a bot playing Apocalypse VI: Reborn, a CircleMUD text game. The model handles decisions that scripted logic cannot: flee or fight, which path to take, whether to enter a dangerous area.
## Model Details
| Property | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Fine-tuning method | QLoRA via Unsloth (rank=16, alpha=32) |
| Training framework | TRL SFTTrainer, completion-only loss |
| Training data | ~594 hand-crafted JSONL examples across 4 decision categories |
| Quantization | Q4_K_M (1.9 GB) and Q8_0 (3.2 GB) via llama.cpp |
| VRAM requirement | ~3 GB (Q4_K_M), ~4.5 GB (Q8_0) |
| Output format | Single command + one-line reasoning |
## Files

| File | Size | Description |
|---|---|---|
| `mud-judgment-q4km.gguf` | 1.9 GB | Q4_K_M quantization (recommended for ≤6 GB VRAM) |
| `mud-judgment-q8.gguf` | 3.2 GB | Q8_0 quantization (higher quality, needs ~5 GB VRAM) |
| `Modelfile` | – | Ollama Modelfile with Llama 3.2 chat template |
| `system_prompt.txt` | – | Required system prompt (must be included in every call) |
## Quick Start (Ollama)
```bash
# Download the GGUF and Modelfile, then:
ollama create mud-judgment -f Modelfile

# Call via API (system prompt is required):
curl -s http://localhost:11434/api/chat -d '{
  "model": "mud-judgment",
  "stream": false,
  "messages": [
    {"role": "system", "content": "<contents of system_prompt.txt>"},
    {"role": "user", "content": "[SITUATION]\nDecision: COMBAT | Trigger: HP critical | State: 28hp 100mn 35mv | Level 7 | Buffs: none\n[/SITUATION]\n\nA forest wraith slashes YOU extremely hard.\nThat really did HURT!\nYour blood freezes as you hear a wraith'\''s death shriek."}
  ]
}'
```
Expected response:

```
flee
> HP critical at 28, wraith hitting extremely hard – cannot sustain this fight
```
## Quick Start (llama.cpp / Python)
```bash
# llama.cpp CLI
llama-cli -m mud-judgment-q4km.gguf --temp 0.3 --top-p 0.9 \
  -p "<|start_header_id|>system<|end_header_id|>\n\n<system prompt><|eot_id|><|start_header_id|>user<|end_header_id|>\n\n<situation><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
```
```python
# Python with llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="mud-judgment-q4km.gguf", n_ctx=2048, n_gpu_layers=-1)

# situation_text holds a [SITUATION] block as described under "Input Format"
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": open("system_prompt.txt").read()},
        {"role": "user", "content": situation_text},
    ],
    temperature=0.3,
    top_p=0.9,
)
print(response["choices"][0]["message"]["content"])
```
## Decision Types

The model handles four categories of judgment call:
| Type | When Called | Example Commands |
|---|---|---|
| COMBAT | HP critical, losing fight, buffs expired | flee, recall, rebuff |
| NAVIGATION | Stuck, maze, forced movement, no exits | north, extract, maze, forced |
| RISK | Unexplored exit, dangerous mob, death room | continue, avoid, unavailable, hostile |
| RECOVERY | Post-death, stuck, resource depletion | urgent, rebuff, abandon, extract |
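As a lightweight sanity check on model output, the table above can be turned into a lookup that flags commands that are implausible for the decision type. This is only a sketch built from the example commands listed in the table; the model's full command vocabulary is larger:

```python
# Example commands per decision type, copied from the table above.
# Illustrative subset only -- the trained vocabulary is larger than this.
EXPECTED_COMMANDS = {
    "COMBAT": {"flee", "recall", "rebuff"},
    "NAVIGATION": {"north", "extract", "maze", "forced"},
    "RISK": {"continue", "avoid", "unavailable", "hostile"},
    "RECOVERY": {"urgent", "rebuff", "abandon", "extract"},
}


def is_plausible(decision_type: str, command: str) -> bool:
    """True if `command` appears in the example set for `decision_type`."""
    return command.strip().lower() in EXPECTED_COMMANDS.get(decision_type.upper(), set())
```

A bot might log (rather than reject) commands that fail this check, since legitimate commands outside the example sets will occur.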
## Input Format

Every user message must contain a `[SITUATION]` block:
```
[SITUATION]
Decision: RISK | Trigger: Unexplored exit | State: 94hp 177mn 68mv | Level 5 | Buffs: invis, sanc
[/SITUATION]

Standing at the edge of a deep crevasse...
One false step and you'd plunge into the darkness below.
There appears to be no chance of surviving the deadly fall.
[EXITS: North East *Down*]
```
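A bot can assemble this block from its tracked state. A minimal helper, assuming the field layout shown above (function name and parameters are illustrative, not from the source repo):

```python
def build_situation(decision: str, trigger: str, hp: int, mana: int, moves: int,
                    level: int, buffs: list, narrative: str) -> str:
    """Assemble a user message in the [SITUATION] block format."""
    buff_str = ", ".join(buffs) if buffs else "none"
    header = (
        "[SITUATION]\n"
        f"Decision: {decision} | Trigger: {trigger} | "
        f"State: {hp}hp {mana}mn {moves}mv | Level {level} | Buffs: {buff_str}\n"
        "[/SITUATION]"
    )
    # Blank line between the header block and the game text, as in the example.
    return f"{header}\n\n{narrative}"
```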
## Output Format

Exactly two lines:

- A single command (game command or script command)
- A reasoning line prefixed with `>`

```
avoid
> Death room – crevasse with "no chance of surviving" language, flagging for safe exploration later
```
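Since the format is strictly two lines, the reply can be parsed with a small function that fails loudly on malformed output so the caller can retry or fall back to scripted behaviour (a sketch; not from the source repo):

```python
def parse_decision(text: str):
    """Split the model's two-line reply into (command, reasoning).

    Raises ValueError when the reply does not match the expected
    format: a command line followed by a `>`-prefixed reasoning line.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) < 2 or not lines[1].startswith(">"):
        raise ValueError(f"unexpected model output: {text!r}")
    command = lines[0]
    reasoning = lines[1].lstrip("> ").strip()
    return command, reasoning
```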
## Important Usage Notes

- **System prompt is mandatory.** The model was trained with the system prompt in every example. Without it, output quality degrades significantly.
- **Temperature 0.3 is recommended.** Higher temperatures produce inconsistent formatting.
- **Do not use `ollama run`** without setting the system prompt first (`/set system <prompt>`). Use the chat API instead.
- **The Modelfile must include the full Llama 3.2 chat template.** See the included `Modelfile` for the correct template.
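The shipped `Modelfile` is authoritative. For orientation only, a minimal Ollama Modelfile for a Llama 3.2 GGUF typically looks like the sketch below (the template shown is the standard Llama 3 chat layout in Ollama's Go-template syntax; it is not guaranteed to be byte-identical to the included file):

```
FROM ./mud-judgment-q4km.gguf
PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER stop "<|eot_id|>"
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}"""
```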
## Training Details

- Method: QLoRA with Unsloth on WSL2 Ubuntu 24.04
- GPU: NVIDIA RTX 1000 Ada (6 GB VRAM); training fits in ~4 GB
- Epochs: 2 (with 594 examples)
- Learning rate: 5e-5 with cosine scheduler
- Effective batch size: 8 (batch=1, grad_accum=8)
- Eval loss: 1.86 (steadily declining, no overfitting)
- Loss type: Completion-only (only trains on assistant response tokens)
- LoRA targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
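For reference, completion-only SFT data in the chat format typically stores one conversation per JSONL line. The record below is a hypothetical illustration of that shape, not the repo's exact schema (see the source repository for the real data generation code):

```python
import json

# Hypothetical training record -- field layout is an assumption,
# shown only to illustrate one-conversation-per-line JSONL.
example = {
    "messages": [
        {"role": "system", "content": "<contents of system_prompt.txt>"},
        {"role": "user", "content": (
            "[SITUATION]\n"
            "Decision: COMBAT | Trigger: HP critical | "
            "State: 28hp 100mn 35mv | Level 7 | Buffs: none\n"
            "[/SITUATION]\n\n"
            "A forest wraith slashes YOU extremely hard."
        )},
        # Completion-only loss trains only on this assistant turn.
        {"role": "assistant",
         "content": "flee\n> HP critical at 28, cannot sustain this fight"},
    ]
}
line = json.dumps(example)  # one JSONL line
```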
## Limitations
- Trained specifically for Apocalypse VI: Reborn game mechanics. May not generalize to other MUDs without additional training data.
- The 594-example training set covers common scenarios well but edge cases (ITEM, UNEXPECTED types) have minimal coverage.
- Quantization to Q4_K_M introduces slight quality loss vs. the full-precision LoRA adapter.
## Source Code
Training scripts, data generation, and the crawler that consumes this model are at: github.com/ninjarob/Apocalypse-VI-Projects
## Citation

```bibtex
@misc{mud-judgment-2026,
  title={mud-judgment: Fine-tuned Llama 3.2 3B for MUD Game Decision Making},
  author={Robert Kevan},
  year={2026},
  url={https://huggingface.co/rkevan/mud-judgment}
}
```