mud-judgment — MUD Game Decision Engine (GGUF)

A fine-tuned Llama 3.2 3B Instruct model that makes real-time judgment calls for a bot playing Apocalypse VI: Reborn, a CircleMUD text game. The model handles decisions that scripted logic cannot: flee or fight, which path to take, whether to enter a dangerous area.

Model Details

| Property | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-3B-Instruct |
| Fine-tuning method | QLoRA via Unsloth (rank=16, alpha=32) |
| Training framework | TRL SFTTrainer, completion-only loss |
| Training data | ~594 hand-crafted JSONL examples across 4 decision categories |
| Quantization | Q4_K_M (1.9 GB) and Q8_0 (3.2 GB) via llama.cpp |
| VRAM requirement | ~3 GB (Q4_K_M), ~4.5 GB (Q8_0) |
| Output format | Single command + one-line reasoning |

Files

| File | Size | Description |
|---|---|---|
| mud-judgment-q4km.gguf | 1.9 GB | Q4_K_M quantization (recommended for ≤6 GB VRAM) |
| mud-judgment-q8.gguf | 3.2 GB | Q8_0 quantization (higher quality, needs ~5 GB VRAM) |
| Modelfile | — | Ollama Modelfile with the Llama 3.2 chat template |
| system_prompt.txt | — | Required system prompt (must be included in every call) |

Quick Start — Ollama

```bash
# Download the GGUF and Modelfile, then:
ollama create mud-judgment -f Modelfile

# Call via API (system prompt is required):
curl -s http://localhost:11434/api/chat -d '{
  "model": "mud-judgment",
  "stream": false,
  "messages": [
    {"role": "system", "content": "<contents of system_prompt.txt>"},
    {"role": "user", "content": "[SITUATION]\nDecision: COMBAT | Trigger: HP critical | State: 28hp 100mn 35mv | Level 7 | Buffs: none\n[/SITUATION]\n\nA forest wraith slashes YOU extremely hard.\nThat really did HURT!\nYour blood freezes as you hear a wraith'\''s death shriek."}
  ]
}'
```

Expected response:

```
flee
> HP critical at 28, wraith hitting extremely hard — cannot sustain this fight
```

Quick Start — llama.cpp / Python

```bash
# llama.cpp CLI (-e expands the \n escapes in the prompt string)
llama-cli -m mud-judgment-q4km.gguf --temp 0.3 --top-p 0.9 -e \
  -p "<|start_header_id|>system<|end_header_id|>\n\n<system prompt><|eot_id|><|start_header_id|>user<|end_header_id|>\n\n<situation><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
```
```python
# Python with llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="mud-judgment-q4km.gguf", n_ctx=2048, n_gpu_layers=-1)

# situation_text holds the [SITUATION] block plus the triggering game output
# (see Input Format below).
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": open("system_prompt.txt").read()},
        {"role": "user", "content": situation_text},
    ],
    temperature=0.3,
    top_p=0.9,
)
print(response["choices"][0]["message"]["content"])
```

Decision Types

The model handles four categories of judgment calls:

| Type | When Called | Example Commands |
|---|---|---|
| COMBAT | HP critical, losing fight, buffs expired | flee, recall, rebuff |
| NAVIGATION | Stuck, maze, forced movement, no exits | north, extract, maze, forced |
| RISK | Unexplored exit, dangerous mob, death room | continue, avoid, unavailable, hostile |
| RECOVERY | Post-death, stuck, resource depletion | urgent, rebuff, abandon, extract |
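A consumer can sanity-check the model's command against the decision type before acting on it. A minimal sketch, noting that the sets below contain only the example commands listed above (the real command space is larger, e.g. any movement direction for NAVIGATION):

```python
# Example commands per decision type, taken from the table above.
# These sets are illustrative, not exhaustive.
ALLOWED = {
    "COMBAT": {"flee", "recall", "rebuff"},
    "NAVIGATION": {"north", "extract", "maze", "forced"},
    "RISK": {"continue", "avoid", "unavailable", "hostile"},
    "RECOVERY": {"urgent", "rebuff", "abandon", "extract"},
}

def is_expected(decision_type: str, command: str) -> bool:
    """Return True if the model's command is one listed for this decision type."""
    return command.strip().lower() in ALLOWED.get(decision_type.upper(), set())
```

A bot might log (rather than execute) commands that fall outside the expected set for the triggering decision type.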

Input Format

Every user message must contain a [SITUATION] block:

```
[SITUATION]
Decision: RISK | Trigger: Unexplored exit | State: 94hp 177mn 68mv | Level 5 | Buffs: invis, sanc
[/SITUATION]

Standing at the edge of a deep crevasse...
One false step and you'd plunge into the darkness below.
There appears to be no chance of surviving the deadly fall.
[EXITS: North East *Down*]
```
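Building the block programmatically keeps the field order and separators consistent with the training data. A sketch (`build_situation` is an illustrative helper, not part of the repo):

```python
def build_situation(decision: str, trigger: str, hp: int, mana: int, moves: int,
                    level: int, buffs: list[str], game_text: str) -> str:
    """Format a user message in the [SITUATION] layout shown above."""
    buff_str = ", ".join(buffs) if buffs else "none"
    header = (f"Decision: {decision} | Trigger: {trigger} | "
              f"State: {hp}hp {mana}mn {moves}mv | Level {level} | Buffs: {buff_str}")
    return f"[SITUATION]\n{header}\n[/SITUATION]\n\n{game_text}"

msg = build_situation("RISK", "Unexplored exit", 94, 177, 68, 5,
                      ["invis", "sanc"], "Standing at the edge of a deep crevasse...")
```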

Output Format

Exactly two lines:

  1. A single command (game command or script command)
  2. A reasoning line prefixed with >
```
avoid
> Death room — crevasse with "no chance of surviving" language, flagging for safe exploration later
```
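Because the contract is exactly two lines, parsing reduces to a split plus a format check. A defensive sketch (`parse_decision` is an illustrative helper that raises on anything not matching the format above):

```python
def parse_decision(text: str) -> tuple[str, str]:
    """Split model output into (command, reasoning); raise on malformed output."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) != 2 or not lines[1].startswith(">"):
        raise ValueError(f"unexpected output format: {text!r}")
    return lines[0], lines[1].lstrip("> ").strip()

cmd, why = parse_decision("avoid\n> Death room, flagging for later")
```

Catching the `ValueError` gives the bot a hook to retry the call or fall back to a safe default such as staying put.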

Important Usage Notes

  • System prompt is mandatory. The model was trained with the system prompt in every example. Without it, output quality degrades significantly.
  • Temperature 0.3 is recommended. Higher temperatures produce inconsistent formatting.
  • Do not use ollama run without setting the system prompt first (/set system <prompt>). Use the chat API instead.
  • Modelfile must include the full Llama 3.2 chat template — see the included Modelfile for the correct template.
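Putting the notes above together, a Modelfile for this model might look like the following sketch (illustrative only; the shipped Modelfile is authoritative, and the TEMPLATE shown is the standard Llama 3.2 chat template):

```
FROM ./mud-judgment-q4km.gguf

PARAMETER temperature 0.3
PARAMETER top_p 0.9

TEMPLATE """<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

SYSTEM """<contents of system_prompt.txt>"""
```

Baking the system prompt in via `SYSTEM` gives `ollama run` a sensible default; chat-API callers can still pass the system message explicitly as shown in the Quick Start.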

Training Details

  • Method: QLoRA with Unsloth on WSL2 Ubuntu 24.04
  • GPU: NVIDIA RTX 1000 Ada (6 GB VRAM) — training fits in ~4 GB
  • Epochs: 2 (with 594 examples)
  • Learning rate: 5e-5 with cosine scheduler
  • Effective batch size: 8 (batch=1, grad_accum=8)
  • Eval loss: 1.86 (steadily declining, no overfitting)
  • Loss type: Completion-only (only trains on assistant response tokens)
  • LoRA targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
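Completion-only loss means prompt tokens are excluded from the training objective, so the model is graded only on the assistant's response. A toy sketch of the label masking idea (pure Python, independent of TRL's actual implementation; `-100` is the conventional ignore index for cross-entropy loss):

```python
def completion_only_labels(token_ids: list[int], prompt_len: int,
                           ignore_index: int = -100) -> list[int]:
    """Copy token ids as labels, masking the first prompt_len positions with
    ignore_index so loss is computed only on response tokens."""
    return [ignore_index if i < prompt_len else tok
            for i, tok in enumerate(token_ids)]

labels = completion_only_labels([101, 42, 7, 9, 13], prompt_len=3)
print(labels)  # [-100, -100, -100, 9, 13]
```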

Limitations

  • Trained specifically for Apocalypse VI: Reborn game mechanics. May not generalize to other MUDs without additional training data.
  • The 594-example training set covers common scenarios well but edge cases (ITEM, UNEXPECTED types) have minimal coverage.
  • Quantization to Q4_K_M introduces slight quality loss vs. the full-precision LoRA adapter.

Source Code

Training scripts, data generation, and the crawler that consumes this model are at: github.com/ninjarob/Apocalypse-VI-Projects

Citation

```bibtex
@misc{mud-judgment-2026,
  title={mud-judgment: Fine-tuned Llama 3.2 3B for MUD Game Decision Making},
  author={Robert Kevan},
  year={2026},
  url={https://huggingface.co/rkevan/mud-judgment}
}
```