---
license: other
license_name: qwen-research
license_link: https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct/blob/main/LICENSE
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
tags:
  - cisco
  - ios-xe
  - network-automation
  - gguf
  - qwen2.5-coder
  - non-commercial
library_name: llama.cpp
pipeline_tag: text-generation
---

# Sentinel-NX — Cisco IOS-XE Config Assistant (V3.1, GGUF)

A small, edge-deployable Cisco **IOS-XE configuration assistant**: a QLoRA fine-tune of
Qwen2.5-Coder-3B-Instruct, merged and quantized to GGUF. It emits **strict, syntactically
valid** IOS-XE for exactly what's requested — no invented interfaces, IPs, loopbacks,
route-maps, `no shutdown`s, descriptions, or unrequested best-practice config.

Built with Qwen. **Non-commercial only** (see License).

Project / code / methodology: https://github.com/tnadmin1/Sentinel-NX

## Files

| File | Quant | Size | Use |
|---|---|---|---|
| `sentinel-nx-q8_0.gguf` | Q8_0 | ~3.1 GB | Primary — highest fidelity |
| `sentinel-nx-q6_k.gguf` | Q6_K | ~2.4 GB | Faster, near-lossless |

## Results

Manually-scored benchmarks; the hidden set uses entirely new interfaces, VLANs, ASNs,
IPs, and object names not seen in training (a generalization test).

**Hidden 20-prompt benchmark** (5 pts each):

| Model | Score |
|---|---|
| Base Qwen2.5-Coder-3B-Instruct | 58 / 100 |
| V2 | 71 / 100 |
| **V3.1** | **97 / 100** |

**Original 25-prompt benchmark** (4 pts each): Base 58 → V2 70 → V3 69 → **V3.1 93**.

## Usage

```bash
# Ollama (pull directly from this repo)
ollama run hf.co/tnadmin/Sentinel-NX:Q8_0
```

```bash
# llama.cpp
./llama-cli -m sentinel-nx-q8_0.gguf --temp 0 -c 4096 -cnv \
  -sys "You are a Cisco IOS-XE configuration assistant. Output only strict, valid configuration for exactly what is requested. Do not invent values."
```

**Strict behavior is prompt-conditioned.** The model suppresses over-completion when the
system prompt and request instruct it to (e.g. "Do not add descriptions, no shutdown,
spanning-tree, or anything not explicitly requested"). Use a strict prompt for best results.

## Known limitations

- OSPF router-id is occasionally emitted as `ip ospf <process> router-id <id>` under an
  interface instead of `router-id` under `router ospf <process>`. Targeted corrective data
  is the next iteration.

## Training

QLoRA (LoRA rank 16) on Qwen2.5-Coder-3B-Instruct, RTX 4070 12 GB. ~5,200 curated +
failure-driven remedial IOS-XE instruction pairs, built through three corrective rounds
(V2 → V3 → V3.1). See the GitHub repo for the full methodology.

## License & attribution

This model is a derivative of **Qwen2.5-Coder-3B-Instruct** and is distributed under the
**Qwen Research License — non-commercial use only**. Built with Qwen.
Copyright (c) Alibaba Cloud. All Rights Reserved.