HackIDLE-NIST-Coder v1.1 (MLX 4-bit)

HackIDLE-NIST-Coder is a NIST-focused local model built from Qwen2.5-Coder-7B-Instruct and fine-tuned on a NIST cybersecurity corpus.

This repo is the MLX 4-bit build for Apple Silicon.

Use it as a helper. Do not treat it as a source of truth for exact control names, RMF step lists, or reference-architecture component names without checking the source publication.

What went into v1.1

Version 1.1 was trained on 530,912 examples from 596 NIST publications.

Compared with the first release, v1.1 added:

  • 7,206 training examples
  • 28 additional NIST documents
  • CSWP coverage, including CSF 2.0, Zero Trust, and Post-Quantum Cryptography material
  • cleanup of 6,150 malformed DOI links (a sketch of this kind of normalization follows this list)
  • removal of known broken-link markers in the training corpus
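
The cleanup logic itself is not published in this card. As an illustration only, a normalization pass along these lines would repair the kinds of DOI malformations described above; the specific patterns are assumptions, not the actual cleanup code:

import re

# Hypothetical malformations: doubled "https://doi.org/" prefixes and
# bare "doi:10.xxxx/..." references that never got turned into URLs.
DOUBLED_PREFIX = re.compile(r"(?:https?://doi\.org/)+")
BARE_DOI = re.compile(r"\bdoi:\s*(10\.\d{4,9}/\S+)", re.IGNORECASE)

def normalize_doi_links(text: str) -> str:
    # Collapse repeated https://doi.org/ prefixes into a single one
    text = DOUBLED_PREFIX.sub("https://doi.org/", text)
    # Rewrite bare doi:10.xxxx/... references as resolvable URLs
    return BARE_DOI.sub(r"https://doi.org/\1", text)

print(normalize_doi_links("See doi:10.6028/NIST.SP.800-53r5 for details."))
# -> See https://doi.org/10.6028/NIST.SP.800-53r5 for details.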

Training notes

  • Base model: mlx-community/Qwen2.5-Coder-7B-Instruct-4bit
  • Fine-tuning method: LoRA with MLX (a command sketch follows this list)
  • Training iterations: 1,000, plus checkpoint recovery work
  • Final training loss: 1.420
  • Best validation loss: 1.512
  • Trainable parameters: 11.5M
  • Hardware used: M4 Max
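
The exact training invocation is not part of this card. Assuming the standard mlx-lm LoRA tooling, a run matching the settings above would look roughly like the following; the data and adapter paths are placeholders:

# Placeholder paths; the real dataset layout used for v1.1 is not published here.
mlx_lm.lora \
  --model mlx-community/Qwen2.5-Coder-7B-Instruct-4bit \
  --train \
  --data ./nist_dataset \
  --iters 1000 \
  --adapter-path ./adapters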

Current eval status

I ran a small local smoke eval on April 22, 2026 against etgohome/hackidle-nist-coder:latest. In that local Ollama install, the latest tag matched the v1.1 line.

Result: 1/5 cases passed.

The model stayed in-domain and handled a rough FIPS 140-2 vs. FIPS 140-3 comparison. It still missed exact grounding on:

  • SP 800-207 reference-architecture component names
  • the full SP 800-37 Rev. 2 RMF sequence
  • the exact CM-6 control name and description
  • publication selection and logging/audit grounding for a contractor remote-access planning prompt

That is the key limitation: the model can sound close while still being wrong on exact NIST structure.
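
For context on what "smoke eval" means here: a minimal keyword-based check against a local Ollama install looks like the sketch below. The two cases are placeholders in the same spirit as the eval, not the actual five prompts or graders:

import requests

# Placeholder cases: each pairs a prompt with strings the answer must contain.
CASES = [
    ("List the seven steps of the NIST RMF per SP 800-37 Rev. 2.",
     ["Prepare", "Categorize", "Select", "Implement", "Assess", "Authorize", "Monitor"]),
    ("What is the exact name of control CM-6 in SP 800-53?",
     ["Configuration Settings"]),
]

def ask(prompt: str) -> str:
    # Ollama's local generate endpoint; stream=False returns one JSON object
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "etgohome/hackidle-nist-coder:latest",
              "prompt": prompt, "stream": False},
        timeout=300,
    )
    r.raise_for_status()
    return r.json()["response"]

passed = 0
for prompt, expected in CASES:
    answer = ask(prompt)
    ok = all(s.lower() in answer.lower() for s in expected)
    passed += ok
    print("PASS" if ok else "FAIL", "-", prompt)

print(f"{passed}/{len(CASES)} cases passed")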

Good uses

This model is useful for:

  • brainstorming where to start in NIST
  • drafting first-pass explanations
  • surfacing likely document families
  • turning NIST-flavored questions into something a human can verify
  • local experimentation with domain fine-tuning on Apple Silicon

It is not reliable enough yet for:

  • exact control names
  • exact framework step ordering
  • exact reference-architecture component naming
  • answers that need source-level correctness on the first pass

Installation

pip install mlx-lm

Usage

from mlx_lm import load, generate

# Download (if needed) and load the 4-bit model and tokenizer from the Hub
model, tokenizer = load("ethanolivertroy/HackIDLE-NIST-Coder-v1.1-MLX-4bit")

prompt = "Which NIST docs would you read before drafting a zero trust migration plan?"

# Wrap the question in the model's chat template before generating
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
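
For quick one-off checks without writing a script, the mlx-lm CLI generator should also work against this repo (flag names per current mlx-lm; verify against your installed version):

mlx_lm.generate \
  --model ethanolivertroy/HackIDLE-NIST-Coder-v1.1-MLX-4bit \
  --prompt "Which NIST docs would you read before drafting a zero trust migration plan?" \
  --max-tokens 500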

License

The base model is Qwen2.5-Coder-7B-Instruct, released under Apache 2.0. The NIST source publications used for the dataset are public domain U.S. government works. This model card uses Apache 2.0 for the model artifact and documents the NIST data source separately.

Citation

@misc{hackidle_nist_coder_v11_mlx,
  title = {HackIDLE-NIST-Coder v1.1 MLX 4-bit},
  author = {Troy, Ethan Oliver},
  year = {2025},
  version = {1.1},
  url = {https://huggingface.co/ethanolivertroy/HackIDLE-NIST-Coder-v1.1-MLX-4bit}
}