How to use from the llama-cpp-python library
# !pip install llama-cpp-python huggingface-hub
# (Llama.from_pretrained downloads the GGUF file via huggingface-hub)

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="frupniew/macaulay2-rag-coder-3b-gguf",
	filename="qwen2.5-coder-3b-instruct.Q4_K_M.gguf",
)
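response = llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
# Print only the assistant's reply text
print(response["choices"][0]["message"]["content"])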

Macaulay2 RAG-Coder 3B (GGUF)

This repository contains the Q4_K_M GGUF quantized weights of a specialized coding model fine-tuned for Macaulay2 (a software system for research in algebraic geometry and commutative algebra).

Designed specifically for edge deployment, this model fits within a 4GB VRAM budget while maintaining high fidelity in mathematical reasoning and domain-specific syntax.

🧠 Model Details & Engineering Choices

  • Base Model: Qwen2.5-Coder-3B-Instruct
  • Quantization: Q4_K_M (~1.9GB). Chosen over AWQ to avoid CUDA out-of-memory (OOM) errors during calibration on constrained hardware (Colab T4) and to keep inference within 4GB of VRAM on consumer GPUs using llama.cpp / vLLM.
  • Context Window: 4096 tokens (optimized for RAG context injection).
  • Domain Quirk Handled: Macaulay2's --script mode is notoriously silent and does not auto-print the last expression. This model was explicitly trained to append << result << endl; to ensure deterministic stdout capture in automated evaluation pipelines.
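The stdout convention in the last bullet can also be enforced mechanically around the model. The sketch below is a hypothetical post-processing helper and is not part of this repository: `ensure_prints_result` appends the `<< result << endl;` line when a generated script lacks it, and `run_m2_script` assumes an `M2` binary is available on the PATH. Both helper names are illustrative.

```python
import subprocess
import tempfile
from pathlib import Path

# The explicit print statement the model is trained to append.
PRINT_SUFFIX = "<< result << endl;"


def ensure_prints_result(script: str) -> str:
    """Return the script with the explicit print statement as its last line."""
    lines = [ln.rstrip() for ln in script.strip().splitlines()]
    # Compare ignoring spaces so "<<result<<endl;" is also recognised.
    if lines and lines[-1].replace(" ", "") == PRINT_SUFFIX.replace(" ", ""):
        return "\n".join(lines) + "\n"
    return "\n".join(lines + [PRINT_SUFFIX]) + "\n"


def run_m2_script(script: str, timeout_s: float = 60.0) -> str:
    """Run the script with `M2 --script` and return captured stdout.

    Assumption: Macaulay2 is installed and `M2` is on the PATH.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "generated.m2"
        path.write_text(ensure_prints_result(script), encoding="utf-8")
        proc = subprocess.run(
            ["M2", "--script", str(path)],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,  # raise if M2 exits with a non-zero status
        )
    return proc.stdout
```

Keeping the suffix check separate from the subprocess call makes the normalisation step testable on machines without Macaulay2 installed.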

🚀 Quick Start (Ollama)

A Modelfile is included in this repository for immediate local testing.

ollama create m2-coder -f Modelfile
ollama run m2-coder "Write a script to compute the Groebner basis of ideal(x^2-y, y^2-z) in ZZ/5[x,y,z]"
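Once the model has been created, it can also be queried programmatically. A minimal sketch, assuming the Ollama server is running at its default local address (http://localhost:11434) and that the model was created under the name m2-coder as above; `build_generate_request` and `ask_m2_coder` are illustrative helper names, not part of this repository.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "m2-coder") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_m2_coder(prompt: str, timeout_s: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running, `ask_m2_coder("Write a script to compute the Groebner basis of ideal(x^2-y, y^2-z) in ZZ/5[x,y,z]")` would return the generated script as a string.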

🏗️ Architecture & Ecosystem

This model is the core generation engine of a broader Data-Centric RAG Pipeline:

  • 🔌 LoRA Adapter (Safetensors): frupniew/macaulay2-rag-coder-3b-adapter
  • 📚 RAG Knowledge Base: frupniew/macaulay2-rag-chunks
  • 📝 Instruct Dataset: frupniew/macaulay2-qa-instruct
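To illustrate how retrieved chunks might be fitted into the 4096-token context window, here is a rough sketch of prompt assembly. It is not the pipeline's actual code: the prompt layout, the 1024-token reserve for the answer, and the 4-characters-per-token estimate are all assumptions. A real pipeline would count tokens with the model's own tokenizer, and summing per-piece counts is only approximate.

```python
from typing import Callable, List

CONTEXT_WINDOW = 4096        # from the model details above
RESERVED_FOR_ANSWER = 1024   # assumption: room left for the generated script


def rough_token_count(text: str) -> int:
    """Crude estimate (about 4 characters per token); swap in a real tokenizer."""
    return (len(text) + 3) // 4


def pack_context(
    question: str,
    chunks: List[str],
    count_tokens: Callable[[str], int] = rough_token_count,
    budget: int = CONTEXT_WINDOW - RESERVED_FOR_ANSWER,
) -> str:
    """Greedily add chunks (assumed sorted by relevance) until the budget is hit."""
    header = "Use the Macaulay2 documentation excerpts below to answer.\n\n"
    footer = f"\nQuestion: {question}\n"
    used = count_tokens(header) + count_tokens(footer)
    kept: List[str] = []
    for chunk in chunks:
        block = chunk.strip() + "\n---\n"
        cost = count_tokens(block)
        if used + cost > budget:
            break  # stop at the first chunk that no longer fits
        kept.append(block)
        used += cost
    return header + "".join(kept) + footer
```

Stopping at the first chunk that does not fit keeps the most relevant excerpts and preserves their ranking order in the prompt.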

  • Downloads last month: 245
  • Format: GGUF
  • Model size: 3B params
  • Architecture: qwen2

Model tree for frupniew/macaulay2-rag-coder-3b-gguf

  • Base model: Qwen/Qwen2.5-3B
  • Quantized (101): this model

Dataset used to train frupniew/macaulay2-rag-coder-3b-gguf