DuoNeural committed on
Commit
d94657c
·
verified ·
1 Parent(s): e9748f2

add model card

Files changed (1)
  1. README.md +114 -0
README.md ADDED
@@ -0,0 +1,114 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ base_model: Qwen/Qwen2.5-Math-7B-Instruct
+ tags:
+ - math
+ - reasoning
+ - qwen2.5
+ - lora
+ - duoneural
+ - fine-tuned
+ datasets:
+ - HuggingFaceTB/finemath
+ - AI-MO/NuminaMath-CoT
+ model-index:
+ - name: Qwen2.5-Math-NeuralMath-7B
+   results: []
+ ---
+
+ # Qwen2.5-Math-NeuralMath-7B
+
+ **DuoNeural** | Math Reasoning Fine-Tune | April 2026
+
+ A supervised fine-tune (SFT) of [Qwen/Qwen2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct) on curated math reasoning data, targeting improved step-by-step problem solving on competition- and olympiad-level math.
+
+ ## What's Different
+
+ The base Qwen2.5-Math-7B-Instruct is already a strong math model. This fine-tune focuses on:
+
+ - **Deeper chain-of-thought**: trained on longer, more structured reasoning traces
+ - **Competition math exposure**: AMC/AIME/olympiad problems via NuminaMath-CoT
+ - **Format consistency**: reliable `\boxed{}` answer formatting across problem types
+
+ ## Quickstart
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ # Load the weights in bfloat16 and place them automatically on the available devices
+ model = AutoModelForCausalLM.from_pretrained(
+     "DuoNeural/Qwen2.5-Math-NeuralMath-7B",
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("DuoNeural/Qwen2.5-Math-NeuralMath-7B")
+
+ prompt = """Solve the following math problem step by step.
+
+ Problem: Find all positive integers n such that n² + 1 is divisible by n + 1.
+
+ Solution:"""
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ # Low-temperature sampling keeps the reasoning close to deterministic
+ output = model.generate(**inputs, max_new_tokens=512, temperature=0.1, do_sample=True)
+ # The decoded text contains the prompt followed by the generated solution
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+
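Since the model card emphasizes `\boxed{}` answer formatting, a downstream script will usually want to pull the final answer out of the generated text. The following is a minimal sketch of one way to do that, not part of the released code: `extract_boxed` is a hypothetical helper name, and `sample` is a hand-written stand-in for model output. It tracks brace depth so that nested expressions such as `\boxed{\frac{1}{2}}` are returned whole.

```python
def extract_boxed(text: str) -> list[str]:
    """Return the contents of every \\boxed{...} in text, handling nested braces."""
    marker = "\\boxed{"
    answers = []
    start = text.find(marker)
    while start != -1:
        i = start + len(marker)
        content_start = i
        depth = 1
        # Walk forward until the brace opened by \boxed{ is closed again
        while i < len(text) and depth > 0:
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        if depth == 0:
            answers.append(text[content_start:i - 1])
        # An unclosed \boxed{ is skipped rather than returned partially
        start = text.find(marker, i)
    return answers


# Hypothetical model output, written by hand for illustration
sample = r"Since $n + 1$ divides $2$, we get $n = 1$. The answer is \boxed{1}."
final_answers = extract_boxed(sample)
# Take the last boxed expression as the final answer, if any was produced
print(final_answers[-1] if final_answers else None)
```

Taking the last match is a design choice: intermediate steps sometimes box partial results, and the final answer usually comes at the end of the solution.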
+ ## GGUF / Ollama / LM Studio
+
+ Pre-quantized GGUFs are available in the `gguf/` folder of this repo:
+
+ | File | Size | Use case |
+ |------|------|----------|
+ | `neuromath-7b-q4_k_m.gguf` | 4.7 GB | Recommended: best quality/speed tradeoff |
+ | `neuromath-7b-q8_0.gguf` | 8.1 GB | High quality, needs 10 GB+ VRAM/RAM |
+ | `neuromath-7b-f16.gguf` | 15 GB | Unquantized 16-bit weights, GPU only |
+
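As a rough sanity check on the file sizes in the table, size can be estimated as parameters times bits per weight. The sketch below makes two assumptions that are not stated in the model card: a parameter count of about 7.6 billion for the 7B architecture, and approximate bits-per-weight figures for each GGUF format (the Q4_K_M value in particular is an average, because that format mixes block types). Under those assumptions the estimates land close to the listed sizes.

```python
# Assumed parameter count for the 7B architecture (not stated in the model card)
ASSUMED_PARAMS = 7.6e9

# Approximate bits per weight for each format (assumptions for illustration).
# Q8_0 stores 32 int8 weights plus one fp16 scale per block, i.e. 8.5 bits/weight;
# the Q4_K_M figure is a rough average over its mixed block types.
BITS_PER_WEIGHT = {"q4_k_m": 4.85, "q8_0": 8.5, "f16": 16.0}


def estimated_size_gb(params: float, bits_per_weight: float) -> float:
    """Estimate file size in decimal gigabytes, ignoring metadata overhead."""
    return params * bits_per_weight / 8 / 1e9


for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{estimated_size_gb(ASSUMED_PARAMS, bpw):.1f} GB")
```

Runtime memory use will be higher than the file size, since the KV cache and activations come on top of the weights.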
+ ### Ollama
+
+ ```bash
+ # Create a Modelfile that points at the downloaded GGUF
+ cat > Modelfile << 'EOF'
+ FROM ./neuromath-7b-q4_k_m.gguf
+ SYSTEM "You are an expert mathematician. Solve problems step by step, showing all work clearly. Put your final answer in \\boxed{}."
+ PARAMETER temperature 0.1
+ PARAMETER num_ctx 4096
+ EOF
+
+ ollama create neuromath-7b -f Modelfile
+ ollama run neuromath-7b "What is the sum of all prime numbers less than 100?"
+ ```
+
+ ### LM Studio
+
+ Download `neuromath-7b-q4_k_m.gguf` and load it in LM Studio. Set the system prompt to:
+ > "You are an expert mathematician. Solve problems step by step, showing all work. Put your final answer in \\boxed{}."
+
+ ## Training Details
+
+ | Setting | Value |
+ |---------|-------|
+ | Base model | Qwen/Qwen2.5-Math-7B-Instruct |
+ | Method | QLoRA SFT (4-bit base, LoRA rank 16) |
+ | Training tokens | ~1.26M (3 epochs over curated math dataset) |
+ | LoRA alpha | 32 |
+ | LoRA targets | q, k, v, o, gate, up, down projections |
+ | Hardware | NVIDIA A100 80GB |
+ | Framework | Unsloth + HuggingFace Transformers |
+ | Sequence length | 1024 tokens |
+
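To give a sense of scale for the LoRA settings in the table, the sketch below counts the adapter parameters that rank 16 adds across the seven targeted projections. The rank and alpha come from the table. The layer dimensions (hidden size 3584, MLP size 18944, 28 layers, 4 key/value heads of dimension 128) and the `*_proj` module names are assumptions about the 7B architecture that the model card does not state. Under those assumptions the adapters hold roughly 40 million trainable parameters, a small fraction of the frozen base model.

```python
# Assumed 7B architecture dimensions (not stated in the model card)
HIDDEN = 3584          # hidden size
INTERMEDIATE = 18944   # MLP intermediate size
KV_DIM = 4 * 128       # 4 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 28

# Values taken from the training table
RANK = 16
ALPHA = 32

# (input features, output features) of each targeted projection in one layer;
# the module names are assumed for illustration
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) next to each frozen weight."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # the low-rank update is multiplied by alpha / rank

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```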
+ ## Limitations
+
+ - Trained on English math problems; performance in other languages is untested
+ - Training used a 1024-token sequence length, so very long multi-step proofs (>1024 tokens) may be truncated during generation
+ - This is the SFT-only checkpoint; a GRPO reinforcement learning phase is planned as a follow-up
+ - Not intended for general conversation; math reasoning only
+
+ ## DuoNeural
+
+ DuoNeural is an AI research lab focused on post-training techniques, efficient architectures, and edge deployment. We document our wins, losses, and learnings publicly.
+
+ - GitHub: [DuoNeural](https://github.com/DuoNeural)
+ - HuggingFace: [DuoNeural](https://huggingface.co/DuoNeural)