Matrix 2

Model Description

Matrix 2 is a fine-tuned version of DeepSeek-R1-Distill-Qwen-7B, trained on a focused mixture of chain-of-thought reasoning, math, coding, and logic data. It is the flagship reasoning model of the Inelly lineup, built for accurate, step-by-step problem solving.

Developed by: Bry (GenueAI)
Base model: DeepSeek-R1-Distill-Qwen-7B
Fine-tuning method: QLoRA (4-bit NF4, rank 16)
Parameters: 7.62B (base) + ~6.5M trainable (LoRA adapters)
License: MIT (inherited from DeepSeek-R1)

Intended Use

Matrix 2 is intended for:

Deep Chain-of-Thought reasoning – Multi-step problem solving with clear logic
Mathematics – Algebra, arithmetic, word problems, multi-step calculations
Code generation – Python functions with proper logic and comments
Logical deduction – Syllogisms, puzzles, transitive reasoning
Scientific explanations – Physics, biology, general science
Complex instruction following – Multi-part tasks requiring structured thinking

Out of Scope

Production deployment without further safety evaluation
Safety-critical use: safety alignment is inherited from the DeepSeek-R1 base, and the fine-tuning data did not include adversarial safety examples
Memory-constrained environments: the footprint (~5.2GB) is larger than that of the 1.5B/3B variants

Training Data

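Matrix 2 was fine-tuned for 1 epoch on ~5,225 samples. The per-source draws listed below total 7,500, so the ~5,225 figure presumably reflects the deduplication and reweighting described after the table: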

Dataset	Samples	Purpose
Bespoke-Stratos-35k	3,000	Chain-of-thought math & reasoning
OpenThoughts-114k	2,500	Code generation with reasoning
dolphin-r1	2,000	General reasoning (DeepSeek-R1 distill)

All samples were deduplicated and reasoning-weighted (2x oversample for CoT examples). Maximum sequence length: 512 tokens.
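The deduplication and 2x reasoning weighting described above could look roughly like the sketch below. It is an illustration only, not the pipeline actually used: the field names `prompt`, `response`, and `is_cot`, the whitespace/case normalization, and the exact-match hashing are all assumptions.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def dedupe(samples: list[dict]) -> list[dict]:
    """Keep the first occurrence of each normalized prompt/response pair."""
    seen: set[str] = set()
    unique = []
    for sample in samples:
        joined = normalize(sample["prompt"] + "\n" + sample["response"])
        key = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(sample)
    return unique


def oversample_cot(samples: list[dict], factor: int = 2) -> list[dict]:
    """Repeat chain-of-thought samples `factor` times; keep the rest once."""
    weighted = []
    for sample in samples:
        copies = factor if sample.get("is_cot") else 1
        weighted.extend([sample] * copies)
    return weighted


# Hypothetical toy data: the second entry differs from the first only in spacing.
raw = [
    {"prompt": "2 + 2?", "response": "Step 1: add. Answer: 4", "is_cot": True},
    {"prompt": "2 + 2?", "response": "Step 1: add.  Answer: 4", "is_cot": True},
    {"prompt": "Capital of France?", "response": "Paris", "is_cot": False},
]
unique = dedupe(raw)
mixture = oversample_cot(unique)
print(len(raw), len(unique), len(mixture))
```

Hashing the normalized text only catches exact duplicates; near-duplicate detection would need a different technique.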

Training Hyperparameters

Parameter	Value
Base model	DeepSeek-R1-Distill-Qwen-7B
Quantization	4-bit NF4 (bitsandbytes)
LoRA rank	16
LoRA alpha	32
LoRA dropout	0.05
Learning rate	2e-4
Batch size	8 (gradient accumulation)
Epochs	1
Max seq length	512
Optimizer	AdamW 8-bit
LR scheduler	cosine
Warmup ratio	0.05
Training time	~74 min
Hardware	RTX 3090 (24GB VRAM)
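To make the learning-rate settings concrete, the sketch below computes a linear-warmup plus cosine-decay schedule from the values in the table. It assumes the batch size of 8 is the effective batch size, that the last partial batch is kept, and that warmup steps are rounded up; none of these details are stated in this card, so the step counts are estimates.

```python
import math

BASE_LR = 2e-4          # from the table
WARMUP_RATIO = 0.05     # from the table
NUM_SAMPLES = 5225      # approximate sample count from Training Data
BATCH_SIZE = 8          # assumed to be the effective batch size

# Assumed: one optimizer step per batch, final partial batch included.
TOTAL_STEPS = math.ceil(NUM_SAMPLES / BATCH_SIZE)
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (linear warmup, cosine decay)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"total steps: {TOTAL_STEPS}, warmup steps: {WARMUP_STEPS}")
for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the run has 654 optimizer steps, of which the first 33 are warmup.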

Model Architecture

Property	Value
Model type	Qwen2ForCausalLM
Hidden size	3,584
Layers	28
Attention heads	28
Head dim	128
Intermediate size	18,944
Vocab size	152,064
Context length	131,072
Total parameters	~7.62B
Trainable parameters	~6.5M (LoRA)
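The parameter totals can be sanity-checked from the dimensions in the table. The sketch below makes three assumptions that the table does not state but that are typical of the Qwen2.5-7B family: 4 key/value heads (grouped-query attention), untied input and output embeddings, and bias terms on the Q/K/V projections. The LoRA target modules are also not given in this card, so the adapter count is shown for a few hypothetical target sets rather than for the configuration actually used.

```python
HIDDEN = 3584
LAYERS = 28
HEADS = 28
HEAD_DIM = 128
INTERMEDIATE = 18944
VOCAB = 152064

# Assumptions (not listed in the table above):
KV_HEADS = 4             # grouped-query attention
TIED_EMBEDDINGS = False  # separate input embedding and output head
QKV_BIAS = True          # bias terms on q_proj, k_proj, v_proj

Q_OUT = HEADS * HEAD_DIM      # 3584
KV_OUT = KV_HEADS * HEAD_DIM  # 512

# (d_in, d_out) of every linear projection in one decoder layer
PROJ = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def base_params() -> int:
    """Total parameter count of the base model under the assumptions above."""
    per_layer = sum(d_in * d_out for d_in, d_out in PROJ.values())
    if QKV_BIAS:
        per_layer += Q_OUT + 2 * KV_OUT
    per_layer += 2 * HIDDEN  # two RMSNorm weight vectors per layer
    embeddings = VOCAB * HIDDEN * (1 if TIED_EMBEDDINGS else 2)
    return LAYERS * per_layer + embeddings + HIDDEN  # + final norm


def lora_params(targets: list[str], rank: int = 16) -> int:
    """LoRA adds rank * (d_in + d_out) parameters per adapted projection."""
    return LAYERS * sum(rank * (PROJ[t][0] + PROJ[t][1]) for t in targets)


print(f"base: {base_params():,}")  # 7,615,616,512, i.e. ~7.62B
for targets in (["q_proj", "v_proj"], ["q_proj", "k_proj", "v_proj", "o_proj"]):
    print(targets, f"{lora_params(targets):,}")
```

At rank 16 the adapter size depends on which projections are adapted: about 5.0M parameters for `q_proj` and `v_proj`, about 10.1M for all four attention projections.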

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights in FP16 and place them on the available device(s)
model = AutoModelForCausalLM.from_pretrained("path/to/matrix-2", torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("path/to/matrix-2")

# Format the conversation with the model's chat template
messages = [{"role": "user", "content": "Solve for x: 3x + 7 = 22. Show all steps."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is needed for temperature and top_p to take effect
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
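DeepSeek-R1 distilled models usually emit their reasoning before a closing `</think>` tag. If Matrix 2 keeps that behavior, and if the tag survives decoding, a small helper can separate the reasoning from the final answer. Both conditions are assumptions, and `split_reasoning` is a hypothetical helper, not part of any library.

```python
def split_reasoning(response: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split a decoded response into (reasoning, answer).

    Assumes the reasoning ends with `close_tag`. If the tag is absent,
    the whole response is treated as the answer.
    """
    head, sep, tail = response.rpartition(close_tag)
    if not sep:
        return "", response.strip()
    # The opening tag may or may not be present, depending on the chat template.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Hypothetical decoded output, for illustration only
sample = "<think>\n3x = 22 - 7 = 15, so x = 5\n</think>\nx = 5"
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

With `max_new_tokens=256`, a long chain of thought may be cut off before the closing tag. In that case the helper returns the truncated text as the answer.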

Performance

Informal GPU testing, summarized for six categories:

Category	Result
Chain-of-Thought reasoning	✅ Excellent multi-step logic
Math	✅ Accurate with detailed work shown
Code generation	✅ Clean, well-commented Python
Logic puzzles	✅ Thorough deductive reasoning
General knowledge	✅ Accurate, detailed explanations
Complex reasoning	✅ Handles multi-step word problems well

Inelly / GenueAI Model Family

Model	Size	Focus
Matrix 2 (this model)	7B	Deep CoT reasoning, math, coding
Inelly 4.5	3B	Conversation + politeness + CoT
Inelly 4.5 Blaze	1.5B	Fast reasoning + CoT

Limitations

Safety: Inherited from DeepSeek-R1 base; not specifically safety-tuned. May occasionally follow harmful instructions.
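Memory: Requires ~5.2GB VRAM for inference. This figure is in line with 4-bit quantized loading; in FP16 the weights alone occupy roughly 15GB (7.62B parameters × 2 bytes)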
Context length: Fine-tuned on sequences of at most 512 tokens. The base model supports 128K-token contexts, but the fine-tuning data did not cover long contexts
Factual accuracy: May hallucinate in specialized domains (law, medicine, finance)
Speed: Slower than 1.5B/3B variants due to size
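The memory figure can be put in context with a back-of-the-envelope estimate of weight storage at different precisions. This is a rough sketch under the assumption that every parameter is stored at the same width. It ignores the KV cache, activations, quantization metadata, and any layers left unquantized, so real usage will be higher.

```python
TOTAL_PARAMS = 7.62e9  # total parameters, from Model Architecture

# Bytes of storage per parameter at each precision
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "4bit": 0.5}


def weight_gb(precision: str, params: float = TOTAL_PARAMS) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision:>5}: {weight_gb(precision):6.2f} GB")
```

By this estimate, FP16 weights need about 15.2GB and uniformly 4-bit weights about 3.8GB.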

Acknowledgments

DeepSeek-R1 by DeepSeek AI (base model)
Bespoke Labs for Stratos dataset
OpenThoughts team
Cognitive Computations for dolphin-r1

Citation

@misc{matrix2,
  title = {Matrix 2: A 7B Chain-of-Thought Reasoning Model},
  author = {Bry},
  organization = {GenueAI},
  year = {2026},
  note = {Fine-tuned from DeepSeek-R1-Distill-Qwen-7B using QLoRA},
}