---
base_model:
- Qwen/Qwen3-0.6B
- prithivMLmods/rStar-Coder-Qwen3-0.6B
- suayptalha/Qwen3-0.6B-IF-Expert
datasets:
- microsoft/rStar-Coder
- patrickfleith/instruction-freak-reasoning
---
# Qwen3-0.6B-rStar-Coder-IF-Expert

A merged Qwen3-0.6B model created with mergekit, combining rStar-Coder's code-generation specialization with IF-Expert's instruction-following fidelity via SLERP interpolation.
## Merge Methodology
| Property | Value |
|---|---|
| Merge Technique | SLERP (Spherical Linear Interpolation) |
| Base Models | `prithivMLmods/rStar-Coder-Qwen3-0.6B` (coding specialization)<br>`suayptalha/Qwen3-0.6B-IF-Expert` (instruction fidelity) |
| Architecture | Qwen3 0.6B decoder-only transformer |
| Layer Strategy | Full 28-layer SLERP merge with parameter-specific weights |
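
For intuition: SLERP interpolates along the great-circle arc between two weight vectors rather than along the straight line, which better preserves parameter norms than plain linear averaging. The sketch below is a minimal NumPy illustration of the idea, not mergekit's actual implementation (which operates tensor-by-tensor and handles more edge cases):

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors.

    t=0 returns a, t=1 returns b; intermediate t moves along the arc
    between the two directions instead of the straight chord.
    """
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two weight directions
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * theta) * a + np.sin(t * theta) * b) / np.sin(theta)
```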
## Merge Configuration

```yaml
base_model: prithivMLmods/rStar-Coder-Qwen3-0.6B
dtype: float16
merge_method: slerp
parameters:
  t:
    - filter: embed_tokens
      value: 0.0  # 100% rStar-Coder embeddings
    - filter: self_attn
      value: 0.5  # 50/50 attention interpolation
    - filter: mlp
      value: 0.5  # 50/50 MLP interpolation
    - filter: lm_head
      value: 1.0  # 100% IF-Expert output head
    - value: 0.5  # default 50/50 for remaining params
slices:
  - sources:
      - layer_range: [0, 28]
        model: prithivMLmods/rStar-Coder-Qwen3-0.6B
      - layer_range: [0, 28]
        model: suayptalha/Qwen3-0.6B-IF-Expert
```
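
The `filter` entries assign different interpolation weights `t` by parameter name: the first matching filter wins, and the bare `value` entry supplies the default. A rough Python sketch of that selection logic (my own illustration of the intent, not mergekit's code):

```python
def resolve_t(param_name: str, rules: list[dict]) -> float:
    """Pick the SLERP weight t for a parameter: the first rule whose
    filter string appears in the parameter name wins; a rule without
    a filter supplies the default for everything unmatched."""
    default = 0.5
    for rule in rules:
        f = rule.get("filter")
        if f is None:
            default = rule["value"]
        elif f in param_name:
            return rule["value"]
    return default

# The t rules from the config above, expressed as dicts
rules = [
    {"filter": "embed_tokens", "value": 0.0},
    {"filter": "self_attn", "value": 0.5},
    {"filter": "mlp", "value": 0.5},
    {"filter": "lm_head", "value": 1.0},
    {"value": 0.5},
]
```

With `t = 0.0` a parameter stays entirely on the base (rStar-Coder) side and with `t = 1.0` it comes entirely from IF-Expert, so this config keeps rStar-Coder's embeddings, takes IF-Expert's output head, and blends everything in between. A config like this is typically applied with mergekit's `mergekit-yaml` CLI.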
## Capabilities
This merge strategically balances:
- ✅ rStar-Coder DNA: Production-grade Python/TS code generation with algorithmic precision
- ✅ IF-Expert DNA: Strict adherence to complex instructions and format constraints
- ✅ Emergent strength: code generation that follows instructions better than either base model alone (a design goal of the merge)
## Ideal Use Cases

```python
# Example: complex multi-constraint request
query = """Write a FastAPI endpoint that:
1. Accepts CSV uploads
2. Validates rows against a Pydantic schema
3. Returns paginated JSON
4. Includes OpenAPI docs
5. Uses async file handling"""
# The merge is aimed at satisfying all 5 constraints in a single generation
```
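
One way to sanity-check instruction compliance on outputs like this is a crude keyword screen over the generated code. The constraint names and regexes below are hypothetical heuristics of my own, not part of the model or any published evaluation:

```python
import re

# Hypothetical patterns: one rough marker per requested constraint
CONSTRAINT_PATTERNS = {
    "csv_upload": r"UploadFile|\.csv",
    "pydantic_schema": r"BaseModel",
    "pagination": r"page|offset|limit",
    "openapi_docs": r"summary=|description=|FastAPI\(",
    "async_io": r"async def|await ",
}

def unmet_constraints(code: str) -> list[str]:
    """Return the names of constraints with no matching marker in the code."""
    return [name for name, pattern in CONSTRAINT_PATTERNS.items()
            if not re.search(pattern, code)]
```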
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gss1147/Qwen3-0.6B-rStar-Coder-IF-Expert"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Write a thread-safe connection pool in Python"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
code = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(code)
```
## Important Notes

⚠️ **Local Merge Origin**

This model was merged locally from private checkpoints (`X:/AI_Models/...`). The weights are redistributed here under Apache 2.0 per the base model licenses. No additional training data was used; this is purely a parameter-space merge.
## License

Apache 2.0 (inherited from the base Qwen3 models). The merge process adds no additional restrictions.
## Citation

```bibtex
@software{gss1147_qwen3_merge_2026,
  author = {gss1147},
  title  = {Qwen3-0.6B-rStar-Coder-IF-Expert},
  year   = {2026},
  note   = {SLERP merge of rStar-Coder and IF-Expert variants via mergekit},
  url    = {https://huggingface.co/gss1147/Qwen3-0.6B-rStar-Coder-IF-Expert}
}
```