Wind-Edge-1.6-Instruct
Wind-Edge-1.6-Instruct is a compact custom Qwen3-compatible assistant model for local and edge inference. It was built from a depth-pruned Wind-Edge base and tuned with a Claude-heavy public distillation SFT mix, code/math instruction data, and a final behavior polish pass.
This is a small model. It is intended for short answers, simple coding help, summaries, and lightweight local assistant use. It is not a replacement for large reasoning models.
Recommended Usage
Use `trust_remote_code=True`; the custom loader re-applies tied weights from `model.safetensors`.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "North-ML1/Wind-Edge-1.6-Instruct"

tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Who are you?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    repetition_penalty=1.06,
    eos_token_id=[
        tokenizer.eos_token_id,
        tokenizer.convert_tokens_to_ids("<|im_end|>"),
    ],
)
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
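The tied-weight re-application mentioned above is the standard sharing of one tensor between the input embedding and the output head. A minimal sketch with plain `torch.nn` modules (illustrative only; the module names and sizes here are assumptions, not the actual loader code):

```python
import torch
import torch.nn as nn

vocab, dim = 128, 16  # toy sizes for illustration
embed = nn.Embedding(vocab, dim)
lm_head = nn.Linear(dim, vocab, bias=False)

# Tie: the output head reuses the embedding matrix as its weight.
lm_head.weight = embed.weight

# Any in-place update to one is reflected in the other.
with torch.no_grad():
    embed.weight[0].fill_(1.0)
```

After tying, `lm_head.weight is embed.weight` holds, so only one copy of the matrix is stored and serialized; a loader that drops the duplicate from `model.safetensors` must re-establish this link at load time.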
Suggested Settings
For chat:
- `enable_thinking=False`
- `temperature=0.55-0.7`
- `top_p=0.85-0.92`
- `repetition_penalty=1.05-1.08`
- `max_new_tokens=128-512`
For deterministic tests:
- `do_sample=False`
- `repetition_penalty=1.06`
- Keep prompts short and direct.
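For scripted checks it can help to centralize these two modes in one place. A small helper (hypothetical, not part of the repo) that builds the `generate()` keyword arguments:

```python
def generation_kwargs(deterministic: bool, max_new_tokens: int = 256) -> dict:
    """Build generate() kwargs for chat sampling or deterministic tests."""
    kwargs = {
        "max_new_tokens": max_new_tokens,
        "repetition_penalty": 1.06,
    }
    if deterministic:
        kwargs["do_sample"] = False  # greedy decoding for reproducible tests
    else:
        kwargs.update(do_sample=True, temperature=0.6, top_p=0.9)
    return kwargs

# Usage: model.generate(**inputs, **generation_kwargs(deterministic=True))
```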
The bundled chat template injects a minimal default identity system message when no system message is supplied:

```
You are Wind-Edge-1.6, a compact AI assistant model. You are not a human.
```
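The injection behaves roughly like the following Python sketch of the template logic (this is an approximation for clarity, not the actual Jinja template shipped with the tokenizer):

```python
DEFAULT_SYSTEM = (
    "You are Wind-Edge-1.6, a compact AI assistant model. "
    "You are not a human."
)

def with_default_identity(messages: list) -> list:
    """Prepend the default system message when the caller supplies none."""
    if messages and messages[0].get("role") == "system":
        return messages  # caller's system message takes precedence
    return [{"role": "system", "content": DEFAULT_SYSTEM}] + messages
```

Supplying your own system message as the first entry of `messages` overrides the default identity.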
Training Summary
- Source family: Qwen3-compatible Wind-Edge architecture
- Base: depth-pruned and healed Wind-Edge base from Qwen3-0.6B-compatible weights
- Final SFT:
- 12M tokens of no-thinking distillation SFT
- Claude-style public distillation data plus OpenOrca, OpenHermes, Open-Platypus, OpenCoder, and OpenMathInstruct
- Teacher rows with incorrect self-identity filtered out
- 6M-token system-template adaptation pass
- 2M-token local quality polish for identity, simple arithmetic, list sorting, and concise coding behavior
Quick Sanity Outputs
Expected behavior after the final polish:
- `hi` -> short greeting as Wind-Edge-1.6
- `Who are you?` -> identifies as Wind-Edge-1.6, not human
- `sort this list: [3, 1, 2]` -> `[1, 2, 3]`
- `60 miles in 1.5 hours` -> `40 mph`
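These checks can be scripted against any chat function. A minimal harness sketch (the `chat` callable is a placeholder for a wrapper around `model.generate`; the substring checks are assumptions about acceptable answers):

```python
SANITY_CASES = [
    ("Who are you?", "Wind-Edge-1.6"),
    ("sort this list: [3, 1, 2]", "[1, 2, 3]"),
    ("60 miles in 1.5 hours", "40"),  # 60 / 1.5 = 40 mph
]

def run_sanity(chat) -> list:
    """Return the prompts whose reply lacks the expected substring."""
    return [prompt for prompt, want in SANITY_CASES if want not in chat(prompt)]
```

An empty return value means all sanity cases passed; the deterministic settings above make the results reproducible.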
Limitations
Wind-Edge-1.6-Instruct is small and can still make arithmetic, factual, and reasoning mistakes. It may overgeneralize from prompts; keep instructions concise and verify its output before relying on it for important work.
Citation
See wind_edge_1_6_paper.html in this repository for a short technical write-up of the build and tuning process.
Base model: North-ML1/Wind-Edge-1.6-Base