DARWIN-Family Collection • 35 items
The lightweight 2B model of the Darwin V8 series: a Qwen3.5-2B-based model with the reasoning style of Claude Opus 4.5/4.6 and Sonnet 4.6 distilled in.
Model lineage: Qwen/Qwen3.5-2B (base) → FINAL-Bench/Darwin-2B-Opus-LoRA (adapter) → FINAL-Bench/Darwin-2B-Opus (merged full-weight standalone).

| Item | Value |
|---|---|
| Model size | 2.3B parameters |
| Architecture | Qwen3.5 (hybrid attention) |
| Training method | SFT with LoRA (all-linear, rank=16) |
| Training data | 9,762 samples (Claude Opus/Sonnet + Korean reasoning) |
| Training time | 29 min (8×B200 GPUs) |
| Final loss | 0.837 |
| Token accuracy | 76.6% |
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "FINAL-Bench/Darwin-2B-Opus"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [
    {"role": "user", "content": "South Korea's 2024 minimum wage is 9,860 KRW. What is the monthly pay for 40 hours/week × 4 weeks?"}
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=800,
        do_sample=False,
        pad_token_id=tok.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt
print(tok.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
[Qwen/Qwen3.5-2B] ──── Base model (frozen)
        +
[9,762 Claude Opus/Sonnet + Korean reasoning samples]
        ↓
[SFT Training]
  - LoRA (all-linear, r=16, α=32)
  - Learning rate: 2e-4 (V8 rule: ×10 the full fine-tuning LR)
  - 2 epochs, bf16, 8×B200 DDP
  - Loss: 0.991 → 0.837 (-15%)
  - Token accuracy: 73.9% → 76.6% (+2.7%p)
        ↓
[LoRA merge into base weights]
        ↓
[Darwin-2B-Opus] ← this model
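The recipe above can be sketched as a Hugging Face `peft`/`trl` configuration. This is a minimal illustration, not the published training script; only the hyperparameters (r=16, α=32, all-linear, LR 2e-4, 2 epochs, bf16) come from this card, and `output_dir` is a placeholder.

```python
# Sketch of the LoRA SFT setup described above (hyperparameters from the card;
# the actual training script is not published).
from peft import LoraConfig
from trl import SFTConfig

lora_config = LoraConfig(
    r=16,                        # rank=16
    lora_alpha=32,               # α=32
    target_modules="all-linear", # adapt every linear layer
    task_type="CAUSAL_LM",
)

train_config = SFTConfig(
    learning_rate=2e-4,          # V8 rule: ×10 the full fine-tuning LR
    num_train_epochs=2,
    bf16=True,
    output_dir="darwin-2b-opus-lora",  # placeholder path
)
```

After training, the adapter is merged back into the base weights to produce the standalone checkpoint.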
| Category | Samples | % | Source |
|---|---|---|---|
| General Reasoning | 4,422 | 45% | Opus 4.5/4.6, Sonnet 4.6 |
| Math (English) | 1,960 | 20% | DeepSeek-v3.2 OpenR1-Math |
| Code (English) | 1,680 | 17% | DeepSeek-v3.2 CodeReasoning + GPT-5 Codex |
| Korean Thinking | 200 | 2% | Multilingual-Thinking-Korean |
| Korean Math | 1,500 | 15% | orca-math-word-problems-korean |
| Total (after filtering) | 9,762 | 100% | - |
The all-linear target modules, LR ×10, and rank=16 proved sufficient.

| Type | Result | Notes |
|---|---|---|
| English math (train speed) | ✅ 80 km/h | Step-by-step LaTeX solution |
| English logic (height comparison) | ✅ Carol | Reasoning stated explicitly |
| English code (prime check) | ✅ Correct | Docstring + complexity analysis |
| Korean wage calculation | ✅ 1,577,600 KRW | Step-by-step explanation in Korean |
| Korean system of equations | ✅ 1,200 KRW | Equation setup + verification |
5/5 correct: perfect in both English and Korean ⭐
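As a quick sanity check, the Korean wage item in the table above works out in plain Python:

```python
# Verify the minimum-wage calculation from the evaluation table:
# 9,860 KRW/hour × 40 hours/week × 4 weeks.
hourly_wage = 9_860
monthly_pay = hourly_wage * 40 * 4
print(f"{monthly_pay:,} KRW")  # prints "1,577,600 KRW"
```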
Trained with max_length=4,096.

Related models:

- FINAL-Bench/Darwin-2B-Opus-LoRA: LoRA adapter-only version of this model (67 MB)
- FINAL-Bench/Darwin-2B-Opus-ONNX: ONNX quantized version for browser/WebGPU (planned)
- Darwin-31B-Opus: GPQA 85.9%
- Darwin-27B-Opus: GPQA 86.9%
- Darwin-9B-Opus
- Darwin-4B-Opus

Darwin V8 · Part of the evolutionary model series by FINAL-Bench
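If you prefer the lightweight adapter-only variant, it can presumably be attached to the frozen base with `peft` and merged locally; an untested sketch (repo IDs taken from the list above, requires downloading both checkpoints):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base, then attach the 67 MB LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-2B", torch_dtype=torch.bfloat16,
    device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "FINAL-Bench/Darwin-2B-Opus-LoRA")

# merge_and_unload() folds the adapter into the base weights, which should
# reproduce the merged Darwin-2B-Opus checkpoint.
model = model.merge_and_unload()
```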