News 🔥

2026/01/15: 🤗 Released kanana-2-30b-a3b-2601 HF model weights.
2026/01/15: 📕 Published blog posts (pre-training, post-training) about the development of Kanana-2 models.
2025/12/19: 🤗 Released kanana-2-30b-a3b HF model weights and publised a teaser blog.

Kanana-2 Highlights

Kanana-2, the latest open-source evolution of the Kanana model family, is designed specifically for Agentic AI, presenting substantial enhancements in tool calling, complex instruction following, and logical reasoning. This new version adopts a cutting-edge architecture featuring MLA (Multi-head Latent Attention) and MoE (Mixture of Experts). These innovations allow the model to utilize significantly fewer active parameters compared to the previous 32.5B model while delivering superior performance and ensuring high throughput. Furthermore, the model natively supports context lengths of up to 32,768 tokens, enabling it to maintain coherence when handling extensive documents or long-context interactions.

In addition, Kanana-2 now supports 6 languages, covering Korean, English, Japanese, Chinese, Thai, and Vietnamese. To support this expansion, Kanana-2 utilizes a newly trained tokenizer that demonstrates superior tokenization efficiency across these languages, including an improvement of over 30% specifically for Korean. Finally, to address advanced problem-solving needs, Kanana-2 introduces reasoning models capable of deliberate thinking and reasoning, achieving significantly enhanced performance in downstream tasks, especially when tackling hard problems.

No Kakao user data was used for either pre-training or post-training.

Model Overview

kanana-2-30b-a3b series has the following features:

Total Parameters: 30B
Activated Parameters: 3B
Number of Layers: 48
Number of Dense Layers: 1
Number of Experts: 128
Number of Selected Experts: 6
Number of Shared Experts: 2
Attention Mechanism: MLA
Vocabulary Size: 128256
Context Length: 32,768

Model Downloads

Model	Download
kanana-2-30b-a3b-base-2601^*	🤗 HuggingFace
kanana-2-30b-a3b-mid-2601^*	🤗 HuggingFace
kanana-2-30b-a3b-instruct-2601	🤗 HuggingFace
kanana-2-30b-a3b-thinking-2601	🤗 HuggingFace

_{^* We are releasing the kanana-2-30b-a3b-base-2601 (prior to mid-training) checkpoint to contribute to the research community.

Note: kanana-2-30b-a3b-mid-2601 is identical to kanana-2-30b-a3b-base.}

Performance

Base model evaluation results

Benchmark	Metric	Shot	kanana-2-30b-a3b-mid-2601	kanana-2-30b-a3b-base-2601	kanana-1.5-32.5b-base	Qwen3-30B-A3B-Base^*
General Tasks
MMLU	acc	5	75.44	74.83	76.76	81.14
MMLU-Pro	acc	5	56.14	52.61	52.40	61.83
BBH	acc	3	79.76	76.46	81.54	79.97
SimpleQA^†	acc	5	29.70	29.13	26.95	26.47
Mathematics Tasks
MATH	em	4	54.40	48.86	47.68	62.58
GSM8K	em	8	82.71	76.57	85.14	88.10
Coding Tasks
HumanEval	pass@1	0	75.29	71.34	75.59	53.32
MBPP	pass@1	3	62.39	60.21	65.96	72.58
Korean Tasks
KMMLU	acc	5	62.15	61.98	61.56	62.25
KoSimpleQA^†	acc	5	49.70	49.40	45.70	26.33
HAE-RAE Bench (v1.0)	acc	5	88.73	88.91	90.65	72.04
MATH-Ko^‡	em	4	54.07	45.58	47.42	58.20
GSM8K-Ko^‡	em	8	77.48	70.43	81.43	88.10
MBPP-Ko^§	pass@1	3	61.55	57.29	65.41	66.84
Long Context Tasks
RULER-4K	acc	0	93.09	92.49	86.39	94.32
RULER-8K	acc	0	92.29	92.14	90.16	92.16
RULER-16K	acc	0	90.73	90.01	85.88	91.28
RULER-32K	acc	0	88.63	87.92	81.62	88.32

_{^* Evaluated using an internal evaluation toolkit.

^† Evaluated in Multiple Choice Question Answering (MCQA) format with 10 options.

^‡ Subsets from HRM8K (MATH, GSM8K).

^§ Internally translated to Korean.}

Instruct model evaluation results

Benchmark	Metric	kanana-2-30b-a3b-instruct-2601	kanana-2-30b-a3b-instruct	kanana-1.5-32.5b-instruct	Qwen3-30B-A3B-Instruct-2507^*	Qwen3-30B-A3B (non-thinking)^*
Chat
MT-Bench	judge^†	8.30	8.42	8.23	8.71	8.38
KoMT-Bench	judge^†	8.21	8.24	7.94	8.49	7.89
Instruction Following
IFEval	prompt strict	87.25	84.47	79.48	82.62	84.10
IFBench	prompt strict	48.30	41.84	38.78	30.27	29.25
Multi-IF (EN)	acc	77.88	75.81	68.51	77.93	81.03
Multi-Challenge	acc	35.16	34.80	19.05	41.76	27.84
Tool Calling
BFCL-v3 (Live^‡)	pass@1	76.66	74.30	68.74	73.93	69.14
BFCL-v3 (Multi-Turn^‡)	pass@1	38.63	35.38	11.38	38.77	11.88
Code Generation
HumanEval+	pass@1	81.10	79.88	79.88	86.59	87.20
MBPP+	pass@1	73.02	73.81	71.96	75.13	75.13
Mathematics
GSM8K	em	93.10	91.89	91.58	93.56	93.33
MATH	acc	88.56	86.26	77.92	90.96	87.20
Reasoning & Knowledge
MMLU	em	81.61	80.80	82.75	87.13	85.60
KMMLU	em	68.26	67.32	65.75	67.56	63.49
GPQA Diamond	pass@1	52.53	42.93	42.42	54.55	50.51
HAERAE-Bench (v1.0)	em	75.57	75.57	65.34	53.41	57.39

_{^* Evaluated using an internal evaluation toolkit.

^† Evaluated using gpt-4o-2024-08-06 as the judge model.

^‡ Live denotes the average score of 6 live benchmarks, and Multi-Turn denotes the average score of 4 multi-turn benchmarks.}

Reasoning model evaluation results

Benchmark	Metric	kanana-2-30b-a3b-thinking-2601	kanana-2-30b-a3b-thinking	Qwen3-30B-A3B-Thinking-2507^*	Qwen3-30B-A3B (thinking)^*
Reasoning & Knowledge
MMLU-Pro	pass@1	74.2	75.3	80.8	78.5
GPQA Diamond	pass@1	57.8	61.3	70.6	62.6
Competition Math
AIME 2025	pass@1	74.0	72.7	82.3	70.7
AIME 2024	pass@1	79.0	78.3	91.0	82.7
AIME 2024-Ko^†	pass@1	75.0	25.3	80.3	72.3
Code Generation
LiveCodeBench	pass@1	58.8	60.8	68.3	62.3
LiveCodeBench-Ko^‡	pass@1	51.2	9.4	66.3^¶	61.5^¶
Instruction Following
IFEval	prompt strict	82.2	82.2	87.8	86.1
IFBench	prompt strict	47.8	42.3	47.6	36.7
Tool Calling
BFCL-v3 (Live^§)	pass@1	75.9	75.6	82.9	80.3
BFCL-v3 (Multi-Turn^§)	pass@1	43.7	34.3	53.6	35.6

_{^* Evaluated using an internal evaluation toolkit.

^† Korean translation of AIME 2024 sourced from MCLM.

^‡ Internally translated to Korean.

^§ Live denotes the average score of 6 live benchmarks, and Multi-Turn denotes the average score of 4 multi-turn benchmarks.

^¶ Most responses were generated in English.}

Deployment

For optimal results with the reasoning model, please adhere to the default parameters: temperature=0.6, top_p=0.95, top_k=20. We strongly advise against greedy decoding, as it may lead to performance degradation and infinite repetition loops.

vLLM

vLLM is a fast and memory-optimized engine designed for high-performance LLM inference and serving.

For kanana-2-30b-a3b-instruct-2601,

vllm serve kakaocorp/kanana-2-30b-a3b-instruct-2601 --enable-auto-tool-choice --tool-call-parser hermes

For kanana-2-30b-a3b-thinking-2601,

vllm serve kakaocorp/kanana-2-30b-a3b-thinking-2601 --reasoning-parser deepseek_r1 --enable-auto-tool-choice --tool-call-parser hermes

SGLang

SGLang is a high-efficiency framework for serving LLMs and VLMs, enabling easy deployment of OpenAI-compatible API servers.

For kanana-2-30b-a3b-instruct-2601,

python3 -m sglang.launch_server --model-path kakaocorp/kanana-2-30b-a3b-instruct-2601 --tool-call-parser qwen

For kanana-2-30b-a3b-thinking-2601,

python3 -m sglang.launch_server --model-path kakaocorp/kanana-2-30b-a3b-thinking-2601 --reasoning-parser deepseek-r1 --tool-call-parser qwen

Processing 32K+ Length

Currently, the config.json uploaded to HuggingFace is configured for token lengths of 32,768 or less. To process tokens beyond this length, YaRN must be applied. By updating the config.json with the following parameters, you can apply YaRN to handle token sequences up to 128K in length:

"rope_scaling": {
    "beta_fast": 32,
    "beta_slow": 1,
    "factor": 4.0,
    "mscale": 1.0,
    "mscale_all_dim": 1.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
},

Passing command line arguments for deployment:

vllm

vllm serve ... --hf-overrides '{"max_position_embeddings": 131072, "rope_scaling": {"rope_type":"deepseek_yarn","factor":4.0,"beta_fast":32,"beta_slow":1,"mscale":1.0,"mscale_all_dim":1.0,"original_max_position_embeddings":32768}}'

sglang

python3 -m sglang.launch_server ... --json-model-override-args '{"max_position_embeddings":131072, "rope_scaling":{"rope_type":"deepseek_yarn","factor":4.0,"beta_fast":32,"beta_slow":1,"mscale":1.0,"mscale_all_dim":1.0,"original_max_position_embeddings":32768}}'

Most leading open-source implementations of static YaRN apply a constant scaling factor, which can negatively impact performance on shorter texts. To ensure optimal performance:

Enable rope_scaling only when necessary for processing long contexts.

Adjust the factor based on your specific needs (e.g., set factor to 2.0 for a 65,536-token context)."

License

The model weights are released under the Kanana License.

Citation

@article{,
  title={Kanana-2 LLM},
  author={Kanana LLM},
  year={2025},
  url={https://huggingface.co/collections/kakaocorp/kanana-2}
}

Contact

Kanana LLM Team Technical Support: kanana-llm@kakaocorp.com
Business & Partnership Contact: alpha.k@kakaocorp.com

Downloads last month: 92

Safetensors

Model size

31B params

Tensor type

BF16

Model tree for kakaocorp/kanana-2-30b-a3b-mid-2601

Base model

kakaocorp/kanana-2-30b-a3b-base-2601

Finetuned

(1)

this model

Finetunes

2 models

Collection including kakaocorp/kanana-2-30b-a3b-mid-2601

Kanana-2

Collection

Open Source Kanana-2 • 29 items • Updated 12 days ago • 36