---
language:
- zh
- en
pipeline_tag: text-generation
---
## 1. Model Introduction
JoyAI-LLM Flash-Base is a state-of-the-art mixture-of-experts (MoE) language model with 3 billion activated parameters and 48 billion total parameters. Trained with the Muon optimizer, JoyAI-LLM Flash-Base achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities. The JoyAI-LLM Flash series aims to accelerate high-throughput, latency-sensitive applications where cost per query must remain minimal.
### Key Features
- Training-Inference Collaboration: applies the Muon optimizer with dense multi-token prediction (MTP) and introduces novel optimization techniques to resolve instabilities while scaling up, delivering 1.3× to 1.7× the throughput of the non-MTP version (see the illustrative Muon sketch after this list).
- Agentic Intelligence: specifically designed for tool use, reasoning, and autonomous problem-solving.
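
For readers unfamiliar with Muon: it replaces the elementwise Adam-style update for 2-D weight matrices with an orthogonalized momentum step. The exact training recipe for JoyAI-LLM Flash-Base is not published here; the snippet below is a minimal, illustrative sketch of a generic Muon update, where the function names, hyperparameters, and Newton-Schulz coefficients follow common public implementations rather than this model's internal code.

```python
import torch

def newton_schulz(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Approximately orthogonalize a 2-D momentum matrix with a quintic
    # Newton-Schulz iteration (coefficients from public Muon implementations).
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + 1e-7)          # scale so the spectral norm is <= 1
    transposed = X.size(0) > X.size(1)
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

def muon_step(param, grad, momentum_buf, lr=0.02, beta=0.95):
    # One illustrative Muon update for a 2-D weight matrix:
    # momentum accumulation followed by orthogonalization of the update.
    momentum_buf.mul_(beta).add_(grad)
    update = newton_schulz(momentum_buf)
    param.add_(update, alpha=-lr)
    return param, momentum_buf
```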
## 2. Model Summary
| | |
| :-----------------------------------------: | :----------------------: |
| **Architecture** | Mixture-of-Experts (MoE) |
| **Total Parameters** | 48B |
| **Activated Parameters** | 3B |
| **Number of Layers** (Dense layer included) | 40 |
| **Number of Dense Layers** | 1 |
| **Attention Hidden Dimension** | 2048 |
| **MoE Hidden Dimension** (per Expert) | 768 |
| **Number of Attention Heads** | 32 |
| **Number of Experts** | 256 |
| **Selected Experts per Token** | 8 |
| **Number of Shared Experts** | 1 |
| **Vocabulary Size** | 129K |
| **Context Length** | 128K |
| **Attention Mechanism** | MLA |
| **Activation Function** | SwiGLU |
| | |
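
A base model like this is typically loaded through the Hugging Face transformers API for further fine-tuning or raw-text generation. The snippet below is a hedged sketch: the repository id is a placeholder, and `trust_remote_code=True` is assumed because the MoE/MLA architecture may ship custom modeling code.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id -- replace with the actual hub path for this model.
model_id = "JoyAI/JoyAI-LLM-Flash-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # load in the checkpoint's native precision
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # custom MoE/MLA modeling code, if provided
)

prompt = "Write a short Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```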
## 3. Evaluation Results
| Benchmark | JoyAI-LLM Flash-Base | Qwen3-30B-A3B-Base |
| :------------ | :------------------: | :----------------: |
| MMLU | 84.70 | 82.12 |
| MMLU-Pro | 73.14 | 61.76 |
| CMMLU | 83.09 | 83.60 |
| HumanEval | 85.37 | 87.80 |
| LiveCodeBench | 39.91 | 37.34 |
| GSM8K | 88.78 | 90.37 |
| MATH | 78.16 | 59.60 |
| MATH 500 | 77.00 | 58.00 |
## 4. License
Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).