295B total / 21B active / 256K context · Fused fast-and-slow thinking in a single model · First model trained on Hunyuan's rebuilt pretraining + RL infra (Feb-Apr)
Benchmarks:
- SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, WideSearch: competitive results, particularly strong on agentic tool use
- Top score on Tsinghua's 2026 Spring math PhD qualifying exam
- Strong context-learning and instruction-following on Tencent's CL-bench / CL-bench-Life
Our lab recently released a paper introducing ShadowPEFT, a new Parameter-Efficient Fine-Tuning (PEFT) paradigm tailored for edge-computing scenarios.
Unlike traditional approaches such as LoRA and its variants, which inject trainable parameters directly into the Transformer's weights and therefore require tight coupling with the backbone, ShadowPEFT enhances the frozen large base model by adding a lightweight, centralized, pretrainable, and detachable Shadow network. This shadow network runs in parallel with the base model and delivers learned corrections to each decoder layer. Because the shadow module is architecturally decoupled from the backbone, it can be independently trained, stored, and deployed, which benefits edge computing and edge-cloud collaborative computing scenarios.
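For intuition, here is a minimal PyTorch sketch of how a parallel, detachable correction module could be wired onto a frozen decoder-only backbone. Everything in it is an illustrative assumption rather than the paper's actual implementation: the names ShadowBlock, ShadowNetwork, and attach_shadow, the bottleneck-MLP correction, the additive forward-hook mechanism, and the `base_model.layers` attribute are all placeholders standing in for whatever ShadowPEFT actually does.

```python
import torch
import torch.nn as nn


class ShadowBlock(nn.Module):
    """Hypothetical per-layer correction: a small bottleneck MLP (an assumption,
    not the architecture from the paper)."""

    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()
        # Zero-init the up-projection so the attached shadow starts as a no-op.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.up(self.act(self.down(h)))


class ShadowNetwork(nn.Module):
    """One centralized, detachable module holding a correction block per decoder layer."""

    def __init__(self, num_layers: int, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.blocks = nn.ModuleList(
            [ShadowBlock(hidden_size, bottleneck) for _ in range(num_layers)]
        )

    def forward(self, layer_idx: int, hidden: torch.Tensor) -> torch.Tensor:
        return self.blocks[layer_idx](hidden)


def attach_shadow(base_model: nn.Module, shadow: ShadowNetwork):
    """Hook each decoder layer so its output is nudged by the shadow's correction.
    Returns the hook handles so the shadow can later be detached via handle.remove().
    Assumes the backbone exposes its decoder layers as `base_model.layers`."""
    handles = []
    for idx, layer in enumerate(base_model.layers):
        def hook(module, inputs, output, idx=idx):
            hidden = output[0] if isinstance(output, tuple) else output
            corrected = hidden + shadow(idx, hidden)
            return (corrected, *output[1:]) if isinstance(output, tuple) else corrected
        handles.append(layer.register_forward_hook(hook))
    return handles
```

Under this reading, fine-tuning would freeze every backbone parameter and optimize only `shadow.parameters()`; removing the hook handles (and shipping the shadow checkpoint on its own) restores the untouched base model, which is what makes a decoupled module of this kind easy to store, swap, and deploy separately on edge devices.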