chen-yingfa/HypeNet-5B

Tags: linear-attention

Links:

GitHub repo: https://github.com/thunlp/hybrid-linear-attention
Paper: https://arxiv.org/abs/2601.22156

This is the final HypeNet-5B checkpoint from the paper "Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts". It was distilled from Qwen3-4B using the HALO pipeline proposed in the paper. For more information, see the GitHub repo.
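
To get the weights locally, a minimal sketch of fetching the checkpoint files from the Hub might look like the following. It assumes the `huggingface_hub` package is installed. The `download_checkpoint` helper and its `local_dir` default are hypothetical names used only for illustration. Loading the model for inference is not shown, since that depends on the code in the GitHub repo.

```python
from pathlib import Path

# Repo id as it appears on this model card.
REPO_ID = "chen-yingfa/HypeNet-5B"


def download_checkpoint(local_dir: str = "./HypeNet-5B") -> Path:
    """Download every file in the checkpoint repo and return the local directory.

    `local_dir` is an illustrative default, not a path the authors prescribe.
    """
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` downloads the repo contents and returns the directory they were written to.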

Model size: 5B params
Tensor type: BF16
Format: Safetensors
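
As a rough guide to what 5B parameters stored in BF16 imply for memory, the sketch below estimates the size of the weights alone. It assumes the rounded figure of 5e9 parameters and 2 bytes per BF16 value. The helper name `weight_memory_gib` is hypothetical, and the estimate excludes activations, caches, and any recurrent state.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"BF16": 2, "FP16": 2, "FP32": 4}


def weight_memory_gib(num_params: float, dtype: str = "BF16") -> float:
    """Return the approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# "5B params" is a rounded figure, so this is only an approximation.
print(f"{weight_memory_gib(5e9):.1f} GiB")  # prints "9.3 GiB"
```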

Base model: Qwen/Qwen3-4B-Base

Collection: HypeNet (the models for the paper, 2 items)

Paper details: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts (arXiv:2601.22156, published Jan 29)