SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training Paper • 2605.08738 • Published 4 days ago • 7
Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers Paper • 2602.06079 • Published Feb 4 • 21
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers Paper • 2601.04890 • Published Jan 8 • 44
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters Paper • 2504.08791 • Published Apr 7, 2025 • 140