MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems Paper • 2412.07067 • Published Dec 10, 2024
Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts Paper • 2510.23027 • Published Oct 27, 2025 • 1
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge Paper • 2601.08808 • Published 13 days ago • 38
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs Paper • 2104.08692 • Published Apr 18, 2021
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published Nov 13, 2025 • 51
Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective Paper • 2509.22613 • Published Sep 26, 2025 • 10
DocReward: A Document Reward Model for Structuring and Stylizing Paper • 2510.11391 • Published Oct 13, 2025 • 27
Information-Preserving Reformulation of Reasoning Traces for Antidistillation Paper • 2510.11545 • Published Oct 13, 2025 • 2
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs Paper • 2510.24514 • Published Oct 28, 2025 • 22
The Era of Agentic Organization: Learning to Organize with Language Models Paper • 2510.26658 • Published Oct 30, 2025 • 28
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning Paper • 2506.08889 • Published Jun 10, 2025 • 23
Model as a Game: On Numerical and Spatial Consistency for Generative Games Paper • 2503.21172 • Published Mar 27, 2025
Think Only When You Need with Large Hybrid-Reasoning Models Paper • 2505.14631 • Published May 20, 2025 • 20