Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 9 days ago • 87 • 5
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published 13 days ago • 85 • 3
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published 27 days ago • 31 • 3
When Models Manipulate Manifolds: The Geometry of a Counting Task Paper • 2601.04480 • Published Jan 8 • 4 • 1
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published Jan 31 • 315 • 8
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 347 • 3
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published Dec 15, 2025 • 106 • 5
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 138 • 8
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 229 • 9
Guiding a Diffusion Transformer with the Internal Dynamics of Itself Paper • 2512.24176 • Published Dec 30, 2025 • 8 • 4