GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 1 day ago • 37
DFlash Collection Block Diffusion for Flash Speculative Decoding • 15 items • Updated 5 days ago • 88
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 8 days ago • 235
DFlash: Block Diffusion for Flash Speculative Decoding Paper • 2602.06036 • Published Feb 5 • 69
Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music Paper • 2604.10905 • Published 17 days ago • 28
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention Paper • 2603.28458 • Published about 1 month ago • 43
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward Paper • 2603.26599 • Published Mar 27 • 65
Article Welcome Gemma 4: Frontier multimodal intelligence on device • 28 days ago • 882
RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation Paper • 2603.25804 • Published Mar 26 • 29
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published Mar 6 • 48
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents Paper • 2603.19685 • Published Mar 20 • 21
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published Mar 17 • 109