OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering Paper • 2604.08209 • Published 4 days ago • 21
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO Paper • 2602.06422 • Published Feb 6 • 47