Balanced Aggregation: Understanding and Fixing Aggregation Bias in GRPO Paper • 2605.04077 • Published 25 days ago • 2
AI Co-Mathematician: Accelerating Mathematicians with Agentic AI Paper • 2605.06651 • Published 2 days ago • 5
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 2 days ago • 53
Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key Paper • 2605.06638 • Published 2 days ago • 8
SkillOS: Learning Skill Curation for Self-Evolving Agents Paper • 2605.06614 • Published 2 days ago • 21
A^2TGPO: Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping Paper • 2605.06200 • Published 2 days ago • 8
MiA-Signature: Approximating Global Activation for Long-Context Understanding Paper • 2605.06416 • Published 2 days ago • 37
Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes Paper • 2605.05724 • Published 2 days ago • 10
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper • 2604.27393 • Published 9 days ago • 42
Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning Paper • 2605.02913 • Published about 1 month ago • 6
Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO Paper • 2604.27488 • Published 9 days ago • 4
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published 5 days ago • 97
MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills Paper • 2604.20441 • Published 17 days ago • 2
Lightning Unified Video Editing via In-Context Sparse Attention Paper • 2605.04569 • Published 3 days ago • 12
Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation Paper • 2605.04128 • Published 4 days ago • 10
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems Paper • 2605.04018 • Published 4 days ago • 28
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 3 days ago • 88