MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning
Overview
MetaAgent-X is an end-to-end reinforcement learning framework for autonomous multi-agent systems.
Unlike conventional automatic MAS methods that rely on frozen models, hand-crafted prompts, or search-based workflows, MetaAgent-X trains one shared model to both design a multi-agent system and execute it. The model learns to generate task-adaptive agent roles, collaboration structures, and execution strategies through reinforcement learning.
MetaAgent-X demonstrates strong cross-domain adaptation and achieves state-of-the-art performance across both code and math benchmarks.
Key Features
- One model for both design and execution: the same model acts as both the MAS designer and the task executor.
- End-to-end reinforcement learning: the model is optimized directly from downstream task outcomes.
- Autonomous multi-agent system generation: the model learns to construct and execute agent swarms for complex reasoning tasks.
- Cross-domain generalization: strong performance on both coding and mathematical reasoning benchmarks.
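
The "one model for both design and execution" idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the MetaAgent-X code: the `Model` callable, the `DESIGN`/`EXECUTE` prompt prefixes, the `role: instruction` line format, the sequential hand-off between agents, and the exact-match reward. The actual framework may generate richer collaboration structures and use a different reward.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any text-in, text-out model call.
Model = Callable[[str], str]


@dataclass
class AgentSpec:
    role: str
    instruction: str


def design_system(model: Model, task: str) -> List[AgentSpec]:
    """Design phase: the model proposes agents, one 'role: instruction' per line."""
    plan = model(f"DESIGN\nTask: {task}")
    specs = []
    for line in plan.splitlines():
        if ":" in line:
            role, instruction = line.split(":", 1)
            specs.append(AgentSpec(role.strip(), instruction.strip()))
    return specs


def execute_system(model: Model, task: str, specs: List[AgentSpec]) -> str:
    """Execution phase: the SAME model plays each role in turn (assumed sequential chain)."""
    context = ""
    for spec in specs:
        prompt = (
            f"EXECUTE\nRole: {spec.role}\nInstruction: {spec.instruction}\n"
            f"Task: {task}\nContext: {context}"
        )
        context = model(prompt)
    return context


def outcome_reward(answer: str, reference: str) -> float:
    """Assumed outcome reward: 1.0 on exact match with the reference, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def rollout(model: Model, task: str, reference: str) -> float:
    """One end-to-end episode: design, execute, then score the final answer."""
    specs = design_system(model, task)
    answer = execute_system(model, task, specs)
    return outcome_reward(answer, reference)


def toy_model(prompt: str) -> str:
    """Stand-in for a real language model, used only to make the sketch runnable."""
    if prompt.startswith("DESIGN"):
        return "solver: compute the answer\nchecker: verify the answer"
    return "4"


reward = rollout(toy_model, "What is 2 + 2?", "4")
```

In a reinforcement learning setup, a scalar like `reward` would be the signal used to update the shared model, so that both the design step and the execution step are shaped by the same downstream outcome.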
Results
The following table reports the performance of MetaAgent-XRL.
Numbers in parentheses denote absolute gains over the single-agent baseline.
| Domain | Benchmark | MetaAgent-XRL |
|---|---|---|
| Code | LiveCodeBench | 41.00 |
| Code | APPS | 38.00 |
| Code | CodeContests | 17.00 |
| Math | AIME24 | 40.00 |
| Math | AIME25 | 33.33 |
| Math | OlympiadBench | 61.00 |
| Overall | Average | 38.33 |
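
As a quick sanity check on the Average row, the short script below computes an unweighted mean of the six benchmark scores exactly as listed. The averaging scheme is an assumption, since the card does not say how the average was computed. Under this assumption the result is about 38.39, slightly different from the reported 38.33; the gap may come from rounding of the per-benchmark scores or from a different averaging method.

```python
# Scores copied from the table above.
scores = {
    "LiveCodeBench": 41.00,
    "APPS": 38.00,
    "CodeContests": 17.00,
    "AIME24": 40.00,
    "AIME25": 33.33,
    "OlympiadBench": 61.00,
}

# Assumed scheme: plain unweighted mean over all six benchmarks.
mean = sum(scores.values()) / len(scores)
print(f"Unweighted mean: {mean:.2f}")
```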
Citation
Coming soon.