MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

Paper: Coming Soon

Codebase 🚗

Overview

MetaAgent-X is an end-to-end reinforcement learning framework for autonomous multi-agent systems (MAS).

Conventional automatic MAS methods rely on frozen models, hand-crafted prompts, or search-based workflows. MetaAgent-X instead trains one shared model to both design a multi-agent system and execute it. Through reinforcement learning, the model learns to generate task-adaptive agent roles, collaboration structures, and execution strategies.
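The design-then-execute idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the MetaAgent-X implementation: the `design_mas` and `execute_mas` helpers, the JSON agent format, the `DESIGN`/`EXECUTE` prompt prefixes, and the sequential chain used as the collaboration structure are all hypothetical. A toy `stub_model` stands in for the shared model so the example runs without weights.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical: any text-in, text-out callable standing in for the shared model.
Model = Callable[[str], str]


@dataclass
class AgentSpec:
    role: str
    instruction: str


def design_mas(model: Model, task: str) -> List[AgentSpec]:
    """Designer pass: ask the shared model for a JSON list of agents (assumed format)."""
    raw = model(
        f"DESIGN\nTask: {task}\n"
        "Return a JSON list of agents with 'role' and 'instruction'."
    )
    return [AgentSpec(**agent) for agent in json.loads(raw)]


def execute_mas(model: Model, task: str, agents: List[AgentSpec]) -> str:
    """Executor pass: the same model plays each role in order (assumed chain structure)."""
    context = task
    for agent in agents:
        context = model(
            f"EXECUTE\nRole: {agent.role}\n"
            f"Instruction: {agent.instruction}\nInput: {context}"
        )
    return context


def stub_model(prompt: str) -> str:
    """Toy stand-in for the trained model; returns canned outputs."""
    if prompt.startswith("DESIGN"):
        return json.dumps([
            {"role": "solver", "instruction": "Solve the task."},
            {"role": "verifier", "instruction": "Check the answer."},
        ])
    # Second line of an EXECUTE prompt is "Role: <name>".
    role = prompt.splitlines()[1].split(": ", 1)[1]
    return f"[{role}] done"


agents = design_mas(stub_model, "Compute 2 + 2")
answer = execute_mas(stub_model, "Compute 2 + 2", agents)
print([a.role for a in agents], answer)
```

The point of the sketch is that `design_mas` and `execute_mas` receive the same `model` argument, which mirrors the one-model setup described above; a real system would replace `stub_model` with calls to the trained checkpoint.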

MetaAgent-X adapts across domains and achieves state-of-the-art performance on both code benchmarks (LiveCodeBench, APPS, CodeContests) and math benchmarks (AIME24, AIME25, OlympiadBench).

Key Features

  • One model for both design and execution: the same model acts as both the MAS designer and the task executor.
  • End-to-end reinforcement learning: the model is optimized directly from downstream task outcomes.
  • Autonomous multi-agent system generation: the model learns to construct and execute agent swarms for complex reasoning tasks.
  • Cross-domain generalization: strong performance on both coding and mathematical reasoning benchmarks.
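
As a rough illustration of optimizing from downstream task outcomes, the sketch below gives every model call in a rollout, whether a design call or an execution call, the same terminal reward. The exact-match reward, the `ModelCall` record, and the uniform credit assignment are assumptions chosen for simplicity; the actual reward and training objective of MetaAgent-X are not specified here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ModelCall:
    phase: str   # "design" or "execute" (hypothetical labels)
    prompt: str
    output: str


def outcome_reward(final_answer: str, reference: str) -> float:
    """Assumed reward: 1.0 for an exact match with the reference, else 0.0."""
    return 1.0 if final_answer.strip() == reference.strip() else 0.0


def assign_rewards(
    calls: List[ModelCall], final_answer: str, reference: str
) -> List[Tuple[ModelCall, float]]:
    """Give every call in the rollout the same outcome reward (assumed credit assignment)."""
    reward = outcome_reward(final_answer, reference)
    return [(call, reward) for call in calls]


rollout = [
    ModelCall("design", "Design agents for: 2 + 2", "solver, verifier"),
    ModelCall("execute", "Role: solver", "4"),
    ModelCall("execute", "Role: verifier", "4"),
]
labeled = assign_rewards(rollout, final_answer="4", reference="4")
print([(call.phase, reward) for call, reward in labeled])
```

Because the designer and the executor are the same model, one outcome signal can update both behaviours, which is the property this sketch is meant to show.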

Results

The following table reports the performance of MetaAgent-XRL.
Numbers in parentheses denote absolute gains over the single-agent baseline.

Domain   Benchmark      MetaAgent-XRL
Code     LiveCodeBench  41.00
Code     APPS           38.00
Code     CodeContests   17.00
Math     AIME24         40.00
Math     AIME25         33.33
Math     OlympiadBench  61.00
Overall  Average        38.33
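
For readers who want to reproduce the Overall row, the snippet below computes an unweighted mean of the six benchmark scores as printed. That the average is an unweighted mean is an assumption. With the rounded values shown, the result is about 38.39 rather than 38.33, so the reported figure may have been computed from differently rounded scores.

```python
# Scores copied from the table above.
scores = {
    "LiveCodeBench": 41.00,
    "APPS": 38.00,
    "CodeContests": 17.00,
    "AIME24": 40.00,
    "AIME25": 33.33,
    "OlympiadBench": 61.00,
}

# Assumption: the Overall row is an unweighted mean over all six benchmarks.
mean_score = sum(scores.values()) / len(scores)
print(round(mean_score, 2))
```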

Citation

Coming soon.

Safetensors

Model size: 8B params
Tensor type: BF16