SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
Abstract
SWE-Master is a reproducible framework for developing software engineering agents that systematically optimizes each stage of agent development, achieving strong performance on software task-resolution benchmarks.
In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, long-horizon SFT, RL with real execution feedback, and inference framework design. Starting from an open-source base model with limited initial SWE capability, SWE-Master demonstrates how systematic optimization can elicit strong long-horizon SWE task-solving abilities. We evaluate SWE-Master on SWE-bench Verified, a standard benchmark for realistic software engineering tasks. Under identical experimental settings, our approach achieves a resolve rate of 61.4% with Qwen2.5-Coder-32B, substantially outperforming existing open-source baselines. By further incorporating test-time scaling (TTS) with LLM-based environment feedback, SWE-Master reaches 70.8% at TTS@8, demonstrating strong performance potential. SWE-Master provides a practical and transparent foundation for advancing reproducible research on software engineering agents. The code is available at https://github.com/RUCAIBox/SWE-Master.
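The TTS@8 setup mentioned above can be read as best-of-N selection: sample several candidate patches per task and keep the one an LLM-based verifier rates highest. A minimal sketch follows; `generate_patch` and `llm_score` are hypothetical stand-ins, not SWE-Master's actual interfaces, and the scoring scheme is assumed for illustration.

```python
def best_of_n(task, generate_patch, llm_score, n=8):
    """Best-of-N test-time scaling: sample n candidate patches for a
    task and keep the one the LLM-based verifier scores highest."""
    candidates = [generate_patch(task, i) for i in range(n)]
    return max(candidates, key=lambda patch: llm_score(task, patch))

# Toy usage with stub functions (illustration only):
stub_patches = ["patch-a", "patch-b", "patch-c"]
gen = lambda task, i: stub_patches[i]
score = lambda task, p: {"patch-a": 0.2, "patch-b": 0.9, "patch-c": 0.5}[p]
print(best_of_n("fix-issue-42", gen, score, n=3))  # prints patch-b
```

With n=8 this corresponds to TTS@8; the quality of the verifier's feedback, rather than the sampling itself, is what determines how much of the gap between 61.4% and 70.8% such selection can recover.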
Community
Unleashes the SWE capabilities of a 32B model and provides open infrastructure for academic research on RL.
The following papers were recommended by the Semantic Scholar API
- SWE-World: Building Software Engineering Agents in Docker-Free Environments (2026)
- SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving (2026)
- daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently (2026)
- SWE-Universe: Scale Real-World Verifiable Environments to Millions (2026)
- Context as a Tool: Context Management for Long-Horizon SWE-Agents (2025)
- SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios (2025)
- DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder (2026)