Agentic-ly agentic
updated
Automated Design of Agentic Systems
Paper
• 2408.08435
• Published
• 40
On the limits of agency in agent-based models
Paper
• 2409.10568
• Published
• 14
On the Diagram of Thought
Paper
• 2409.10038
• Published
• 13
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
Paper
• 2409.07703
• Published
• 66
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
Paper
• 2409.08264
• Published
• 48
Paper
• 2409.07429
• Published
• 32
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Paper
• 2409.04593
• Published
• 26
Training Language Models to Self-Correct via Reinforcement Learning
Paper
• 2409.12917
• Published
• 140
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
Paper
• 2409.17115
• Published
• 64
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
Paper
• 2410.11710
• Published
• 20