RoPE Distinguishes Neither Positions Nor Tokens in Long Contexts, Provably Paper • 2605.15514 • Published 6 days ago
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published Dec 19, 2024 • 39
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 78
SciCode: A Research Coding Benchmark Curated by Scientists Paper • 2407.13168 • Published Jul 18, 2024 • 17