Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 6 days ago • 212
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 23 days ago • 59
Article TRL v1.0: Post-Training Library Built to Move with the Field +2 about 1 month ago • 50
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 352
Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 Feb 4 • 89
Article Introducing Daggr: Chain apps programmatically, inspect visually +3 Jan 29 • 107
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published Jan 22 • 190
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents Paper • 2601.16746 • Published Jan 23 • 91
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 93
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 216
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 159
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 267