DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents Paper • 2602.07035 • Published 9 days ago • 25
Article From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output 4 days ago • 18
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published 3 days ago • 57
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads Paper • 2602.09443 • Published 2 days ago • 51
Self-Improving Multilingual Long Reasoning via Translation-Reasoning Integrated Training Paper • 2602.05940 • Published 6 days ago • 17
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions Paper • 2602.05843 • Published 6 days ago • 55
SAGE: Benchmarking and Improving Retrieval for Deep Research Agents Paper • 2602.05975 • Published 6 days ago • 12
Multimodal Doc Models – Iterations up to [Feb 2026] Collection Iterations of my doc OCR models: a timeline of continual training built on top of Qwen-VL models. • 8 items • Updated 6 days ago • 2
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 269
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Paper • 2602.04634 • Published 7 days ago • 91
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 7 days ago • 30
SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation Paper • 2602.02402 • Published 9 days ago • 31
Article The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+ 8 days ago • 37
Article TruthTensor: LLM Evaluation in Prediction Markets Under Drift and Market Baseline 13 days ago • 18
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report Paper • 2601.21051 • Published 14 days ago • 12