Grad2Reward: From Sparse Judgment to Dense Rewards for Improving Open-Ended LLM Reasoning Paper • 2602.01791 • Published Feb 2 • 1
Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs Article • 3 days ago • 18
How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs Article • 4 days ago • 39
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning Paper • 2603.05863 • Published Mar 6 • 6
InCoder-32B-Thinking: Industrial Code World Model for Thinking Paper • 2604.03144 • Published 9 days ago • 224
VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Paper • 2604.02486 • Published 10 days ago • 9