From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning • Paper • 2606.07190 • Published 7 days ago • 21
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs • Paper • 2605.30611 • Published 15 days ago • 192
Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention • Paper • 2605.29548 • Published 15 days ago • 11
KhaledReda/all-MiniLM-L6-test_model-pair_score • Sentence Similarity • 22.7M • Updated 9 days ago • 37 • 1
RiT: Vanilla Diffusion Transformers Suffice in Representation Space • Paper • 2605.21981 • Published 22 days ago • 10