Let ViT Speak: Generative Language-Image Pre-training Paper • 2605.00809 • Published 7 days ago • 28
HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness Paper • 2605.02396 • Published 4 days ago • 16
Large Language Models Explore by Latent Distilling Paper • 2604.24927 • Published 11 days ago • 72
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 5 days ago • 141
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows Paper • 2604.28139 • Published 8 days ago • 39
Leveraging Verifier-Based Reinforcement Learning in Image Editing Paper • 2604.27505 • Published 8 days ago • 56
Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling Paper • 2604.27039 • Published 9 days ago • 24
Experience Transfer for Multimodal LLM Agents in Minecraft Game Paper • 2604.05533 • Published Apr 7 • 15
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU Paper • 2604.05091 • Published Apr 6 • 45
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence? Paper • 2604.03016 • Published Apr 3 • 37
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 34
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published Apr 2 • 147
google/gemma-4-31B-it Image-Text-to-Text • 33B • Updated about 22 hours ago • 8.59M • 2.56k