Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 114
Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning Paper • 2603.16189 • Published 6 days ago • 10
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint Paper • 2502.16770 • Published Feb 24, 2025 • 1
Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance Paper • 2601.14171 • Published Jan 20 • 53
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published Jan 20 • 26
AnimateScene: Camera-controllable Animation in Any Scene Paper • 2508.05982 • Published Aug 8, 2025 • 1
AgentOCR: Reimagining Agent History via Optical Self-Compression Paper • 2601.04786 • Published Jan 8 • 30
InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams Paper • 2601.02281 • Published Jan 5 • 33
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025 • 98
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published Oct 24, 2025 • 102