Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification Paper • 2606.18249 • Published 2 days ago • 9
UniAR Collection Model checkpoints for UniAR: Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification. • 2 items • Updated 1 day ago
ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations Paper • 2606.11188 • Published 9 days ago • 26
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 21 days ago • 142
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 191
CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization Paper • 2603.06449 • Published Mar 6 • 6
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published Oct 27, 2025 • 62
LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models Paper • 2510.13626 • Published Oct 15, 2025 • 48
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning Paper • 2508.20751 • Published Aug 28, 2025 • 90
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment Paper • 2505.18600 • Published May 24, 2025 • 49
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models Paper • 2505.12504 • Published May 18, 2025 • 24