Squeezing Capacity from Multimodal Large Language Models for Subject-driven Generation • Paper • 2605.26111 • Published 8 days ago • 9
UniFusion: Vision-Language Model as Unified Encoder in Image Generation • Paper • 2510.12789 • Published Oct 14, 2025 • 19