Squeezing Capacity from Multimodal Large Language Models for Subject-driven Generation Paper • 2605.26111 • Published 4 days ago • 6
UniFusion: Vision-Language Model as Unified Encoder in Image Generation Paper • 2510.12789 • Published Oct 14, 2025 • 19