arxiv:2505.04410
Junjie Wang
xiaomoguhzz
AI & ML interests
computer vision, Vision-Language Models, Multimodal Large Language Models
Recent Activity
upvoted a paper about 3 hours ago
UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating
updated a dataset 2 days ago
xiaomoguhzz/catseg_detectron2_data
updated a model 3 days ago
xiaomoguhzz/VisionEncoder