Demo of the Qwen 3.5 Multimodal Model
Detect and label objects in images and videos
Extract text from images and PDFs