LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence Paper • 2605.25979 • Published 3 days ago • 21
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published 2 days ago • 105
Eagle Collection Eagle is a family of frontier vision-language models with data-centric strategies. The model supports both HD image and long-context video input. • 17 items • Updated 1 day ago • 44
Running on Zero Agents 20 LocateAnything 💬 20 Locate objects in images and videos with visual tags