Wang

ShihaoW

14 3 3

AI & ML interests

None yet

Recent Activity

new activity 12 days ago

nvidia/LocateAnything-3B:Using LocateAnything-3B for real-time grounding in robotic inspection (Isaac Sim + UR5e)

new activity 17 days ago

nvidia/LocateAnything-3B:Fine-tuning with Lora

new activity 19 days ago

nvidia/LocateAnything-3B:solid model

New activity in nvidia/LocateAnything-3B 12 days ago

Using LocateAnything-3B for real-time grounding in robotic inspection (Isaac Sim + UR5e)

#26 opened 14 days ago by

taha-M

New activity in nvidia/LocateAnything-3B 17 days ago

Fine-tuning with Lora

#23 opened 24 days ago by

hanshupe

New activity in nvidia/LocateAnything-3B 19 days ago

solid model

#24 opened 23 days ago by

dpe1

New activity in nvidia/LocateAnything-3B 30 days ago

NotImplementedError: self._attn_implementation='flash_attention_2'

#18 opened about 1 month ago by

phamvantoan

New activity in nvidia/LocateAnything about 1 month ago

Model hallucinating on easy tasks

#3 opened about 1 month ago by

johnlockejrr

updated a model about 1 month ago

nvidia/LocateAnything-3B

Image-Text-to-Text • 4B • Updated about 1 month ago • 1.5M • 2.71k

New activity in nvidia/LocateAnything-3B about 1 month ago

This is the greatest release since GPT 3.5, 'Self Automating Machines'

#9 opened about 1 month ago by

darkmatter2222

[fastest inference] locateanything-batch — batched + KV-cached LocateAnything-3B, ~2.7× faster

👀🚀 6

#10 opened about 1 month ago by

Liuwang971

Inference support for vLLM and SGLang OpenAI endpoints

➕ 14

#3 opened about 1 month ago by

Vishva007

Batch query

#5 opened about 1 month ago by

SRai22

ComfyUI Node

#7 opened about 1 month ago by

Alissonerdx

Installation Video and Testing - Step by Step

#6 opened about 1 month ago by

fahdmirzac

upvoted a paper about 1 month ago

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

Paper • 2605.31039 • Published May 29 • 46

New activity in nvidia/LocateAnything-3B about 1 month ago

Great model!

#1 opened about 1 month ago by

JeanJan90

liked a Space about 1 month ago

LocateAnything

💬

Detect and annotate objects in images or videos

New activity in nvidia/LocateAnything about 1 month ago

gradio server

#1 opened about 1 month ago by

akhaliq

upvoted a paper about 2 months ago

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

Paper • 2605.25979 • Published May 25 • 27

authored a paper about 2 months ago

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published May 26 • 145

updated a collection about 2 months ago

Eagle

Collection

Eagle is a family of frontier vision-language models with data-centric strategies. The model supports both HD image and long-context video input. • 17 items • Updated about 1 month ago • 49