GAASH-Lab/QTrack
Tags: Video-Text-to-Text, Safetensors, multi-object-tracking, video-understanding, vision-language-model, spatiotemporal-reasoning
arxiv: 2603.13759
License: mit
File size: 70 Bytes
5f1d711
{
  "image_seq_length": 256,
  "processor_class": "Gemma3Processor"
}
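
The config above only has two keys, so it can be read and sanity-checked with the standard library. The following is a minimal sketch, not the repository's own loading code: the file name is not shown on this page, so the JSON text is embedded as a string, and `read_processor_config` is a hypothetical helper introduced here for illustration. In the Transformers library, `processor_class` typically names the processor class to instantiate, and `image_seq_length` is likely the number of tokens reserved per image, but neither interpretation is confirmed by this page.

```python
import json

# Config text copied from the file shown above.
# Assumption: embedded as a string because the file name is not given in the source.
CONFIG_TEXT = '{"image_seq_length": 256, "processor_class": "Gemma3Processor"}'


def read_processor_config(text: str) -> dict:
    """Hypothetical helper: parse the config and check the two keys it defines."""
    config = json.loads(text)
    if not isinstance(config.get("image_seq_length"), int):
        raise ValueError("image_seq_length must be an integer")
    if not isinstance(config.get("processor_class"), str):
        raise ValueError("processor_class must be a string")
    return config


config = read_processor_config(CONFIG_TEXT)
print(config["processor_class"], config["image_seq_length"])
```

The type checks are deliberately strict so that a hand-edited file with a quoted number (for example `"256"`) fails early instead of surfacing later as a shape mismatch.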