arxiv:2601.22674
Hanxun Yu
JonnyYu828
AI & ML interests
Multimodal LLMs, Spatial Intelligence, Embodied AI
Recent Activity
authored a paper 1 day ago
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
authored a paper 1 day ago
StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding
authored a paper 1 day ago
Physical Adversarial Attack meets Computer Vision: A Decade Survey