Qwen3-VL-8B GRPO RLVR checkpoints from a token-dropout exploration study. On OMR, the ppexplore variant is the winner (0.714); on video, the variants are in a dead heat at about 0.485.
Nguyen Quang Trung
ngqtrung
Qwen3-VL-8B RLVR — Models (v1)
Qwen3-VL-8B RLVR — Datasets (v1)
Curated SFT + GRPO RL datasets (video MC-QA, OMR math-image, OpenMMReasoner-RL, Vero) for Qwen3-VL-8B post-training.
models 12
ngqtrung/video-8b-grpo-ppexplore-n16k8
Image-Text-to-Text • 9B • Updated • 20
ngqtrung/video-8b-grpo-ppexplore
Image-Text-to-Text • 9B • Updated • 15
ngqtrung/video-8b-grpo-sft770
Image-Text-to-Text • 9B • Updated • 21
ngqtrung/video-8b-grpo-base
Image-Text-to-Text • 9B • Updated • 21
ngqtrung/omr-8b-grpo-base
Image-Text-to-Text • 9B • Updated • 21
ngqtrung/omr-8b-grpo-ppexplore
Image-Text-to-Text • 9B • Updated • 21
ngqtrung/verify-tool
Updated
ngqtrung/Qwen3-Omni-Thinker-30B-Instruct
Image-Text-to-Text • 32B • Updated • 4
ngqtrung/Qwen3-Omni-Thinker-30B-Thinking
Image-Text-to-Text • 32B • Updated • 2
ngqtrung/Qwen2.5-Omni-Thinker-7B
Image-Text-to-Text • 9B • Updated • 6
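The GRPO checkpoint names above appear to follow a `<task>-<size>-grpo-<variant>` pattern. The page does not document this convention, so it is an assumption inferred from the repo IDs alone. Under that assumption, a minimal sketch for grouping the checkpoints by task and building their Hub page URLs could look like this (`split_repo_id` and `hub_url` are hypothetical helper names, not part of any library):

```python
from collections import defaultdict

# Repo IDs copied from the model list above (GRPO checkpoints only).
GRPO_CHECKPOINTS = [
    "ngqtrung/video-8b-grpo-ppexplore-n16k8",
    "ngqtrung/video-8b-grpo-ppexplore",
    "ngqtrung/video-8b-grpo-sft770",
    "ngqtrung/video-8b-grpo-base",
    "ngqtrung/omr-8b-grpo-base",
    "ngqtrung/omr-8b-grpo-ppexplore",
]


def split_repo_id(repo_id: str) -> dict:
    """Split a repo ID, ASSUMING the pattern <owner>/<task>-<size>-grpo-<variant>."""
    owner, name = repo_id.split("/", 1)
    # maxsplit=3 keeps hyphenated variants such as "ppexplore-n16k8" in one piece.
    task, size, _method, variant = name.split("-", 3)
    return {"owner": owner, "task": task, "size": size, "variant": variant}


def hub_url(repo_id: str) -> str:
    """Model pages on the Hugging Face Hub live at https://huggingface.co/<repo_id>."""
    return f"https://huggingface.co/{repo_id}"


# Group the variant names by task ("video" or "omr").
variants_by_task = defaultdict(list)
for repo_id in GRPO_CHECKPOINTS:
    parts = split_repo_id(repo_id)
    variants_by_task[parts["task"]].append(parts["variant"])

for task, variants in variants_by_task.items():
    print(task, variants)
```

The remaining repositories in the list (`verify-tool` and the Omni-Thinker models) do not match this pattern and are left out on purpose.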
datasets 47
ngqtrung/vero-rl
Updated • 11
ngqtrung/openmmreasoner-rl-74k
Updated • 13
ngqtrung/omr-grpo-val
Updated • 12
ngqtrung/omr-grpo-train
Updated • 9
ngqtrung/videorl-video-val
Updated • 12
ngqtrung/videorl-video-rl-train
Updated • 12
ngqtrung/ommr-sft-recipe
Updated • 11
ngqtrung/vmar-sft-distill-raw
Updated • 12
ngqtrung/vmar-realgold-eval
Updated • 14
ngqtrung/vmar-sft-seed-v2
Updated • 14