Papers
WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL
RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models
RL-based co-training in RLinf
- RLinf/RLinf-Pi05-RLCo-PandaPutOnPlateInScene25DigitalTwin-V1-SFT
  4B • Updated
- RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models
  Paper • 2602.12628 • Published • 12
- RLinf/RLCo-Example-Mix-Data
  Viewer • Updated • 3.1k • 4.51k
- RLinf/RLCo-Example-Real-Data
  Viewer • Updated • 100 • 366
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning
- RLinf/RLinf-OpenVLAOFT-LIBERO-130-Base-Lora
  Reinforcement Learning • 8B • Updated • 49
- RLinf/RLinf-OpenVLAOFT-ManiSkill-Base-Main
  8B • Updated • 85
- RLinf/RLinf-OpenVLAOFT-LIBERO-90-Base-Lora
  Reinforcement Learning • 8B • Updated • 36
- RLinf/RLinf-OpenVLAOFT-LIBERO-130
  Reinforcement Learning • 8B • Updated • 745 • 3
MetaWorld
- RLinf/RLinf-OpenSora-LIBERO-Spatial
  1B • Updated • 18
- RLinf/RLinf-OpenVLAOFT-LIBERO-90-Base-Lora
  Reinforcement Learning • 8B • Updated • 36
- RLinf/RLinf-OpenVLAOFT-GRPO-LIBERO-object
  Reinforcement Learning • 8B • Updated • 3
- RLinf/RLinf-OpenVLAOFT-GRPO-LIBERO-goal
  Reinforcement Learning • 8B • Updated • 2