Philip-MIT committed on
Commit 84cdb4d (verified) · 1 Parent(s): 972a577

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -14,6 +14,8 @@ datasets:
 - Philip-MIT/sole_training_data
 ---
 
+https://cdn-uploads.huggingface.co/production/uploads/6a13185f0f5f7894f043e8d7/AwncGGaDE0IUKqccU6S22.mp4
+
 # SOLE-R1-8B
 
 SOLE-R1-8B is a video-language reward reasoning model for robotics. It is designed to estimate task progress from robot video frames and a natural-language task description, producing both per-timestep reasoning traces and scalar progress predictions that can be used as rewards for online robot reinforcement learning.