Philip-MIT committed on
Commit 972a577 · verified · 1 Parent(s): 8651f5a

Update README.md

Files changed (1): README.md +6 -1
@@ -44,6 +44,7 @@ The recommended interface for inference is [RewardGen](https://github.com/Philip
 
 from rewardgen import generate, video_plot
 
+# test_videos provided at the github repo: https://github.com/Philip-MIT/rewardgen
 video_paths = [
 "test_videos/robosuite/lift/unsuccessful/robosuite_lift_episode_12_unsuccessful_max_reward_38.mp4"
 ]
@@ -110,7 +111,11 @@ Downstream systems should parse the numeric value inside `<answer>...</answer>`
 
 The model was trained on the [SOLE-R1-8B](https://huggingface.co/Philip-MIT/SOLE-R1-8B) training dataset.
 
-The dataset contains robot task progress examples with images, prompts, reasoning completions, and progress labels. The full dataset is approximately 2TB.
+The dataset contains robot task progress examples with images, prompts, reasoning completions, and progress labels.
+
+It also includes a diverse collection of general spatial and multi-frame temporal reasoning data (e.g., from SSR-CoT, SpatialVLM, Spot-the-diff, Embodied CoT, RoboVQA, Robo2VLM-Reasoning) to serve as a foundational layer of our training mixture.
+
+The full dataset is approximately 2TB.
 
 Streaming example:
 
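
Note on the second hunk's context line: it says downstream systems should parse the numeric value inside `<answer>...</answer>`. The following is a minimal sketch of one way to do that, assuming the model completion is a plain string and the answer is a plain integer or decimal. The helper name `parse_progress` is hypothetical and is not part of RewardGen or the README.

```python
import re
from typing import Optional

# Hypothetical helper for illustration; not part of RewardGen.
# Matches an integer or decimal wrapped in <answer>...</answer>,
# allowing whitespace around the number.
_ANSWER_RE = re.compile(
    r"<answer>\s*(-?\d+(?:\.\d+)?)\s*</answer>", re.IGNORECASE
)


def parse_progress(completion: str) -> Optional[float]:
    """Return the numeric value of the last <answer> tag, or None if absent."""
    matches = _ANSWER_RE.findall(completion)
    if not matches:
        return None
    # Use the last match, assuming the final answer follows any reasoning text.
    return float(matches[-1])
```

Returning `None` when no tag is found lets the caller decide how to handle a malformed completion, rather than silently treating it as zero progress.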