SmolVLA-CaP-StackBlock-50epochs
This repository contains a SmolVLA policy fine-tuned with LeRobot for the SO101 CAP task "Stack RGB Blocks on a Blue Dish". The policy was initialized from CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep and trained for 50 epochs on CoRL2026-CSI/SO101-cap_stack_RGBblock_on_bluedish_10fps.
Model Details
| Field | Value |
|---|---|
| Policy type | smolvla |
| Task | stack red, green, and blue blocks on the blue dish from bottom to top |
| Robot | SO101 follower |
| Dataset | CoRL2026-CSI/SO101-cap_stack_RGBblock_on_bluedish_10fps |
| Base model | CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep |
| Training steps | 17100 |
| Completed step | 17100 |
| Batch size | 128 per GPU |
| Effective batch size | 256 |
| Action chunk size | 50 |
| Action horizon | 50 |
| Observation steps | 1 |
| Inference denoising steps | 50 |
| Model weights | model.safetensors (864.7 MiB) |
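The figures in the table can be cross-checked with a little arithmetic. The sketch below uses only numbers from this card, plus two labeled assumptions: that the 17100 steps correspond to exactly 50 passes over the data with no gradient accumulation, and that the control rate is 10 Hz (inferred only from the `_10fps` suffix of the dataset name). The per-epoch sample count it derives is an estimate, not a documented dataset size.

```python
# Values copied from the Model Details table.
per_gpu_batch = 128
effective_batch = 256
training_steps = 17_100
chunk_size = 50
epochs = 50  # from the model name and description

# Number of data-parallel processes implied by the two batch sizes.
num_processes = effective_batch // per_gpu_batch

# ASSUMPTION: the step count is exactly `epochs` passes, no gradient accumulation.
steps_per_epoch = training_steps // epochs
samples_per_epoch = steps_per_epoch * effective_batch
total_samples = training_steps * effective_batch

# ASSUMPTION: 10 Hz control rate, inferred from the "_10fps" dataset suffix.
assumed_fps = 10
chunk_seconds = chunk_size / assumed_fps

print(num_processes, steps_per_epoch, samples_per_epoch, total_samples, chunk_seconds)
```

Under these assumptions, one action chunk covers about five seconds of motion, and each epoch draws roughly 87.5k samples.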
Training Setup
The run used two CUDA processes with batch_size=128 per process, image augmentation enabled, and camera key remapping from the dataset's raw cameras to the SmolVLA camera names:
observation.images.left_wrist -> observation.images.camera1
observation.images.top -> observation.images.camera2
The checkpoint was saved at step 17100. LeRobot's preprocessor and postprocessor artifacts are included in this repository alongside the model weights.
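The camera key remapping above can be illustrated in plain Python. This is a minimal sketch, not LeRobot's own mechanism: `CAMERA_KEY_MAP` and `remap_camera_keys` are hypothetical names introduced here, and the observation is assumed to be a flat dictionary keyed by feature name.

```python
# Mapping taken from this card: raw dataset camera key -> SmolVLA camera key.
CAMERA_KEY_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def remap_camera_keys(observation, key_map=CAMERA_KEY_MAP):
    """Return a copy of `observation` with camera keys renamed.

    Hypothetical helper: keys not listed in `key_map` pass through unchanged,
    and a missing raw camera key raises KeyError so a miswired camera is
    caught before inference.
    """
    missing = [key for key in key_map if key not in observation]
    if missing:
        raise KeyError(f"observation is missing camera keys: {missing}")
    return {key_map.get(key, key): value for key, value in observation.items()}


# Usage with placeholder values standing in for image tensors.
raw = {
    "observation.images.left_wrist": "wrist_frame",
    "observation.images.top": "top_frame",
    "observation.state": [0.0] * 6,
}
remapped = remap_camera_keys(raw)
print(sorted(remapped))
```

Failing loudly on a missing camera is a deliberate choice here: a policy fed the wrong view will still produce actions, so a silent fallback would be harder to debug on hardware.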
Files
model.safetensors
config.json
train_config.json
policy_preprocessor.json
policy_preprocessor_step_5_normalizer_processor.safetensors
policy_postprocessor.json
policy_postprocessor_step_0_unnormalizer_processor.safetensors
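After downloading the repository, it can be useful to confirm that all of the files listed above are present before loading the policy. The following is a small sketch using only the standard library; `missing_checkpoint_files` is a hypothetical helper, and the local directory path is whatever location you downloaded the files to.

```python
from pathlib import Path

# File names copied from the Files list in this card.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "train_config.json",
    "policy_preprocessor.json",
    "policy_preprocessor_step_5_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
]


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are absent from `checkpoint_dir`."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means the local copy matches the file list above; any returned names point to an incomplete download.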
Usage
```python
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/SmolVLA-CaP-StackBlock-50epochs")
```
For robot deployment, use the same camera mapping, normalization pipeline, and SO101 action/state conventions used by the training dataset.
Intended Use
This model is intended for imitation-learning experiments and SO101 tabletop manipulation research on the specified CAP task. It is not a general-purpose robot policy and should be validated in a controlled workspace before any hardware deployment.
Limitations
The model was trained on a single task dataset with fixed camera views, object set, action space, and workspace assumptions. No official evaluation success rate is included in this repository.