Pi0.5 IsaacLab Multi-Task 1 Epoch

This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi.

Model Details

  • Base model: lerobot/pi05_base
  • Policy type: pi05
  • Training type: full fine-tuning
  • Vision encoder frozen: no
  • Action expert only: no
  • Checkpoint: final checkpoint at step 13761
  • Training length: 1.00 epoch
  • Precision: bfloat16
  • Format: safetensors
  • Parameters: ~4B

Dataset

  • Dataset: CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
  • Robot: SO-101 follower
  • Episodes: 3300
  • Frames: 3,522,774
  • Tasks: 800
  • FPS: 30
  • Visual inputs: observation.images.top, observation.images.left_wrist
  • State/action dimensions: 6 DoF robot state/action, padded by the Pi0.5 policy configuration as needed
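The counts above admit a quick back-of-envelope sanity check on dataset scale (assuming the episode and frame counts are exact): average episode length in frames and, at 30 FPS, in seconds.

```python
# Dataset statistics as reported in this card.
episodes = 3300
frames = 3_522_774
fps = 30

frames_per_episode = frames / episodes            # average frames per episode
seconds_per_episode = frames_per_episode / fps    # average episode duration in seconds

print(f"{frames_per_episode:.1f} frames/episode, {seconds_per_episode:.1f} s/episode")
```

This works out to roughly 1,067 frames (about 36 seconds) per episode.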

Training Hyperparameters

  • Steps: 13761
  • Epochs: 1.00
  • Per-device batch size: 16
  • GPUs: 2
  • Gradient accumulation: 8
  • Effective batch size: 256
  • Mixed precision: bf16
  • Policy dtype: bfloat16
  • Chunk size: 16
  • Action steps: 16
  • Gradient checkpointing: true
  • Compile model: false
  • DataLoader workers: 8
  • DataLoader prefetch factor: 2
  • Persistent workers: true
  • Pin memory: true
  • Preprocess in workers: true
  • DDP find unused parameters: true
  • Seed: 1000
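The effective batch size and the 1.00-epoch figure follow directly from these settings and the dataset's frame count; a small sanity check:

```python
per_device_batch = 16
num_gpus = 2
grad_accum = 8
steps = 13_761
dataset_frames = 3_522_774  # from the Dataset section

# Effective batch size: samples consumed per optimizer step across all GPUs.
effective_batch = per_device_batch * num_gpus * grad_accum

# Epochs: total frames seen during training divided by dataset size.
epochs = steps * effective_batch / dataset_frames

print(effective_batch, round(epochs, 2))
```

This reproduces the effective batch size of 256 and confirms that 13761 steps cover the dataset almost exactly once.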

Optimizer and Scheduler

  • Optimizer: AdamW
  • Learning rate: 2.5e-5
  • Betas: [0.9, 0.95]
  • Epsilon: 1e-8
  • Weight decay: 0.01
  • Gradient clip norm: 1.0
  • Scheduler: cosine decay with warmup
  • Configured warmup steps: 1000
  • Effective warmup steps: 458
  • Configured decay steps: 30000
  • Effective decay steps: 13761
  • Final decay LR: 2.5e-6

The scheduler was automatically rescaled because the actual run length (num_training_steps=13761) was smaller than the configured num_decay_steps=30000: the warmup was shortened proportionally and the decay clamped to the run length.
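The rescaling reported above can be reproduced with simple arithmetic (a sketch of the scaling rule implied by the numbers; the exact rounding used by the trainer is an assumption):

```python
configured_warmup = 1000
configured_decay = 30_000
training_steps = 13_761

# Warmup is scaled down in proportion to the shortened schedule,
# and decay is clamped to the actual number of training steps.
scale = training_steps / configured_decay
effective_warmup = int(configured_warmup * scale)
effective_decay = training_steps

print(effective_warmup, effective_decay)
```

This matches the effective warmup of 458 steps and effective decay of 13761 steps listed above.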

Final Training Log Snapshot

The final training metrics logged near the end of the run were:

  • step=13760/13761
  • epoch=1.00
  • loss=0.009
  • grad_norm=0.259
  • lr=2.5e-06
  • updt_s=1.658
  • data_s=0.017

Training completed successfully on 2026-05-13 18:37:47 UTC.
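Given the per-update time in the log above, the total wall-clock time of the run can be roughly estimated (assuming updt_s stayed close to 1.658 s throughout; startup, data loading, and checkpointing overhead are ignored):

```python
steps = 13_761
seconds_per_update = 1.658  # updt_s from the final log line

total_seconds = steps * seconds_per_update
print(f"~{total_seconds / 3600:.1f} hours of training")
```

This puts the run at roughly 6.3 hours of pure update time on the 2-GPU setup.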

Files

This repository includes only the inference/evaluation policy files from pretrained_model:

  • config.json
  • model.safetensors
  • train_config.json
  • policy_preprocessor.json
  • policy_preprocessor_step_2_normalizer_processor.safetensors
  • policy_postprocessor.json
  • policy_postprocessor_step_0_unnormalizer_processor.safetensors

Optimizer state and other resumable training-state files are intentionally excluded.

Evaluation Status

No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.

Reproducibility

Training was launched from the AutoDataCollector LeRobot workspace with the following Pi0.5 IsaacLab training script configuration:

```shell
DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
POLICY_PATH=lerobot/pi05_base \
BATCH_SIZE=16 \
GRADIENT_ACCUMULATION_STEPS=8 \
NUM_GPUS=2 \
STEPS=13761 \
MIXED_PRECISION=bf16 \
POLICY_DTYPE=bfloat16 \
CHUNK_SIZE=16 \
N_ACTION_STEPS=16 \
GRADIENT_CHECKPOINTING=true \
FREEZE_VISION_ENCODER=false \
TRAIN_EXPERT_ONLY=false \
NUM_WORKERS=8 \
DATALOADER_PREFETCH_FACTOR=2 \
DATALOADER_PERSISTENT_WORKERS=true \
DATALOADER_PIN_MEMORY=true \
PREPROCESS_IN_WORKERS=true \
OPTIMIZER_LR=2.5e-5 \
OPTIMIZER_WEIGHT_DECAY=0.01 \
OPTIMIZER_GRAD_CLIP_NORM=1.0 \
SCHEDULER_WARMUP_STEPS=1000 \
SCHEDULER_DECAY_STEPS=30000 \
SCHEDULER_DECAY_LR=2.5e-6 \
./lerobot/scripts/train_pi05_isaaclab.sh
```