Pi0.5 IsaacLab Multi-Task 1 Epoch

This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi.

Model Details

  • Base model: lerobot/pi05_base
  • Policy type: pi05
  • Training type: full fine-tuning
  • Vision encoder frozen: no
  • Action expert only: no
  • Checkpoint: final checkpoint at step 13761
  • Training length: 1.00 epoch
  • Precision: bfloat16
  • Format: safetensors
  • Parameters: ~4B

Dataset

  • Dataset: CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
  • Robot: SO-101 follower
  • Episodes: 3300
  • Frames: 3,522,774
  • Tasks: 800
  • FPS: 30
  • Visual inputs: observation.images.top, observation.images.left_wrist
  • State/action dimensions: 6 DoF robot state/action, padded by the Pi0.5 policy configuration as needed
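The counts above admit a quick back-of-envelope sanity check on dataset scale (assuming the episode and frame counts are exact): average episode length in frames and, at 30 FPS, in seconds.

```python
# Dataset statistics as reported in this card.
episodes = 3300
frames = 3_522_774
fps = 30

frames_per_episode = frames / episodes            # average frames per episode
seconds_per_episode = frames_per_episode / fps    # average episode duration in seconds

print(f"{frames_per_episode:.1f} frames/episode, {seconds_per_episode:.1f} s/episode")
```

This works out to roughly 1,067 frames (about 36 seconds) per episode.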

Training Hyperparameters

  • Steps: 13761
  • Epochs: 1.00
  • Per-device batch size: 16
  • GPUs: 2
  • Gradient accumulation: 8
  • Effective batch size: 256
  • Mixed precision: bf16
  • Policy dtype: bfloat16
  • Chunk size: 16
  • Action steps: 16
  • Gradient checkpointing: true
  • Compile model: false
  • DataLoader workers: 8
  • DataLoader prefetch factor: 2
  • Persistent workers: true
  • Pin memory: true
  • Preprocess in workers: true
  • DDP find unused parameters: true
  • Seed: 1000
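The effective batch size and the 1.00-epoch figure follow directly from these settings and the dataset's frame count; a small sanity check:

```python
per_device_batch = 16
num_gpus = 2
grad_accum = 8
steps = 13_761
dataset_frames = 3_522_774  # from the Dataset section

# Effective batch size: samples consumed per optimizer step across all GPUs.
effective_batch = per_device_batch * num_gpus * grad_accum

# Epochs: total frames seen during training divided by dataset size.
epochs = steps * effective_batch / dataset_frames

print(effective_batch, round(epochs, 2))
```

This reproduces the effective batch size of 256 and confirms that 13761 steps cover the dataset almost exactly once.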

Optimizer and Scheduler

  • Optimizer: AdamW
  • Learning rate: 2.5e-5
  • Betas: [0.9, 0.95]
  • Epsilon: 1e-8
  • Weight decay: 0.01
  • Gradient clip norm: 1.0
  • Scheduler: cosine decay with warmup
  • Configured warmup steps: 1000
  • Effective warmup steps: 458
  • Configured decay steps: 30000
  • Effective decay steps: 13761
  • Final decay LR: 2.5e-6

The scheduler was automatically rescaled because the actual run length (num_training_steps=13761) was smaller than the configured num_decay_steps=30000: the warmup was shortened proportionally and the decay clamped to the run length.
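The rescaling reported above can be reproduced with simple arithmetic (a sketch of the scaling rule implied by the numbers; the exact rounding used by the trainer is an assumption):

```python
configured_warmup = 1000
configured_decay = 30_000
training_steps = 13_761

# Warmup is scaled down in proportion to the shortened schedule,
# and decay is clamped to the actual number of training steps.
scale = training_steps / configured_decay
effective_warmup = int(configured_warmup * scale)
effective_decay = training_steps

print(effective_warmup, effective_decay)
```

This matches the effective warmup of 458 steps and effective decay of 13761 steps listed above.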

Final Training Log Snapshot

The final training metrics logged near the end of the run were:

  • step=13760/13761
  • epoch=1.00
  • loss=0.009
  • grad_norm=0.259
  • lr=2.5e-06
  • updt_s=1.658
  • data_s=0.017

Training completed successfully on 2026-05-13 18:37:47 UTC.
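Given the per-update time in the log above, the total wall-clock time of the run can be roughly estimated (assuming updt_s stayed close to 1.658 s throughout; startup, data loading, and checkpointing overhead are ignored):

```python
steps = 13_761
seconds_per_update = 1.658  # updt_s from the final log line

total_seconds = steps * seconds_per_update
print(f"~{total_seconds / 3600:.1f} hours of training")
```

This puts the run at roughly 6.3 hours of pure update time on the 2-GPU setup.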

Files

This repository includes only the inference/evaluation policy files from pretrained_model:

  • config.json
  • model.safetensors
  • train_config.json
  • policy_preprocessor.json
  • policy_preprocessor_step_2_normalizer_processor.safetensors
  • policy_postprocessor.json
  • policy_postprocessor_step_0_unnormalizer_processor.safetensors

Optimizer state and other resumable training-state files are intentionally excluded.

Evaluation Status

No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.

Reproducibility

Training was launched from the AutoDataCollector LeRobot workspace with the following Pi0.5 IsaacLab training script configuration:

```shell
DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
POLICY_PATH=lerobot/pi05_base \
BATCH_SIZE=16 \
GRADIENT_ACCUMULATION_STEPS=8 \
NUM_GPUS=2 \
STEPS=13761 \
MIXED_PRECISION=bf16 \
POLICY_DTYPE=bfloat16 \
CHUNK_SIZE=16 \
N_ACTION_STEPS=16 \
GRADIENT_CHECKPOINTING=true \
FREEZE_VISION_ENCODER=false \
TRAIN_EXPERT_ONLY=false \
NUM_WORKERS=8 \
DATALOADER_PREFETCH_FACTOR=2 \
DATALOADER_PERSISTENT_WORKERS=true \
DATALOADER_PIN_MEMORY=true \
PREPROCESS_IN_WORKERS=true \
OPTIMIZER_LR=2.5e-5 \
OPTIMIZER_WEIGHT_DECAY=0.01 \
OPTIMIZER_GRAD_CLIP_NORM=1.0 \
SCHEDULER_WARMUP_STEPS=1000 \
SCHEDULER_DECAY_STEPS=30000 \
SCHEDULER_DECAY_LR=2.5e-6 \
./lerobot/scripts/train_pi05_isaaclab.sh
```