pi05-so100-diverse / train_cloud.sh

Commit History

Retry final upload 5x, skip shutdown on failure
684902a

bot committed on

Auto-detect resume from checkpoint, save_freq 5000
44038e9

bot committed on

Remove resume (start fresh with async upload + save_freq 5000)
5594bc3

bot committed on

Add config_path for resume
66cf9d1

bot committed on

Save every 5000 steps, resume from checkpoint, async upload
3e13241

bot committed on

Add job_name fix
6d47053

bot committed on

Add short job_name to fix wandb artifact name too long
683125d

bot committed on

Batch 16, no gradient checkpointing, 340k steps (1 epoch)
148f20d

bot committed on

Re-enable gradient checkpointing (OOM without it), batch 32
fc597b1

bot committed on

Batch 32, 170k steps (1 epoch), no gradient checkpointing
4f31a1a

bot committed on

Disable gradient checkpointing to speed up training
30204fa

bot committed on

Clean bootstrap: install lerobot[pi] directly, no patches, no PYTHONPATH
4d47bd9

bot committed on

Move so100_dataset into lerobot package, remove PYTHONPATH hack
3f4427e

bot committed on

Add repo to PYTHONPATH for so100_dataset import
0f1e257

bot committed on

Update lerobot to latest with SO100 rename_map fix
a8eb6e5

bot committed on
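
The most recent commit in the list (684902a, "Retry final upload 5x, skip shutdown on failure") describes a pattern that could be sketched roughly as below. This is not the contents of train_cloud.sh. The names retry, final_upload_then_shutdown, upload_final, shutdown_machine and RETRY_DELAY are all hypothetical placeholders chosen for illustration, and the 30-second default delay is an assumption.

```shell
# Hypothetical sketch, not the actual train_cloud.sh.

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  retry_max="$1"
  retry_delay="$2"
  shift 2
  retry_attempt=1
  while [ "$retry_attempt" -le "$retry_max" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${retry_attempt}/${retry_max} failed: $*" >&2
    # Wait between attempts, but not after the last one.
    if [ "$retry_attempt" -lt "$retry_max" ]; then
      sleep "$retry_delay"
    fi
    retry_attempt=$((retry_attempt + 1))
  done
  return 1
}

# upload_final and shutdown_machine are placeholders (assumptions) that the
# caller must define, e.g. as wrappers around the real upload and poweroff.
final_upload_then_shutdown() {
  if retry 5 "${RETRY_DELAY:-30}" upload_final; then
    shutdown_machine
  else
    # Skip shutdown so the checkpoint on local disk can still be recovered.
    echo "final upload failed after 5 attempts; leaving machine running" >&2
    return 1
  fi
}
```

Skipping the shutdown on failure keeps the instance, and therefore any checkpoint that exists only on its local disk, available for a manual upload.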