Monitor script: show step time, games/sec, ETA from synced metrics 5a4ed63 thomas-schweich committed about 16 hours ago
Fix monitor script to show per-variant metrics from pod 8e86dac thomas-schweich committed about 16 hours ago
Fix SSH: generate host keys, use 'ip' field from runpodctl a47b56d thomas-schweich committed about 16 hours ago
Remove hardcoded IP from monitor script, resolve SSH via runpodctl 660f2d0 thomas-schweich committed about 16 hours ago
Add post-training evals, /dev/shm checkpoints, async HF push, and _orig_mod fix 87b2fa6 thomas-schweich committed about 17 hours ago