rob PRO
rob-x-ai
AI & ML interests
∙ Open Research
∙ Making Generative AI accessible to anyone on consumer hardware
Recent Activity
updated a Space about 19 hours ago: rob-x-ai/genesis-1b-playground
liked a dataset about 21 hours ago: open-index/hacker-news
posted an update 4 days ago:
Genesis 1B is now public. 🔥
I’m training a 1.003B-parameter model from scratch on 2× RTX 4090s and have opened a public playground for early checkpoints.
The real bottleneck wasn’t training.
It was checkpointing:
FSDP full-state gather over PCIe = NCCL timeout hell
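For a rough sense of scale, here is a back-of-the-envelope sizing sketch. The byte counts are assumptions, not figures from the run: fp32 weights (4 bytes per parameter) plus two fp32 AdamW moment buffers (8 bytes per parameter). Only the parameter count and the GPU count come from the post.

```python
# Rough checkpoint sizing. Byte counts are ASSUMPTIONS (fp32 weights + AdamW),
# not measurements from the actual Genesis 1B run.
PARAMS = 1_003_000_000          # 1.003B parameters (from the post)
WORLD_SIZE = 2                  # 2x RTX 4090 (from the post)
BYTES_WEIGHTS = 4               # assumed: fp32 weights
BYTES_OPTIMIZER = 8             # assumed: two fp32 AdamW moments per parameter


def checkpoint_bytes(params: int, world_size: int) -> tuple[int, int, int]:
    """Return (full_state, per_rank_shard, received_by_rank0) in bytes."""
    full_state = params * (BYTES_WEIGHTS + BYTES_OPTIMIZER)
    # With even sharding, each rank owns roughly 1/world_size of the state.
    per_rank_shard = -(-full_state // world_size)  # ceiling division
    # A full-state gather makes rank 0 pull every other rank's shard.
    received_by_rank0 = full_state - per_rank_shard
    return full_state, per_rank_shard, received_by_rank0


full_state, per_rank_shard, received_by_rank0 = checkpoint_bytes(PARAMS, WORLD_SIZE)
print(f"full state:        {full_state / 1e9:.3f} GB")
print(f"per-rank shard:    {per_rank_shard / 1e9:.3f} GB")
print(f"pulled by rank 0:  {received_by_rank0 / 1e9:.3f} GB")
```

Under these assumptions a full-state gather leaves rank 0 holding about 12 GB and pulling about 6 GB from its peer inside one collective, whereas a sharded save has each rank write only its own roughly 6 GB.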
Switching to DCP sharded checkpoints changed the trajectory of the run.
- Playground: https://huggingface.co/spaces/rob-x-ai/genesis-1b-playground
- Write-up: https://kroonen.ai/blog/distributed-checkpoint-failures-rtx4090/