# Training on Lightning AI

This guide explains how to run CommitGuard GRPO training on a Lightning AI GPU Studio.
## Recommended Instance

- GPU: An NVIDIA L4 (24 GB) or A10G (24 GB) is sufficient for Llama-3.2-3B with Unsloth 4-bit quantization.
- Image: The default Linux / PyTorch images are fine; the setup script handles all dependencies.
## Setup & Train in One Step

- Open a terminal in your Lightning AI Studio.
- Run the setup script:

```bash
bash scripts/lightning_setup.sh
```
What the script does:

- Installs `uv` for fast dependency management.
- Creates a virtual environment and installs all requirements (Unsloth, TRL, etc.).
- Starts the `commitguard_env` server in the background (via `tmux` if available).
- Runs `scripts/train_grpo.py`.
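The "via `tmux` if available" step above can be sketched as follows. This is an illustrative snippet, not the actual contents of `scripts/lightning_setup.sh`: the session name `env_server` comes from this guide, but the server command is a placeholder.

```bash
# Hypothetical sketch of the server-launch logic: prefer a detached tmux
# session (so logs can be attached to later with `tmux attach`), and fall
# back to nohup when tmux is not installed. The python invocation is a
# placeholder, not the real server entrypoint.
pick_launch_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "tmux new-session -d -s env_server 'python -m commitguard_env'"
  else
    echo "nohup python -m commitguard_env > env_server.log 2>&1 &"
  fi
}

# With tmux installed, the server would run in a named background session:
pick_launch_cmd tmux
```

Running the server inside a named `tmux` session is what makes the `tmux attach -t env_server` step below possible.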
## Manual Steps (Optional)

### 1. View Training Logs

If you want to see the environment server logs:

```bash
tmux attach -t env_server
```

(Press `Ctrl+B`, then `D` to detach.)
### 2. Hugging Face Integration

To save your model to the Hugging Face Hub, log in before training:

```bash
huggingface-cli login
```
### 3. Checkpoints

Checkpoints and the final merged LoRA adapter are saved to:

```
outputs/commitguard-llama-3b/final
```
## Troubleshooting

- OOM Error: If you hit an Out-Of-Memory error, try reducing `--batch-size` or `--num-generations` in `scripts/train_grpo.py`.
- Server Connection: If training fails with connection errors, check that the server started correctly:

  ```bash
  curl http://localhost:8000/health
  ```
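If a one-off `curl` check is not enough (for example, the server needs a few seconds to come up), a small retry loop can wait for the health endpoint. This helper is hypothetical and not part of the repo; it only assumes the `/health` URL mentioned above.

```bash
# Hypothetical helper: poll a health endpoint until it responds or the
# attempt budget runs out. Returns 0 on success, 1 on timeout.
wait_for_health() {
  url="$1"
  tries="${2:-10}"   # default: 10 attempts, one second apart
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -s: silent, -f: treat HTTP errors (4xx/5xx) as failures
    if curl -sf "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage (run before training starts):
#   wait_for_health "http://localhost:8000/health" 30 || echo "server not up"
```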