# Training on Lightning AI

This guide explains how to run CommitGuard GRPO training on a Lightning AI GPU Studio.

## Recommended Instance

- **GPU:** An NVIDIA L4 (24 GB) or A10G (24 GB) is sufficient for Llama-3.2-3B with Unsloth 4-bit quantization.
- **Image:** The default Linux / PyTorch images are fine; the setup script handles all dependencies.

## Setup & Train in One Step

1. Open a terminal in your Lightning AI Studio.
2. Run the setup script:

```bash
bash scripts/lightning_setup.sh
```

### What the script does

1. Installs `uv` for fast dependency management.
2. Creates a virtual environment and installs all requirements (Unsloth, TRL, etc.).
3. Starts the `commitguard_env` server in the background (via `tmux` if available).
4. Runs `scripts/train_grpo.py`.

## Manual Steps (Optional)

### 1. View Training Logs

To view the environment server logs:

```bash
tmux attach -t env_server
```

Press `Ctrl+B`, then `D` to detach.

### 2. Hugging Face Integration

To push your model to the Hugging Face Hub, log in before training:

```bash
huggingface-cli login
```

### 3. Checkpoints

Checkpoints and the final merged LoRA adapter are saved to:

`outputs/commitguard-llama-3b/final`

## Troubleshooting

- **OOM Error:** If you hit an Out-Of-Memory error, reduce `--batch-size` or `--num-generations` in `scripts/train_grpo.py`.
- **Server Connection:** If training fails with connection errors, verify the server started correctly with `curl http://localhost:8000/health`.
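
Since training depends on the environment server being reachable, it can help to wait for the health endpoint before (or instead of) debugging connection errors. The sketch below is an illustrative helper, not part of the repository's scripts; it only assumes the `http://localhost:8000/health` endpoint mentioned above.

```bash
# wait_for_health URL [MAX_ATTEMPTS] [SLEEP_SECS]
# Polls URL until curl gets a successful response, or gives up
# after MAX_ATTEMPTS tries. Returns 0 on success, 1 on timeout.
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}"
  local i
  for i in $(seq 1 "$attempts"); do
    # -s: silent, -f: treat HTTP errors (4xx/5xx) as failures
    if curl -sf "$url" > /dev/null; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example: block until the env server is ready, then warn on timeout.
# wait_for_health http://localhost:8000/health 10 3 || echo "server not ready" >&2
```

If this returns nonzero, reattach to the `env_server` tmux session (see above) and inspect the server logs for startup errors.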