# Training on Lightning AI

This guide explains how to run CommitGuard GRPO training on a Lightning AI GPU Studio.

## Recommended Instance

- **GPU:** An NVIDIA L4 (24GB) or A10G (24GB) is sufficient for Llama-3.2-3B with Unsloth 4-bit.
- **Image:** The default Linux / PyTorch images are fine; the setup script handles dependencies.
## Setup & Train in One Step

1. Open a terminal in your Lightning AI Studio.
2. Run the setup script:

```bash
bash scripts/lightning_setup.sh
```
## What the Script Does

1. Installs `uv` for fast dependency management.
2. Creates a virtual environment and installs all requirements (Unsloth, TRL, etc.).
3. Starts the `commitguard_env` server in the background (via `tmux` if available).
4. Runs `scripts/train_grpo.py`.
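The steps above can be sketched as a single shell function. This is a hedged illustration, not the script itself: `scripts/lightning_setup.sh` in the repo is authoritative, and `python -m commitguard_env` as the server entry point is an assumption.

```shell
# Hypothetical sketch of the four setup steps; the real lightning_setup.sh
# may differ in detail.
lightning_setup() {
  curl -LsSf https://astral.sh/uv/install.sh | sh   # 1. install uv
  uv venv .venv && . .venv/bin/activate             # 2. create venv ...
  uv pip install -r requirements.txt                #    ... and install requirements
  tmux new-session -d -s env_server \
    "python -m commitguard_env"                     # 3. env server in a detached tmux session
  python scripts/train_grpo.py                      # 4. run GRPO training
}
```

The `env_server` session name matches the tmux commands shown below.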
## Manual Steps (Optional)

### 1. View Training Logs

If you want to see the environment server logs:

```bash
tmux attach -t env_server
```

(Press `Ctrl+B`, then `D` to detach.)
### 2. Hugging Face Integration

To save your model to the Hugging Face Hub, log in before training:

```bash
huggingface-cli login
```
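If the Studio terminal runs unattended, a non-interactive login can be scripted with the CLI's `--token` flag. A minimal sketch, assuming the token is exported in an `HF_TOKEN` environment variable (a convention, not something this guide defines):

```shell
# Log in with a token from the environment, or remind the user to log in
# interactively when no token is set.
hf_login() {
  if [ -n "${HF_TOKEN:-}" ]; then
    huggingface-cli login --token "$HF_TOKEN"
  else
    echo "HF_TOKEN not set; run 'huggingface-cli login' interactively"
    return 1
  fi
}
```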
### 3. Checkpoints

Checkpoints and the final merged LoRA adapter will be saved to:

`outputs/commitguard-llama-3b/final`
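Once logged in, the final folder can be pushed to the Hub with `huggingface-cli upload`. A sketch, where the repo id is a placeholder you would replace with your own:

```shell
# Upload the merged adapter to a Hub repo; "$1" is the target repo id,
# e.g. your-username/commitguard-llama-3b (placeholder, not defined by this guide).
push_final() {
  huggingface-cli upload "$1" outputs/commitguard-llama-3b/final
}
```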
## Troubleshooting

- **OOM Error:** If you hit an Out-Of-Memory error, try reducing `--batch-size` or `--num-generations` in `scripts/train_grpo.py`.
- **Server Connection:** If training fails with connection errors, verify the server started correctly with `curl http://localhost:8000/health`.
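For flaky startups, the health check above can be wrapped in a small retry helper so connection errors surface before training begins. A sketch (the `/health` endpoint and port 8000 are from this guide; the helper itself is illustrative):

```shell
# Retry any command once per second, up to a given number of attempts.
wait_for() {
  tries=$1; shift
  i=1
  while [ "$i" -le "$tries" ]; do
    "$@" && return 0       # success: stop retrying
    i=$((i + 1))
    sleep 1
  done
  return 1                 # gave up after $tries attempts
}
```

Usage: `wait_for 30 curl -sf http://localhost:8000/health` before launching `scripts/train_grpo.py`.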