# Free Deployment Guide for Stack 2.9


This guide covers deploying Stack 2.9 on free-tier platforms.
|
|
---


## Option 1: HuggingFace Spaces (Free Inference)


### Step 1: Create a Space
```bash
# Go to https://huggingface.co/spaces and create a new Space
# Choose: Docker SDK, Python 3.11, Small (2 CPU, 4GB)
```
|
|
### Step 2: Push Your Model
```python
# Upload your fine-tuned model to the HF Hub
from huggingface_hub import HfApi

api = HfApi()
api.upload_folder(
    folder_path="./stack-2.9-7b",
    repo_id="yourusername/stack-2.9",
    repo_type="model",
)
```
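
Before uploading, it can help to sanity-check that the folder actually contains a complete checkpoint, so a typo in `folder_path` doesn't produce an empty repo. A minimal sketch; the expected filenames are assumptions based on a typical Transformers-style checkpoint layout:

```python
from pathlib import Path

def check_model_folder(folder: str) -> list[str]:
    """Return a list of missing items expected in a Transformers-style checkpoint."""
    path = Path(folder)
    required = ["config.json", "tokenizer_config.json"]
    missing = [name for name in required if not (path / name).exists()]
    # Weights may be sharded across files; accept .safetensors or legacy .bin
    if not list(path.glob("*.safetensors")) and not list(path.glob("*.bin")):
        missing.append("model weights (*.safetensors or *.bin)")
    return missing
```

Calling `check_model_folder("./stack-2.9-7b")` before `upload_folder` and aborting on a non-empty result is usually enough to catch a bad path.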
|
|
### Step 3: Configure API URL
Set environment variables in the Space settings:
- `API_URL`: your model inference URL
- `HF_TOKEN`: your HF token
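
Inside the Space, the app reads these variables to talk to the inference endpoint. A sketch of assembling an authenticated request; the payload shape here is an assumption, so adjust it to whatever your inference server actually expects:

```python
import os

def build_inference_request(prompt: str) -> tuple[str, dict, dict]:
    """Assemble (url, headers, payload) from the Space's environment variables."""
    api_url = os.environ["API_URL"]            # set in the Space settings
    hf_token = os.environ.get("HF_TOKEN", "")  # optional for public endpoints
    headers = {"Content-Type": "application/json"}
    if hf_token:
        headers["Authorization"] = f"Bearer {hf_token}"
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": 256}}
    return api_url, headers, payload
```

Failing fast with a `KeyError` when `API_URL` is unset surfaces misconfiguration in the Space logs instead of producing silent bad requests.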
|
|
### Step 4: Deploy
```bash
# Clone the Space repo, copy in the deployment files, then commit and push
git clone https://huggingface.co/spaces/yourusername/stack-2.9
cd stack-2.9
cp ../deploy/hfSpaces/* .   # adjust the path to where deploy/hfSpaces lives
git add . && git commit -m "Deploy Stack 2.9" && git push
```
|
|
---


## Option 2: Together AI Fine-tuning (Free Credits)


### Free Tier Limits
- Fine-tuning of models up to 7B parameters
- Limited training minutes (varies by promotion)
- Requires a Together AI account
|
|
### Setup
```bash
# Get an API key from https://together.ai
export TOGETHER_API_KEY="your-key"

# Fine-tune the 7B model (free-tier friendly)
python stack/training/together_finetune.py \
    --model 7b \
    --data data/final/train.jsonl \
    --epochs 3
```
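
Validating `train.jsonl` locally before submitting avoids burning limited free credits on a malformed file. A sketch that assumes chat-style records with a `messages` key; adapt the expected schema to your actual data format:

```python
import json

def validate_jsonl(path: str) -> list[str]:
    """Return human-readable errors for lines that are not valid chat records."""
    errors = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append(f"line {i}: invalid JSON ({e.msg})")
                continue
            if "messages" not in record:
                errors.append(f"line {i}: missing 'messages' key")
    return errors
```

An empty return value means every line parsed and carried the expected key; anything else is worth fixing before launching the job.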
|
|
### Use Fine-tuned Model
```python
from together import Together

client = Together(api_key="your-key")

response = client.chat.completions.create(
    model="your-finetuned-model",
    messages=[{"role": "user", "content": "Write a function"}],
)
print(response.choices[0].message.content)
```
|
|
---


## Option 3: Google Colab (Free Training)


### Run Training
```python
# Open colab_train_stack29.ipynb in Google Colab
# Select a GPU runtime (free tier: T4, ~15GB VRAM)

# For the 7B model (runs on the free tier):
batch_size = 2             # reduced to fit in 15GB VRAM
gradient_accumulation = 8  # keeps the effective batch size at 2 * 8 = 16
```
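
The two knobs above trade per-step memory for extra accumulation steps: shrinking `batch_size` and raising `gradient_accumulation` in proportion keeps the amount of data per optimizer update constant. A quick sanity check:

```python
def effective_batch_size(batch_size: int, gradient_accumulation: int) -> int:
    """Samples per optimizer step: micro-batches accumulated before each update."""
    return batch_size * gradient_accumulation

# Free-tier T4 settings from above: 2 micro-batches * 8 accumulation steps = 16
assert effective_batch_size(2, 8) == 16
```

If you later move to a bigger GPU, you can raise `batch_size` and lower `gradient_accumulation` by the same factor without changing training dynamics.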
|
|
### Model Sizes for Free Tier
| Model | VRAM Needed | Free Tier? |
|-------|-------------|------------|
| 1.5B  | ~4GB        | ✅ Yes |
| 3B    | ~8GB        | ✅ Yes (T4) |
| 7B    | ~16GB       | ⚠️ Limited |
| 32B   | ~64GB       | ❌ No |
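
The VRAM column tracks a simple rule of thumb: fp16/bf16 weights cost 2 bytes per parameter, and the table adds headroom on top for activations, KV cache, and CUDA overhead. A sketch of the weights-only floor behind those figures:

```python
def fp16_weights_gb(params_billion: float) -> float:
    """Memory for model weights alone at 2 bytes per parameter (fp16/bf16)."""
    return params_billion * 2

# Each table entry sits at or above the raw-weights floor, leaving headroom:
for size_b, table_gb in [(1.5, 4), (3, 8), (7, 16), (32, 64)]:
    assert fp16_weights_gb(size_b) <= table_gb
```

Quantizing to 4-bit (~0.5 bytes/parameter) is the usual trick for squeezing the 7B model onto a free-tier T4 with room to spare.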
|
|
---


## Option 4: RunPod / Vast.ai (Cheap, Not Free)


### Quick Start
```bash
# Deploy on RunPod (~$0.20/hour for an A100)
cd stack/deploy
./runpod_deploy.sh --template runpod-template.json

# Deploy on Vast.ai (~$0.15/hour)
./vastai_deploy.sh --template vastai-template.json
```
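
Using the rates quoted above, it is easy to estimate how far the sign-up credits stretch before billing starts. A sketch; the rates and credit amounts are the figures from this guide and change often:

```python
def credit_hours(credit_usd: float, rate_usd_per_hour: float) -> float:
    """How many GPU-hours a fixed sign-up credit buys at a given hourly rate."""
    return credit_usd / rate_usd_per_hour

# RunPod:  $10 credit at ~$0.20/hr -> 50 GPU-hours before paying
# Vast.ai:  $5 credit at ~$0.15/hr -> ~33 GPU-hours before paying
```

Either budget comfortably covers several fine-tuning runs of the 7B model, so the "not free" options are effectively free for initial experiments.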
|
|
---


## Recommended Free Stack


```
┌─────────────────────────────────────────────┐
│ Stack 2.9 Free Deployment Stack             │
├─────────────────────────────────────────────┤
│ Model:      Qwen2.5-Coder-7B                │
│ Fine-tune:  Together AI (free credits)      │
│ Deploy:     HuggingFace Spaces (free)       │
│ UI:         Gradio (included in Spaces)     │
└─────────────────────────────────────────────┘
```
|
|
## Cost Comparison


| Platform | Cost | What's Free |
|----------|------|-------------|
| HF Spaces | $0 | 2 CPU, 4GB hosting |
| Together AI | varies | Fine-tuning credits |
| Colab | $0 | Limited daily GPU time (quota varies) |
| RunPod | ~$0.20/hr | First $10 credit |
| Vast.ai | ~$0.15/hr | First $5 credit |