# Free Deployment Guide for Stack 2.9
This guide covers deploying Stack 2.9 on free-tier platforms.
---
## Option 1: HuggingFace Spaces (Free Inference)
### Step 1: Create Space
1. Go to https://huggingface.co/spaces and create a new Space.
2. Choose Docker as the SDK, Python 3.11, and Small hardware (2 CPU, 4 GB).
### Step 2: Push Your Model
```python
# Upload your fine-tuned model to the Hugging Face Hub
from huggingface_hub import HfApi

api = HfApi()
api.upload_folder(
    folder_path="./stack-2.9-7b",
    repo_id="yourusername/stack-2.9",
    repo_type="model",
)
```
### Step 3: Configure API URL
Set the following environment variables in your Space settings:
- `API_URL`: Your model inference URL
- `HF_TOKEN`: Your HF token
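At startup, the Space app can read these variables and fail fast if they are missing. A minimal sketch (the helper name is illustrative, not part of the repo):

```python
import os

def load_space_config() -> dict:
    """Read the Space's environment variables, failing fast if either is missing."""
    api_url = os.environ.get("API_URL")
    hf_token = os.environ.get("HF_TOKEN")
    if not api_url or not hf_token:
        raise RuntimeError("Set API_URL and HF_TOKEN in the Space settings")
    return {"api_url": api_url, "hf_token": hf_token}
```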
### Step 4: Deploy
```bash
# Clone the Space and push the deployment files
git clone https://huggingface.co/spaces/yourusername/stack-2.9
cd stack-2.9
cp ../deploy/hfSpaces/* .
git add . && git commit -m "Add Stack 2.9 deployment files" && git push
```
---
## Option 2: Together AI Fine-tuning (Free Credits)
### Free Tier Limits
- Fine-tuning for models up to 7B parameters
- Limited training minutes (varies by promotion)
- Requires a Together AI account
### Setup
```bash
# Get an API key from https://together.ai
export TOGETHER_API_KEY="your-key"

# Fine-tune the 7B model (free-tier friendly)
python stack/training/together_finetune.py \
    --model 7b \
    --data data/final/train.jsonl \
    --epochs 3
```
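`train.jsonl` holds one JSON object per line. Assuming a chat-message schema (check Together's fine-tuning docs for the exact format your model expects), each record can be built like this:

```python
import json

def to_chat_record(prompt: str, completion: str) -> str:
    """Serialize one training pair as a JSONL line in chat-message format."""
    record = {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion},
        ]
    }
    return json.dumps(record)

# One line of data/final/train.jsonl
line = to_chat_record("Write an add function", "def add(a, b):\n    return a + b")
```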
### Use Fine-tuned Model
```python
from together import Together

client = Together(api_key="your-key")
response = client.chat.completions.create(
    model="your-finetuned-model",
    messages=[{"role": "user", "content": "Write a function"}],
)
print(response.choices[0].message.content)
```
---
## Option 3: Google Colab (Free Training)
### Run Training
```python
# Open colab_train_stack29.ipynb in Google Colab and select a GPU runtime
# (free tier: T4, 15 GB VRAM).

# Settings for the 7B model on the free tier:
batch_size = 2             # reduced to fit in 15 GB VRAM
gradient_accumulation = 8  # preserves an effective batch size of 16
```
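Gradient accumulation trades wall-clock time for memory: the optimizer still sees an effective batch of `batch_size Γ— gradient_accumulation` examples per update. A small sketch of the arithmetic (the 1,000-example dataset size is illustrative):

```python
import math

def effective_batch_size(micro_batch: int, accum_steps: int) -> int:
    """Examples seen per optimizer update when accumulating gradients."""
    return micro_batch * accum_steps

def optimizer_steps_per_epoch(num_examples: int, micro_batch: int, accum_steps: int) -> int:
    """Optimizer updates needed for one pass over the dataset."""
    return math.ceil(num_examples / effective_batch_size(micro_batch, accum_steps))

print(effective_batch_size(2, 8))             # 16
print(optimizer_steps_per_epoch(1000, 2, 8))  # 63
```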
### Model Sizes for Free Tier
| Model | VRAM Needed | Free Tier? |
|-------|-------------|------------|
| 1.5B | ~4GB | βœ… Yes |
| 3B | ~8GB | βœ… Yes (T4) |
| 7B | ~16GB | ⚠️ Limited |
| 32B | ~64GB | ❌ No |
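The VRAM column follows a simple rule of thumb: fp16/bf16 weights take two bytes per parameter, and training adds overhead for gradients, optimizer state, and activations on top. A weights-only lower bound, as a sketch:

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Lower-bound memory for model weights alone (fp16/bf16 = 2 bytes/param).

    Gradients, optimizer state, and activations add more during training,
    which is why the table's figures sit at or above these numbers.
    """
    return params_billions * bytes_per_param

for size in (1.5, 3, 7, 32):
    print(f"{size}B -> ~{weight_memory_gb(size):.0f} GB of weights")
```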
---
## Option 4: RunPod / Vast.ai (Cheap, Not Free)
### Quick Start
```bash
# Deploy on RunPod (~$0.20/hour for A100)
cd stack/deploy
./runpod_deploy.sh --template runpod-template.json
# Deploy on Vast.ai (~$0.15/hour)
./vastai_deploy.sh --template vastai-template.json
```
---
## Recommended Free Stack
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Stack 2.9 Free Deployment Stack              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Model:      Qwen2.5-Coder-7B                 β”‚
β”‚ Fine-tune:  Together AI (free credits)       β”‚
β”‚ Deploy:     HuggingFace Spaces (free)        β”‚
β”‚ UI:         Gradio (included in Spaces)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
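In this stack, the Gradio UI forwards each user prompt to the deployed model endpoint. A minimal sketch of building that request body, assuming an OpenAI-compatible `/chat/completions` endpoint (the endpoint shape and default model name are assumptions; adapt them to whatever server sits behind `API_URL`):

```python
import json

def build_chat_request(user_message: str, model: str = "stack-2.9") -> bytes:
    """Build the JSON body for an OpenAI-style /chat/completions endpoint.

    The endpoint shape and default model name are assumptions; adjust them
    to match the server behind API_URL.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(body).encode("utf-8")
```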
## Cost Comparison
| Platform | Cost | What's Free |
|----------|------|-------------|
| HF Spaces | $0 | 2CPU 4GB hosting |
| Together AI | varies | Fine-tuning credits |
| Colab | $0 | ~0.5hr GPU/day |
| RunPod | $0.20/hr | First $10 credit |
| Vast.ai | $0.15/hr | First $5 credit |
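To budget the paid options, a quick back-of-the-envelope helper (the rates and credits are the table's figures; actual pricing varies by GPU and availability):

```python
def credit_hours(free_credit_usd: float, rate_per_hour_usd: float) -> float:
    """GPU-hours a free credit buys at a given hourly rate."""
    return free_credit_usd / rate_per_hour_usd

def out_of_pocket_usd(hours: float, rate_per_hour_usd: float,
                      free_credit_usd: float = 0.0) -> float:
    """Cost after the free credit is exhausted (never negative)."""
    return max(0.0, hours * rate_per_hour_usd - free_credit_usd)

# RunPod: $10 credit at ~$0.20/hr covers roughly 50 A100-hours
print(round(credit_hours(10, 0.20)))
```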