walidsobhie-code and Claude Opus 4.6 committed on
Commit · 239da7a
Parent(s): 65888d5
feat: Add free deployment support for Stack 2.9
New additions:
- Together AI fine-tuning script (free credits)
- HuggingFace Spaces deployment (free hosting)
- Free deployment guide with cost comparison
- Updated README with free tier options
Enables deployment on:
- HuggingFace Spaces (free inference API)
- Together AI (free fine-tuning)
- Google Colab (free training)
Recommended: Qwen2.5-Coder-7B for free tier
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- README.md +34 -1
- stack/deploy/FREE_DEPLOYMENT.md +132 -0
- stack/deploy/hfSpaces/Dockerfile +26 -0
- stack/deploy/hfSpaces/app.py +147 -0
- stack/training/together_finetune.py +138 -0
README.md
CHANGED
```diff
@@ -130,7 +130,23 @@ Stack 2.9 requires a GPU for optimal performance. Minimum and recommended config
 - Multi-GPU (tensor parallelism) supported for large models
 - Ensure NVIDIA drivers and CUDA toolkit are installed
 
-
+### Free Deployment (No Cost)
+
+Stack 2.9 can be deployed on free platforms:
+
+| Platform | What's Free | How |
+|----------|-------------|-----|
+| **HuggingFace Spaces** | 2 CPU / 4 GB inference | `stack/deploy/FREE_DEPLOYMENT.md` |
+| **Together AI** | Fine-tuning credits | `stack/training/together_finetune.py` |
+| **Google Colab** | ~0.5 hr GPU/day | `colab_train_stack29.ipynb` |
+
+**Recommended for free tier:**
+- Model: `Qwen2.5-Coder-7B` (runs on a free GPU)
+- Fine-tune: Together AI (free credits)
+- Deploy: HuggingFace Spaces (free hosting)
+
+See `stack/deploy/FREE_DEPLOYMENT.md` for a detailed guide.
+For paid deployment (Docker, RunPod, Vast.ai), see `stack/deploy/README.md`.
 
 ### Interactive Chat
 
@@ -424,3 +440,20 @@ Licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.
 <p align="center">
 Built with ❤️ for developers who want an AI that grows with them
 </p>
+
+
+### Free Deployment (No Cost)
+
+Stack 2.9 can run on free platforms:
+
+| Platform | What's Free | Recommended For |
+|----------|-------------|-----------------|
+| **HuggingFace Spaces** | 2 CPU / 4 GB hosting | API deployment |
+| **Together AI** | Fine-tuning credits | Model customization |
+| **Google Colab** | ~0.5 hr GPU/day | Training experiments |
+
+**Free tier model:** Use Qwen2.5-Coder-7B (runs on a free GPU)
+
+See `stack/deploy/FREE_DEPLOYMENT.md` for a detailed guide.
+
+For paid options see `stack/deploy/README.md`.
```
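The "runs on a free GPU" recommendation can be sanity-checked with a standard rule of thumb: weight memory is roughly parameter count times bytes per parameter (2 for fp16/bf16, ~0.5 for 4-bit), with activations and KV cache on top. A minimal sketch; the numbers are estimates, not measurements:

```python
def weights_gib(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate GPU memory for model weights alone, in GiB.

    bytes_per_param: 2.0 for fp16/bf16, ~0.5 for 4-bit quantization.
    Activations and KV cache add several GiB on top of this.
    """
    return params_billion * 1e9 * bytes_per_param / 2**30

# Qwen2.5-Coder-7B in fp16 needs ~13 GiB for weights alone, which is
# why a 15 GiB Colab T4 is tight for it; 4-bit quantization fits easily.
print(round(weights_gib(7), 1))       # fp16
print(round(weights_gib(7, 0.5), 1))  # 4-bit
```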
stack/deploy/FREE_DEPLOYMENT.md
ADDED
# Free Deployment Guide for Stack 2.9

This guide covers deploying Stack 2.9 on free-tier platforms.

---

## Option 1: HuggingFace Spaces (Free Inference)

### Step 1: Create Space
```bash
# Go to https://huggingface.co/spaces and create a new Space
# Choose: Docker, Python 3.11, Small (2 CPU / 4 GB)
```

### Step 2: Push Your Model
```python
# Upload your fine-tuned model to the HuggingFace Hub
from huggingface_hub import HfApi

api = HfApi()
api.upload_folder(
    folder_path="./stack-2.9-7b",
    repo_id="yourusername/stack-2.9",
    repo_type="model",
)
```

### Step 3: Configure API URL
Set environment variables in the Space:
- `API_URL`: Your model inference URL
- `HF_TOKEN`: Your HF token

### Step 4: Deploy
```bash
# Clone the Space, copy in the app files, and push
git clone https://huggingface.co/spaces/yourusername/stack-2.9 space
cp stack/deploy/hfSpaces/* space/
cd space && git add . && git commit -m "Deploy Stack 2.9" && git push
```

---

## Option 2: Together AI Fine-tuning (Free Credits)

### Free Tier Limits
- Up to 7B model fine-tuning
- Limited training minutes (varies by promotion)
- Requires a Together AI account

### Setup
```bash
# Get an API key from https://together.ai
export TOGETHER_API_KEY="your-key"

# Fine-tune a 7B model (free-tier friendly)
python stack/training/together_finetune.py \
    --model 7b \
    --data data/final/train.jsonl \
    --epochs 3
```

### Use Fine-tuned Model
```python
from together import Together

client = Together(api_key="your-key")

response = client.chat.completions.create(
    model="your-finetuned-model",
    messages=[{"role": "user", "content": "Write a function"}],
)
print(response.choices[0].message.content)
```

---

## Option 3: Google Colab (Free Training)

### Run Training
```python
# Open colab_train_stack29.ipynb in Google Colab
# Select a GPU runtime (free tier: T4, 15 GB)

# For the 7B model (free-tier friendly settings):
batch_size = 2             # reduce to fit 15 GB VRAM
gradient_accumulation = 8  # keeps the effective batch size at 16
```

### Model Sizes for Free Tier
| Model | VRAM Needed | Free Tier? |
|-------|-------------|------------|
| 1.5B | ~4 GB | ✅ Yes |
| 3B | ~8 GB | ✅ Yes (T4) |
| 7B | ~16 GB | ⚠️ Limited |
| 32B | ~64 GB | ❌ No |

---

## Option 4: RunPod / Vast.ai (Cheap, Not Free)

### Quick Start
```bash
# Deploy on RunPod (~$0.20/hour for an A100)
cd stack/deploy
./runpod_deploy.sh --template runpod-template.json

# Deploy on Vast.ai (~$0.15/hour)
./vastai_deploy.sh --template vastai-template.json
```

---

## Recommended Free Stack

```
┌─────────────────────────────────────────────┐
│       Stack 2.9 Free Deployment Stack       │
├─────────────────────────────────────────────┤
│ Model:      Qwen2.5-Coder-7B                │
│ Fine-tune:  Together AI (free credits)      │
│ Deploy:     HuggingFace Spaces (free)       │
│ UI:         Gradio (included in Spaces)     │
└─────────────────────────────────────────────┘
```

## Cost Comparison

| Platform | Cost | What's Free |
|----------|------|-------------|
| HF Spaces | $0 | 2 CPU / 4 GB hosting |
| Together AI | varies | Fine-tuning credits |
| Colab | $0 | ~0.5 hr GPU/day |
| RunPod | $0.20/hr | First $10 credit |
| Vast.ai | $0.15/hr | First $5 credit |
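The starter credits in the cost table translate directly into GPU-hours at the quoted hourly rates. A quick sketch; the rates are the approximate figures from the table above and change often:

```python
# GPU-hours each platform's starter credit buys, at the approximate
# hourly rates from the cost comparison table: (credit USD, USD/hour).
credits = {"RunPod": (10.00, 0.20), "Vast.ai": (5.00, 0.15)}

for platform, (credit_usd, usd_per_hr) in credits.items():
    hours = credit_usd / usd_per_hr
    print(f"{platform}: ~{hours:.0f} free GPU-hours")
```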
stack/deploy/hfSpaces/Dockerfile
ADDED
```dockerfile
# HuggingFace Spaces Dockerfile for Stack 2.9
# Use this for free inference hosting on HF Spaces
# https://huggingface.co/docs/hub/spaces-sdks-docker

FROM python:3.11-slim

WORKDIR /app

# Set environment
ENV PYTHONUNBUFFERED=1
ENV PORT=7860

# Install dependencies
RUN pip install --no-cache-dir \
    fastapi \
    "uvicorn[standard]" \
    pydantic \
    requests \
    huggingface_hub

# Copy app
COPY app.py .

# Expose port
EXPOSE 7860

# Run app
CMD ["python", "app.py"]
```
stack/deploy/hfSpaces/app.py
ADDED
```python
"""
HuggingFace Spaces Deployment for Stack 2.9

Free inference API on HuggingFace Spaces.
https://huggingface.co/docs/hub/spaces-sdks-docker
"""

# =============================================================================
# app.py - Stack 2.9 Inference API
# Deploy this to HuggingFace Spaces for free inference
# =============================================================================

import os
from typing import Dict, List, Optional

import requests
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Stack 2.9 API")

# Model configuration
MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen2.5-Coder-7B-Instruct")
API_URL = os.environ.get("API_URL", "")    # Your model API URL
HF_TOKEN = os.environ.get("HF_TOKEN", "")  # HuggingFace token

# ============================================================================
# Request/Response Models
# ============================================================================

class ChatMessage(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: List[ChatMessage]
    max_tokens: int = 1024
    temperature: float = 0.7
    top_p: float = 0.9

class ChatResponse(BaseModel):
    content: str
    model: str
    usage: Optional[Dict] = None

class CompletionRequest(BaseModel):
    prompt: str
    max_tokens: int = 512
    temperature: float = 0.7

# ============================================================================
# Health Check
# ============================================================================

@app.get("/health")
async def health():
    return {"status": "healthy", "model": MODEL_NAME}

@app.get("/")
async def root():
    return {
        "name": "Stack 2.9",
        "version": "1.0.0",
        "model": MODEL_NAME,
        "endpoints": {
            "chat": "/v1/chat/completions",
            "complete": "/v1/completions",
            "health": "/health",
        },
    }

# ============================================================================
# OpenAI-Compatible API
# ============================================================================

@app.post("/v1/chat/completions")
async def chat_completions(request: ChatRequest):
    """OpenAI-compatible chat endpoint (proxies the upstream response as-is)."""

    if API_URL:
        # Forward to the external model API
        response = requests.post(
            f"{API_URL}/v1/chat/completions",
            headers={"Authorization": f"Bearer {HF_TOKEN}"},
            json={
                "messages": [m.model_dump() for m in request.messages],
                "max_tokens": request.max_tokens,
                "temperature": request.temperature,
            },
            timeout=60,
        )
        return response.json()

    # Placeholder for local model
    raise HTTPException(
        status_code=503,
        detail="No model API configured. Set the API_URL environment variable.",
    )

@app.post("/v1/completions")
async def completions(request: CompletionRequest):
    """OpenAI-compatible completion endpoint"""

    if API_URL:
        response = requests.post(
            f"{API_URL}/v1/completions",
            headers={"Authorization": f"Bearer {HF_TOKEN}"},
            json={
                "prompt": request.prompt,
                "max_tokens": request.max_tokens,
                "temperature": request.temperature,
            },
            timeout=60,
        )
        return response.json()

    raise HTTPException(status_code=503, detail="No model API configured")

# ============================================================================
# Model Info
# ============================================================================

@app.get("/v1/models")
async def list_models():
    return {
        "object": "list",
        "data": [
            {
                "id": MODEL_NAME,
                "object": "model",
                "created": 1700000000,
                "owned_by": "stack-2.9",
            }
        ],
    }

# ============================================================================
# Run Server
# ============================================================================

if __name__ == "__main__":
    import uvicorn
    port = int(os.environ.get("PORT", "7860"))
    uvicorn.run(app, host="0.0.0.0", port=port)
```
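A caller of the deployed Space only needs an HTTP client. The sketch below builds a request body matching the `ChatRequest` model in app.py; the `SPACE_URL` is a hypothetical placeholder, not a real endpoint:

```python
SPACE_URL = "https://yourusername-stack-2-9.hf.space"  # hypothetical Space URL

def chat_payload(messages, max_tokens=1024, temperature=0.7, top_p=0.9):
    """Build a JSON body matching the ChatRequest model served by app.py."""
    return {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }

payload = chat_payload([{"role": "user", "content": "Write a function"}])

# POST it with any HTTP client once the Space is live, e.g.:
#   requests.post(f"{SPACE_URL}/v1/chat/completions", json=payload, timeout=60)
print(payload)
```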
stack/training/together_finetune.py
ADDED
```python
"""
Together AI Fine-tuning Script for Stack 2.9

Free fine-tuning on the Together AI platform.
https://docs.together.ai/docs/fine-tuning
"""

import os
from typing import Optional

import requests

TOGETHER_API = "https://api.together.xyz/v1"

class TogetherFineTuner:
    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or os.environ.get("TOGETHER_API_KEY")
        if not self.api_key:
            raise ValueError("TOGETHER_API_KEY required")

    def upload_dataset(self, file_path: str) -> str:
        """Upload training data to Together AI."""
        url = f"{TOGETHER_API}/files"

        with open(file_path, "rb") as f:
            response = requests.post(
                url,
                headers={"Authorization": f"Bearer {self.api_key}"},
                files={"file": f},
            )

        if response.status_code == 200:
            return response.json()["id"]
        raise RuntimeError(f"Upload failed: {response.text}")

    def create_finetune_job(
        self,
        model: str,
        training_file: str,
        epochs: int = 3,
        batch_size: int = 4,
        learning_rate: float = 1e-5,
    ) -> dict:
        """
        Create a fine-tuning job on Together AI.

        Free tier: up to 7B models, limited training minutes.
        """
        url = f"{TOGETHER_API}/fine_tuning/jobs"

        payload = {
            "model": model,  # e.g., "Qwen/Qwen2.5-Coder-7B"
            "training_file": training_file,
            "epochs": epochs,
            "batch_size": batch_size,
            "learning_rate": learning_rate,
            "lora": True,  # Enable LoRA for efficiency
            "lora_r": 64,
            "lora_alpha": 128,
        }

        response = requests.post(
            url,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            json=payload,
        )

        if response.status_code == 200:
            return response.json()
        raise RuntimeError(f"Job creation failed: {response.text}")

    def get_job_status(self, job_id: str) -> dict:
        """Check fine-tuning job status."""
        url = f"{TOGETHER_API}/fine_tuning/jobs/{job_id}"
        response = requests.get(
            url,
            headers={"Authorization": f"Bearer {self.api_key}"},
        )
        return response.json()

    def list_fine_tuned_models(self) -> list:
        """List your fine-tuned models."""
        url = f"{TOGETHER_API}/fine_tuning/models"
        response = requests.get(
            url,
            headers={"Authorization": f"Bearer {self.api_key}"},
        )
        return response.json().get("models", [])


# Recommended models for the free tier
FREE_TIER_MODELS = {
    "7b": "Qwen/Qwen2.5-Coder-7B",
    "3b": "Qwen/Qwen2.5-Coder-3B",
    "1.5b": "Qwen/Qwen2.5-Coder-1.5B",
}

def main():
    import argparse
    parser = argparse.ArgumentParser(description="Fine-tune on Together AI")
    parser.add_argument("--api-key", type=str, help="Together AI API key")
    parser.add_argument("--model", default="7b", choices=["7b", "3b", "1.5b"],
                        help="Model size")
    parser.add_argument("--data", required=True, help="Training data file (JSONL)")
    parser.add_argument("--epochs", type=int, default=3)

    args = parser.parse_args()

    tuner = TogetherFineTuner(args.api_key)

    # Upload data
    print("Uploading dataset...")
    file_id = tuner.upload_dataset(args.data)
    print(f"Uploaded: {file_id}")

    # Start job
    model_name = FREE_TIER_MODELS[args.model]
    print(f"Starting fine-tune on {model_name}...")

    job = tuner.create_finetune_job(
        model=model_name,
        training_file=file_id,
        epochs=args.epochs,
    )

    print(f"Job created: {job['id']}")
    print(f"Status: {job['status']}")


if __name__ == "__main__":
    main()
```
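`create_finetune_job` returns immediately, so a caller would normally poll `get_job_status` until the job reaches a terminal state. A sketch; the status strings here are assumptions, so check the Together AI docs for the exact values your account returns:

```python
import time

# Assumed terminal statuses; verify against the Together AI API docs.
TERMINAL = {"completed", "error", "cancelled"}

def wait_for_job(tuner, job_id, poll_seconds=60, get_status=None):
    """Poll until the fine-tuning job reaches a terminal state.

    get_status is injectable for testing; it defaults to the real API call
    tuner.get_job_status(job_id).
    """
    get_status = get_status or (lambda: tuner.get_job_status(job_id))
    while True:
        status = get_status().get("status", "")
        if status in TERMINAL:
            return status
        time.sleep(poll_seconds)

# Example with a stubbed status sequence (no network access needed):
states = iter([{"status": "running"}, {"status": "completed"}])
print(wait_for_job(None, "job-123", poll_seconds=0,
                   get_status=lambda: next(states)))
```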