# Stack 2.9 - 5-Minute Quick Start

**Goal:** Get Stack 2.9 running and solving coding tasks in under 5 minutes.

Stack 2.9 is an AI coding assistant powered by Qwen2.5-Coder-32B with Pattern Memory: it learns from your interactions and improves over time.

## Prerequisites
### Required
| Requirement | Version | Check |
|---|---|---|
| Python | 3.10+ | python3 --version |
| Git | Any recent | git --version |
| pip | Latest | pip --version |
### Optional (Recommended)
| Resource | Why You Need It | Minimum |
|---|---|---|
| GPU | Fast code generation | RTX 3070 / M1 Pro |
| 16GB VRAM | Run 32B model smoothly | 8GB for 7B quantized |
**No GPU?** Stack 2.9 works on CPU via Ollama or cloud providers (OpenAI, Together AI, etc.).
## Step 1: Install in 60 Seconds

```bash
# 1. Clone the repository
git clone https://github.com/my-ai-stack/stack-2.9.git
cd stack-2.9

# 2. Create a virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# 4. Copy environment template
cp .env.example .env
```
That's it. If you hit errors, see Troubleshooting below.
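Since Python 3.10+ is a hard requirement, it is worth confirming the interpreter inside the venv actually meets it. A minimal check (the helper name here is my own illustration, not part of Stack 2.9):

```python
import sys

def meets_min_python(min_version=(3, 10)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info[:2] >= min_version

print(meets_min_python())  # should print True before you continue
```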
## Step 2: Configure Your Model Provider

Stack 2.9 supports multiple LLM providers. Pick one that matches your setup:

### Option A: Ollama (Recommended: Local, Private)

```bash
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the Qwen model
ollama pull qwen2.5-coder:32b

# Set environment
export MODEL_PROVIDER=ollama
export OLLAMA_MODEL=qwen2.5-coder:32b
```

Edit your `.env` file:

```
MODEL_PROVIDER=ollama
OLLAMA_MODEL=qwen2.5-coder:32b
```
### Option B: Together AI (Best for Qwen, Cloud)

```bash
# Get your API key at https://together.ai
export TOGETHER_API_KEY=tog-your-key-here
```

Edit your `.env`:

```
MODEL_PROVIDER=together
TOGETHER_API_KEY=tog-your-key-here
TOGETHER_MODEL=togethercomputer/qwen2.5-32b-instruct
```
### Option C: OpenAI (GPT-4o)

```
MODEL_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
OPENAI_MODEL=gpt-4o
```

### Option D: Anthropic (Claude)

```
MODEL_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
```

### Option E: OpenRouter (Unified Access)

```
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-your-key-here
OPENROUTER_MODEL=openai/gpt-4o
```
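All five options boil down to `KEY=VALUE` pairs in `.env`. If you want to sanity-check the file before launching, a minimal parser of that format looks like the sketch below (illustrative only; Stack 2.9's own loader may differ, e.g. it likely uses a library such as python-dotenv):

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

config = parse_env("MODEL_PROVIDER=ollama\nOLLAMA_MODEL=qwen2.5-coder:32b\n")
print(config["MODEL_PROVIDER"])  # → ollama
```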
## Step 3: Run Your First Task

### Interactive Chat Mode

```bash
python stack.py
```

You'll see:

```
================================================
  Stack 2.9 - AI Coding Assistant
  Pattern Memory: Active | Tools: 46
================================================

You: Write a Python function to reverse a string
```
### Single Query Mode

```bash
python stack.py -c "Write a Python function to reverse a string"
```

Expected output:

```python
def reverse_string(s):
    """Reverse a string and return it."""
    return s[::-1]

# An equivalent alternative:
def reverse_string(s):
    return ''.join(reversed(s))
```
### Ask About Your Codebase

```bash
python stack.py -c "Find all Python files modified in the last week and list them"
```

### Generate and Run Code

```bash
python stack.py -c "Create a hello world Flask app with one route"
```
## Step 4: Run Evaluation (Optional)

> **Note:** Evaluation requires a GPU with ~16GB VRAM or more.

### Prepare Your Fine-Tuned Model

After training Stack 2.9 on your data, your merged model will be in:

```
./output/merged/
```
### Run HumanEval Benchmark

```bash
python evaluate_model.py \
  --model-path ./output/merged \
  --benchmark humaneval \
  --num-samples 10 \
  --output results.json
```

### Run MBPP Benchmark

```bash
python evaluate_model.py \
  --model-path ./output/merged \
  --benchmark mbpp \
  --num-samples 10 \
  --output results.json
```

### Run Both Benchmarks

```bash
python evaluate_model.py \
  --model-path ./output/merged \
  --benchmark both \
  --num-samples 10 \
  --k-values 1,10 \
  --output results.json
```
Expected output format:

```
============================================================
HumanEval Results
============================================================
pass@1: 65.00%
pass@10: 82.00%
Total problems evaluated: 12

============================================================
MBPP Results
============================================================
pass@1: 70.00%
pass@10: 85.00%
Total problems evaluated: 12
```
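The pass@k scores above are conventionally computed with the unbiased estimator introduced with HumanEval: generate n samples per problem, count the c that pass, and estimate the probability that at least one of k randomly drawn samples passes. Assuming Stack 2.9's evaluator follows that convention, the arithmetic is:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer than k failing samples: some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 10 samples per problem and 5 passing, pass@1 is simply 5/10:
print(pass_at_k(10, 5, 1))  # → 0.5
```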
### Quick Evaluation (5 Problems Only)
```bash
python evaluate_model.py \
  --model-path ./output/merged \
  --benchmark humaneval \
  --num-problems 5 \
  --num-samples 5
```
## Step 5: Deploy Stack 2.9

### Deploy Locally with Docker

```bash
# Build the image
docker build -t stack-2.9 .

# Run the container
docker run -p 7860:7860 \
  -e MODEL_PROVIDER=ollama \
  -e OLLAMA_MODEL=qwen2.5-coder:32b \
  stack-2.9
```
Access at: http://localhost:7860
### Deploy to RunPod (Cloud GPU)

```bash
# Edit runpod_deploy.sh with your config first
bash runpod_deploy.sh --gpu a100 --instance hourly
```

### Deploy to Kubernetes

```bash
# 1. Edit k8s/secret.yaml with your HuggingFace token
# 2. Apply the manifests
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/pvc.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

# Check status
kubectl get pods -n stack-29
kubectl logs -n stack-29 deployment/stack-29
```
### Hardware Requirements for Deployment
| Model Size | Minimum GPU | Recommended | Quantized (4-bit) |
|---|---|---|---|
| 7B | RTX 3070 (8GB) | A100 40GB | RTX 3060 (6GB) |
| 32B | A100 40GB | A100 80GB | RTX 3090 (24GB) |
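The table's figures follow from a back-of-the-envelope rule: weight memory ≈ parameter count × bytes per parameter, plus some headroom for activations and KV cache. A rough estimator (my own sketch; real usage varies with context length and inference runtime):

```python
def vram_estimate_gb(params_billions, bits=16, overhead=1.2):
    """Rough VRAM for model weights: params × bytes/param × ~20% headroom."""
    return params_billions * (bits / 8) * overhead

# 32B at 4-bit: ~19 GB, which is why a 24 GB RTX 3090 can host it.
print(round(vram_estimate_gb(32, bits=4), 1))  # → 19.2
```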
## Pattern Memory Quick Guide
Stack 2.9 stores successful patterns to help with future tasks.
### List Your Patterns

```bash
python stack.py --patterns list
python stack.py --patterns stats
```
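Conceptually, Pattern Memory is a store of (task, solution) pairs with similarity-based retrieval. The toy sketch below illustrates the idea only; Stack 2.9's real implementation presumably matches on embeddings rather than raw word overlap:

```python
class PatternMemory:
    """Toy pattern store: save solutions, retrieve by word overlap."""

    def __init__(self):
        self.patterns = []  # list of (task, solution) pairs

    def store(self, task, solution):
        self.patterns.append((task, solution))

    def retrieve(self, task):
        """Return the stored solution whose task shares the most words."""
        words = set(task.lower().split())
        scored = [(len(words & set(t.lower().split())), s)
                  for t, s in self.patterns]
        best = max(scored, default=(0, None))
        return best[1] if best[0] > 0 else None

mem = PatternMemory()
mem.store("reverse a string in python", "s[::-1]")
print(mem.retrieve("reverse a string"))  # → s[::-1]
```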
### Extract Patterns from Your Git History

```bash
python scripts/extract_patterns_from_git.py \
  --repo-path . \
  --output patterns.jsonl \
  --since-date "2024-01-01"
```
### Merge LoRA Adapters (Team Sharing)

```bash
python scripts/merge_lora_adapters.py \
  --adapters adapter_a.safetensors adapter_b.safetensors \
  --weights 0.7 0.3 \
  --output merged.safetensors
```
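The `--weights 0.7 0.3` flag suggests a weighted average of the two adapters' parameters. A minimal illustration of that arithmetic, using plain Python lists in place of safetensors tensors (the real script presumably operates on LoRA weight matrices):

```python
def weighted_merge(adapters, weights):
    """Element-wise weighted average of matching parameter vectors."""
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1"
    return {
        name: [sum(w * a[name][i] for w, a in zip(weights, adapters))
               for i in range(len(adapters[0][name]))]
        for name in adapters[0]
    }

a = {"lora_A": [1.0, 2.0]}
b = {"lora_A": [3.0, 4.0]}
merged = weighted_merge([a, b], [0.7, 0.3])  # ≈ {"lora_A": [1.6, 2.6]}
```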
## Troubleshooting

### "Module not found" errors

```bash
pip install -r requirements.txt
```

### "CUDA out of memory" during evaluation

```bash
# Reduce the number of samples per problem
python evaluate_model.py --model-path ./merged --num-samples 5

# Or use 4-bit quantization
# (See docs/TRAINING_7B.md for quantized training)
```

### "Model not found" with Ollama

```bash
ollama pull qwen2.5-coder:32b
ollama list  # Verify it's installed
```

### "API key not set" errors

```bash
# Double-check your .env file
cat .env

# For testing, you can also set the key inline
export TOGETHER_API_KEY=tog-your-key
```
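Programmatically, the check is just "is the variable set and non-empty". A helper you could drop into a startup script (hypothetical name, not part of Stack 2.9):

```python
import os

def missing_keys(required):
    """Return the required env vars that are unset or empty."""
    return [k for k in required if not os.environ.get(k)]

# e.g. missing_keys(["TOGETHER_API_KEY"]) lists anything still unset
```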
### Slow inference on CPU

```bash
# Use a smaller model
export OLLAMA_MODEL=qwen2.5-coder:7b

# Or switch to a cloud provider
export MODEL_PROVIDER=together
```

### Docker build fails

```bash
# Use Python 3.10 explicitly
docker build --build-arg PYTHON_VERSION=3.10 -t stack-2.9 .
```

### Kubernetes GPU not found

```bash
# Verify the nvidia.com/gpu label on your node
kubectl get nodes -L nvidia.com/gpu

# Install the NVIDIA GPU Operator if missing:
# https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/
```
## What's Next?
| Goal | Go To |
|---|---|
| Train on my own data | docs/TRAINING_7B.md |
| Learn all 46 tools | TOOLS.md |
| Set up team pattern sharing | docs/pattern-moat.md |
| Understand the architecture | docs/reference/ARCHITECTURE.md |
| Report a bug | SECURITY.md / GitHub Issues |
## Quick Reference Card

```bash
# Install
git clone https://github.com/my-ai-stack/stack-2.9.git
cd stack-2.9 && pip install -r requirements.txt

# Configure
cp .env.example .env  # Edit with your API keys

# Run
python stack.py                          # Interactive
python stack.py -c "your code request"   # Single query

# Evaluate
python evaluate_model.py --model-path ./merged --benchmark humaneval

# Deploy
docker build -t stack-2.9 . && docker run -p 7860:7860 stack-2.9
```
*Stack 2.9: AI that learns your patterns and grows with you.*