File size: 8,655 Bytes
d083607 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 | # Pattern Memory Evolution
The Pattern Memory Moat is a system for capturing, storing, and sharing code patterns across teams. It transforms individual learning into collective intelligence.
## Table of Contents
1. [Auto-Extraction](#auto-extraction)
2. [Team Sync](#team-sync)
3. [Weight Fusion](#weight-fusion)
4. [API Reference](#api-reference)
---
## Auto-Extraction
Extract patterns automatically from your Git history. The system analyzes commit messages, identifies bug fixes and features, and stores the before/after code changes.
### How It Works
The `extract_patterns_from_git.py` script:
1. **Scans Git History**: Reads through commit messages and diffs
2. **Identifies Patterns**: Uses keywords to classify commits as bug fixes or features
3. **Extracts Context**: Captures before/after code with metadata
4. **Stores in JSONL**: Outputs structured data suitable for training
### Usage
```bash
# Extract patterns from all commits
python scripts/extract_patterns_from_git.py \
--repo-path /path/to/repo \
--output patterns.jsonl
# Only recent commits
python scripts/extract_patterns_from_git.py \
--repo-path /path/to/repo \
--output patterns.jsonl \
--since-date "2024-01-01"
```
### Output Format
Each line in the JSONL output:
```json
{
"pattern_id": "a1b2c3d4e5f6g7h8",
"problem_type": "bug_fix",
"before_code": "def buggy_function():\n return None + 1",
"after_code": "def fixed_function():\n return 1",
"commit_msg": "fix: handle None case in function",
"author": "developer@example.com",
"date": "2024-03-15 10:30:00",
"confidence": 0.85
}
```
### Problem Types
- `bug_fix`: Commits that resolve issues (keywords: fix, bug, hotfix, patch, resolve)
- `feature_addition`: Commits that add new functionality (keywords: feat, add, implement, enhance)
- `unknown`: Other commits (typically skipped)
### Confidence Scoring
The confidence score (0.0-1.0) reflects pattern quality:
- Base: 0.5
- +0.2 for clear bug fix keywords
- +0.15 for clear feature keywords
- +0.15 for having both before and after code
- +0.1 for substantial changes (>100 chars)
- +0.1 for large changes (>500 chars)
---
## Team Sync
Share and sync patterns across your team using a shared PostgreSQL database.
### PostgreSQL Schema
```sql
CREATE TABLE patterns (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
problem_type VARCHAR(50) NOT NULL,
solution_hash VARCHAR(64) NOT NULL,
code_before TEXT NOT NULL,
code_after TEXT NOT NULL,
success_count INTEGER DEFAULT 0,
last_used TIMESTAMP,
created_by VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- Indexes
CONSTRAINT unique_solution UNIQUE (solution_hash),
INDEX idx_problem_type (problem_type),
INDEX idx_success_count (success_count DESC)
);
CREATE TABLE pattern_feedback (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
pattern_id UUID REFERENCES patterns(id),
user_id VARCHAR(255) NOT NULL,
helpful BOOLEAN NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE adapter_versions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
version_name VARCHAR(100) NOT NULL,
adapter_path VARCHAR(500) NOT NULL,
created_by VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
is_active BOOLEAN DEFAULT FALSE
);
```
### FastAPI Endpoints
#### GET /patterns
List patterns with filtering and pagination.
```bash
curl -H "X-API-Key: your-api-key" \
"http://localhost:8000/patterns?problem_type=bug_fix&limit=20"
```
Response:
```json
{
"patterns": [...],
"total": 150,
"page": 1,
"per_page": 20
}
```
#### POST /patterns
Add a new pattern.
```bash
curl -X POST -H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{"problem_type": "bug_fix", "code_before": "...", "code_after": "..."}' \
"http://localhost:8000/patterns"
```
#### POST /patterns/{id}/feedback
Submit feedback on a pattern.
```bash
curl -X POST -H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{"helpful": true}' \
"http://localhost:8000/patterns/123e4567-e89b-12d3-a456-426614174000/feedback"
```
### Authentication
API key authentication via `X-API-Key` header:
```python
# Server-side middleware
async def verify_api_key(request: Request, call_next):
api_key = request.headers.get("X-API-Key")
if not api_key or api_key != settings.API_KEY:
raise HTTPException(status_code=401, detail="Invalid API key")
return await call_next(request)
```
### Conflict Resolution
When multiple team members contribute similar patterns:
1. **Pattern Similarity Detection**: Hash-based deduplication
2. **Merge Strategy**: Patterns with similar `solution_hash` are merged
3. **Success Rate Tracking**: `success_count` increases with positive feedback
4. **Priority**: Patterns with higher `success_count` rank higher in queries
---
## Weight Fusion
Combine LoRA adapters from multiple users using weighted averaging based on success rates.
### Algorithm
```
merged_weight = Σ(adapter_i.weight * adapter_i.success_rate) / Σ(success_rate)
```
This ensures adapters that have shown better results contribute more to the final merged adapter.
### Merge Script Usage
```bash
# Basic merge with manual weights
python scripts/merge_lora_adapters.py \
--adapters user1_adapter.safetensors user2_adapter.safetensors \
--weights 0.6 0.4 \
--output merged_adapter.safetensors
# Merge using success rates (auto-computes proportional weights)
python scripts/merge_lora_adapters.py \
--adapters alice_adapter.safetensors bob_adapter.safetensors \
--success-rates 0.85 0.65 \
--output team_adapter.safetensors
# Equal weights (default)
python scripts/merge_lora_adapters.py \
--adapters adapter1.safetensors adapter2.safetensors \
--output merged.safetensors
```
### Versioning
Each merge creates a version record:
```json
{
"version_name": "v2.1-team-merge",
"adapter_path": "/adapters/merged_v2.1.safetensors",
"created_by": "alice@example.com",
"created_at": "2024-03-15T10:30:00Z",
"parent_versions": ["v2.0", "user-alice-v3", "user-bob-v2"]
}
```
### Rollback
To revert to a previous merged adapter:
```bash
# List available versions
ls -la adapters/versions/
# Restore previous version
cp adapters/versions/v2.0.safetensors adapters/merged.safetensors
```
Or via API:
```bash
curl -X POST -H "X-API-Key: your-api-key" \
-d '{"version_id": "123e4567-e89b-12d3-a456-426614174000"}' \
"http://localhost:8000/adapters/rollback"
```
---
## API Reference
### Patterns API
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/patterns` | List patterns |
| GET | `/patterns/{id}` | Get pattern by ID |
| POST | `/patterns` | Create pattern |
| POST | `/patterns/{id}/feedback` | Submit feedback |
| DELETE | `/patterns/{id}` | Delete pattern |
### Adapter API
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/adapters` | List adapter versions |
| POST | `/adapters/merge` | Merge multiple adapters |
| POST | `/adapters/{id}/activate` | Set as active adapter |
| POST | `/adapters/rollback` | Rollback to previous version |
### Health Check
```bash
curl "http://localhost:8000/health"
```
Response:
```json
{
"status": "healthy",
"version": "1.0.0",
"database": "connected"
}
```
---
## Example Workflow
### 1. Extract Patterns from Project
```bash
# Extract patterns from your codebase
python scripts/extract_patterns_from_git.py \
--repo-path ./my-project \
--output patterns.jsonl \
--since-date "2024-01-01"
```
### 2. Upload to Team Database
```python
import requests
with open('patterns.jsonl') as f:
for line in f:
pattern = json.loads(line)
requests.post(
"http://team-patterns.example.com/patterns",
headers={"X-API-Key": "your-key"},
json=pattern
)
```
### 3. Merge Team Patterns
```bash
# Merge adapters from team members
python scripts/merge_lora_adapters.py \
--adapters alice_adapter.safetensors bob_adapter.safetensors carol_adapter.safetensors \
--success-rates 0.90 0.75 0.85 \
--output team_merged.safetensors
```
### 4. Activate for Team Use
The merged adapter with the highest success rate becomes the new team baseline.
---
## Files Reference
| File | Description |
|------|-------------|
| `scripts/extract_patterns_from_git.py` | Git history pattern extractor |
| `scripts/merge_lora_adapters.py` | LoRA adapter merger |
| `docs/pattern-moat.md` | This documentation |
|