File size: 6,071 Bytes
1f1f54b
210535c
 
 
 
1f1f54b
 
210535c
 
 
 
1f1f54b
 
210535c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2541228
 
 
210535c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2541228
 
 
 
210535c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2541228
 
 
 
 
 
 
 
 
 
210535c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
---
title: SQL Query Optimizer Environment Server
emoji: 🐳
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
base_path: /web
tags:
  - openenv
---

# SQL Query Optimizer β€” OpenEnv Environment

An **OpenEnv-compliant** environment where AI agents learn to review, rewrite, and optimise SQL queries across three real-world failure patterns.

> **HF Spaces**: [param20h/sql-query-optimizer](https://huggingface.co/spaces/param20h/sql-query-optimizer)

---

## Environment Description

Real-world SQL anti-patterns cost companies millions in infrastructure. This environment teaches agents to identify and fix them through a reward-shaped episode loop. Each episode presents the agent with a broken or unoptimised query alongside schema context; the agent iteratively rewrites it until done or max steps are reached.

**Why this domain?**
- Used by data engineers and DBAs every day
- Deterministically gradeable (no ambiguous LLM judging)
- Natural difficulty progression from syntax errors to multi-factor optimisation

---

## Observation Space

| Field | Type | Description |
|---|---|---|
| `task_id` | `int` | Task number (1–3) |
| `task_name` | `str` | Slug identifier |
| `task_description` | `str` | What the agent must accomplish |
| `query` | `str` | The SQL to fix |
| `schema_context` | `str` | Relevant DDL / table definitions |
| `hint` | `str \| null` | Optional hint (tasks 1 & 2 only) |
| `step_number` | `int` | Current step (0-indexed) |
| `max_steps` | `int` | Steps allowed per episode |
| `done` | `bool` | Whether episode has ended |

---

## Action Space

| Field | Type | Description |
|---|---|---|
| `rewritten_query` | `str` | The agent's improved SQL |
| `explanation` | `str` | Brief description of changes made |
| `is_done` | `bool` | `true` when the agent believes the query is fully fixed |

---

## Reward Design

The reward is **shaped** (not sparse) β€” the agent receives signal every step:

| Component | Value | Trigger |
|---|---|---|
| Delta reward | +0.0–0.50 Γ— Ξ”grader | Grader score improves |
| Completion bonus | +0.50 | `is_done=True` and grader β‰₯ 0.80 |
| Partial completion | +grader Γ— 0.30 | `is_done=True` (always) |
| Step penalty | βˆ’0.02 / step | After halfway point, if not done |
| Invalid penalty | βˆ’0.10 | Empty or unparseable query |

Final `score` per step is clamped to `[0.0, 1.0]`.

---

## Tasks

### Task 1 β€” `fix-broken-join` (Easy)
The query uses a comma-separated cross-join (`FROM orders, customers`) without any join condition, causing a Cartesian product. The agent must rewrite with `INNER JOIN … ON o.customer_id = c.customer_id`.

**Max steps**: 3 | **Grader**: checks JOIN keyword + ON clause with correct key

### Task 2 β€” `eliminate-n-plus-one` (Medium)
A correlated scalar subquery in the `SELECT` list executes once per row (N+1 problem). The agent must collapse it into a single `LEFT JOIN departments ON e.dept_id = d.dept_id`.

**Max steps**: 4 | **Grader**: checks subquery removal + JOIN on dept_id

### Task 3 β€” `full-optimization` (Hard)
Four independent issues to fix:
1. Remove redundant `DISTINCT` (PK join makes it unnecessary)
2. Replace `SELECT *` with explicit columns
3. Replace `CAST(price AS VARCHAR) LIKE '1%'` β†’ `price >= 100 AND price < 200` (sargable)
4. Add an index hint comment for `(category, price)`

**Max steps**: 5 | **Grader**: 4 Γ— 0.25 sub-criteria, fully independent

---

## API Endpoints

| Method | Path | Description |
|---|---|---|
| `GET` | `/` | Health check |
| `POST` | `/reset` | Start episode `{ "task_id": 1 }` |
| `POST` | `/step` | Submit action `{ "rewritten_query": "...", "explanation": "...", "is_done": true }` |
| `GET` | `/state` | Current internal state |
| `GET` | `/tasks` | All tasks + action schema |
| `GET` | `/grader` | Grader score for current episode |
| `POST` | `/baseline` | Run baseline inference (requires `OPENAI_API_KEY`) |

Interactive docs: `http://localhost:7860/docs`

---

## Setup & Usage

### Prerequisites
- Python 3.10+
- Docker
- `API_BASE_URL` (OpenAI-compatible endpoint for inference)
- `MODEL_NAME` (model identifier for inference)
- `HF_TOKEN` (API key / bearer token for inference)

### Local (Python)

```bash
pip install -r requirements.txt
uvicorn server:app --host 0.0.0.0 --port 7860 --reload
```

### Local (Docker)

```bash
docker build -t sql-optimizer-env .
docker run -p 7860:7860 -e OPENAI_API_KEY=sk-... sql-optimizer-env
```

### Baseline Inference

```bash
$env:API_BASE_URL="https://api.openai.com/v1"
$env:MODEL_NAME="gpt-4o-mini"
$env:HF_TOKEN="hf_or_openai_api_key_here"
python inference.py
```

### OpenEnv Validation

```bash
pip install openenv-core
openenv validate
```

### Deploy to HF Spaces

```bash
pip install huggingface_hub
huggingface-cli login
openenv push --repo-id your-username/sql-query-optimizer
```

### Environment Configuration

Define these variables before running inference or `/baseline`:

```powershell
$env:API_BASE_URL = "https://api.openai.com/v1"
$env:MODEL_NAME = "gpt-4o-mini"
$env:HF_TOKEN = "your_api_key"
```

---

## Baseline Scores

Measured with `gpt-4o-mini` at `temperature=0`, single-pass:

| Task | Name | Difficulty | Grader Score |
|---|---|---|---|
| 1 | fix-broken-join | Easy | 0.86 |
| 2 | eliminate-n-plus-one | Medium | 0.72 |
| 3 | full-optimization | Hard | 0.50 |
| β€” | **Average** | β€” | **0.69** |

> Scores are reproducible: same model, same temperature, same grader β†’ same output.

---

## Project Structure

```
metaXscaler/
β”œβ”€β”€ env/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ environment.py   # reset(), step(), state()
β”‚   β”œβ”€β”€ models.py        # Observation, Action, Reward (Pydantic)
β”‚   β”œβ”€β”€ tasks.py         # Task definitions + graders
β”‚   └── reward.py        # Shaped reward function
β”œβ”€β”€ server.py            # FastAPI app
β”œβ”€β”€ baseline.py          # Baseline inference script
β”œβ”€β”€ openenv.yaml         # OpenEnv spec metadata
β”œβ”€β”€ Dockerfile
β”œβ”€β”€ requirements.txt
└── README.md
```

---

## License

MIT