Spaces:
Sleeping
Sleeping
File size: 7,044 Bytes
aa3a171 5db060f aa3a171 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 | ---
title: Sql Debug Environment Server
emoji: ๐
colorFrom: pink
colorTo: red
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
- openenv
- sql
- debugging
- optimization
---
# ๐ OpenEnv: SQL Debug Environment
An [OpenEnv](https://openenv.dev)-compliant environment where AI agents fix broken SQL queries and optimize slow ones against in-memory SQLite databases.
> โ
**Validator:** `openenv validate` passes when the environment is wired up correctly
> ๐ **Local API:** `https://abhinavthedev-sql-debug.hf.space`
> ๐ **Swagger UI:** `https://abhinavthedev-sql-debug.hf.space/docs`
---
## ๐ฏ Environment Description
This environment simulates the work of a SQL engineer who must repair syntax errors, correct logic bugs, and improve query performance. Agents receive a schema, a broken or slow query, and a natural-language target description. They submit SQL queries, observe the execution result and query plan, and are scored on correctness and efficiency.
The environment is intentionally practical: each task mirrors a real debugging pattern used in analytics, reporting, and data engineering workflows.
---
## ๐ Tasks
### Task 1 - Syntax Fix *(Easy)*
**Task ID:** `syntax_fix_001`
**Objective:** Fix a malformed query so it returns all orders where `amount > 500`.
| Field | Description |
|---|---|
| `schema` | `orders` table with `id`, `customer`, `amount`, `order_date` |
| `broken_query` | `SELEC * FORM orders WERE amount > 500` |
| `target` | Return all orders where amount is greater than 500 |
**Max steps:** 5 | **Difficulty:** Easy
---
### Task 2 - Logic Fix *(Medium)*
**Task ID:** `logic_fix_001`
**Objective:** Correct a join bug so only employees in valid departments are returned.
| Field | Description |
|---|---|
| `schema` | `employees` and `departments` tables |
| `broken_query` | Query uses `LEFT JOIN` but should exclude missing departments |
| `target` | Return employees in departments with budget > 400000 |
**Max steps:** 8 | **Difficulty:** Medium
---
### Task 3 - Query Optimization *(Hard)*
**Task ID:** `optimize_001`
**Objective:** Rewrite a correlated subquery into an efficient CTE or grouped subquery.
| Field | Description |
|---|---|
| `schema` | `transactions` table with generated sample rows |
| `broken_query` | Correlated subquery that scans per row |
| `target` | Return completed transactions above the user's average amount |
**Max steps:** 10 | **Difficulty:** Hard
---
## ๐ API Reference
### Base URL
```text
https://abhinavthedev-sql-debug.hf.space
```
### Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/reset` | Start a new episode; pass `task_id` to choose a task |
| `POST` | `/step` | Submit a SQL query and receive the next observation |
| `GET` | `/state/{session_id}` | Inspect the current episode state |
| `GET` | `/schema` | View action, observation, and state schemas |
| `GET` | `/ws` | WebSocket endpoint for low-latency sessions |
| `GET` | `/health` | Health check |
| `GET` | `/docs` | Swagger UI |
---
## ๐ฎ Action Space
The agent submits a single SQL query each step.
```json
{
"query": "SELECT * FROM orders WHERE amount > 500"
}
```
### Example Actions
```json
{ "query": "SELECT * FROM orders WHERE amount > 500" }
{ "query": "SELECT e.name, d.dept_name FROM employees e INNER JOIN departments d ON e.dept_id = d.id WHERE d.budget > 400000" }
{ "query": "WITH avg_amount AS (SELECT user_id, AVG(amount) AS avg_amount FROM transactions GROUP BY user_id) SELECT t.* FROM transactions t JOIN avg_amount a ON t.user_id = a.user_id WHERE t.status = 'completed' AND t.amount > a.avg_amount" }
```
---
## ๐ Observation Space
```json
{
"task_id": "syntax_fix_001",
"schema_sql": "CREATE TABLE orders (...)",
"current_query": "SELEC * FORM orders WERE amount > 500",
"error_message": "near \"SELEC\": syntax error",
"query_result": [],
"execution_plan": "",
"step_count": 0,
"target_description": "Return all orders where amount is greater than 500",
"reward_so_far": 0.0,
"available_tasks": ["syntax_fix_001", "logic_fix_001", "optimize_001"],
"done": false,
"reward": 0.05
}
```
---
## ๐ฐ Reward Function
The reward is computed from syntax validity, result correctness, and query plan quality.
| Event | Reward |
|---|---|
| Query fails with syntax error | `0.05` |
| Query runs successfully | contributes to the main score |
| Correct row match on easy and medium tasks | up to `0.6` of the score |
| Good query plan on hard task | up to `0.2` of the score |
| Uses correlated-subquery pattern on hard task | heavy plan penalty |
| Excessively long query | length penalty |
Final scores are clamped to the range `[0.0, 1.0]`.
---
## ๐ Setup & Usage
### Option 1 - Run Locally
```bash
pip install -e .
uvicorn server.app:app --host 0.0.0.0 --port 8000 --reload
# Open https://abhinavthedev-sql-debug.hf.space/docs
```
### Option 2 - Run with Docker
```bash
docker build -t sql-debug-env -f server/Dockerfile .
docker run -p 8000:8000 sql-debug-env
curl https://abhinavthedev-sql-debug.hf.space/health
```
### Option 3 - Run the Inference Loop
```bash
export SERVER_URL=https://abhinavthedev-sql-debug.hf.space
export API_KEY=sk-...
python inference.py
```
The inference script defaults to `syntax_fix_001`, logs each step, and stops when the episode ends or the step budget is reached.
---
## ๐๏ธ Project Structure
```text
sql_exp/
โโโ client.py # OpenEnv client wrapper
โโโ inference.py # LLM-driven inference loop
โโโ models.py # Action and observation models
โโโ openenv.yaml # OpenEnv manifest
โโโ pyproject.toml # Project metadata and dependencies
โโโ runner.py # SQLite query runner
โโโ server/
โ โโโ app.py # FastAPI app and OpenEnv wiring
โ โโโ Dockerfile # Container definition
โ โโโ sql_debug_environment.py # Core environment logic
โโโ tasks/
โ โโโ task_easy.py # Syntax-fix task
โ โโโ task_medium.py # Join logic task
โ โโโ task_hard.py # Query optimization task
โโโ graders/
โ โโโ grader_easy.py # Syntax-fix task
โ โโโ grader_medium.py # Join logic task
โ โโโ grader_hard.py # Query optimization task
โโโ README.md # Project overview
```
---
## ๐ ๏ธ Tech Stack
- **Python 3.10+** - Runtime
- **FastAPI** - HTTP framework
- **OpenEnv Core** - Environment server and client primitives
- **SQLite** - Query execution engine
- **Uvicorn** - ASGI server
- **Docker** - Containerization
---
## ๐ License
BSD-style license, matching the source headers in this repository.
|