Spaces:
Sleeping
Sleeping
| title: SQL Query Optimizer Environment Server | |
| emoji: π³ | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: docker | |
| pinned: false | |
| app_port: 7860 | |
| base_path: /web | |
| tags: | |
| - openenv | |
| # SQL Query Optimizer β OpenEnv Environment | |
| An **OpenEnv-compliant** environment where AI agents learn to review, rewrite, and optimise SQL queries across three real-world failure patterns. | |
| > **HF Spaces**: [param20h/sql-query-optimizer](https://huggingface.co/spaces/param20h/sql-query-optimizer) | |
| --- | |
| ## Environment Description | |
| Real-world SQL anti-patterns cost companies millions in infrastructure. This environment teaches agents to identify and fix them through a reward-shaped episode loop. Each episode presents the agent with a broken or unoptimised query alongside schema context; the agent iteratively rewrites it until done or max steps are reached. | |
| **Why this domain?** | |
| - Used by data engineers and DBAs every day | |
| - Deterministically gradeable (no ambiguous LLM judging) | |
| - Natural difficulty progression from syntax errors to multi-factor optimisation | |
| --- | |
| ## Observation Space | |
| | Field | Type | Description | | |
| |---|---|---| | |
| | `task_id` | `int` | Task number (1β3) | | |
| | `task_name` | `str` | Slug identifier | | |
| | `task_description` | `str` | What the agent must accomplish | | |
| | `query` | `str` | The SQL to fix | | |
| | `schema_context` | `str` | Relevant DDL / table definitions | | |
| | `hint` | `str \| null` | Optional hint (tasks 1 & 2 only) | | |
| | `step_number` | `int` | Current step (0-indexed) | | |
| | `max_steps` | `int` | Steps allowed per episode | | |
| | `done` | `bool` | Whether episode has ended | | |
| --- | |
| ## Action Space | |
| | Field | Type | Description | | |
| |---|---|---| | |
| | `rewritten_query` | `str` | The agent's improved SQL | | |
| | `explanation` | `str` | Brief description of changes made | | |
| | `is_done` | `bool` | `true` when the agent believes the query is fully fixed | | |
| --- | |
| ## Reward Design | |
| The reward is **shaped** (not sparse) β the agent receives signal every step: | |
| | Component | Value | Trigger | | |
| |---|---|---| | |
| | Delta reward | +0.0β0.50 Γ Ξgrader | Grader score improves | | |
| | Completion bonus | +0.50 | `is_done=True` and grader β₯ 0.80 | | |
| | Partial completion | +grader Γ 0.30 | `is_done=True` (always) | | |
| | Step penalty | β0.02 / step | After halfway point, if not done | | |
| | Invalid penalty | β0.10 | Empty or unparseable query | | |
| Final `score` per step is clamped to `[0.0, 1.0]`. | |
| --- | |
| ## Tasks | |
| ### Task 1 β `fix-broken-join` (Easy) | |
| The query uses a comma-separated cross-join (`FROM orders, customers`) without any join condition, causing a Cartesian product. The agent must rewrite with `INNER JOIN β¦ ON o.customer_id = c.customer_id`. | |
| **Max steps**: 3 | **Grader**: checks JOIN keyword + ON clause with correct key | |
| ### Task 2 β `eliminate-n-plus-one` (Medium) | |
| A correlated scalar subquery in the `SELECT` list executes once per row (N+1 problem). The agent must collapse it into a single `LEFT JOIN departments ON e.dept_id = d.dept_id`. | |
| **Max steps**: 4 | **Grader**: checks subquery removal + JOIN on dept_id | |
| ### Task 3 β `full-optimization` (Hard) | |
| Four independent issues to fix: | |
| 1. Remove redundant `DISTINCT` (PK join makes it unnecessary) | |
| 2. Replace `SELECT *` with explicit columns | |
| 3. Replace `CAST(price AS VARCHAR) LIKE '1%'` β `price >= 100 AND price < 200` (sargable) | |
| 4. Add an index hint comment for `(category, price)` | |
| **Max steps**: 5 | **Grader**: 4 Γ 0.25 sub-criteria, fully independent | |
| --- | |
| ## API Endpoints | |
| | Method | Path | Description | | |
| |---|---|---| | |
| | `GET` | `/` | Health check | | |
| | `POST` | `/reset` | Start episode `{ "task_id": 1 }` | | |
| | `POST` | `/step` | Submit action `{ "rewritten_query": "...", "explanation": "...", "is_done": true }` | | |
| | `GET` | `/state` | Current internal state | | |
| | `GET` | `/tasks` | All tasks + action schema | | |
| | `GET` | `/grader` | Grader score for current episode | | |
| | `POST` | `/baseline` | Run baseline inference (requires `OPENAI_API_KEY`) | | |
| Interactive docs: `http://localhost:7860/docs` | |
| --- | |
| ## Setup & Usage | |
| ### Prerequisites | |
| - Python 3.10+ | |
| - Docker | |
| - `API_BASE_URL` (OpenAI-compatible endpoint for inference) | |
| - `MODEL_NAME` (model identifier for inference) | |
| - `HF_TOKEN` (API key / bearer token for inference) | |
| ### Local (Python) | |
| ```bash | |
| pip install -r requirements.txt | |
| uvicorn server:app --host 0.0.0.0 --port 7860 --reload | |
| ``` | |
| ### Local (Docker) | |
| ```bash | |
| docker build -t sql-optimizer-env . | |
| docker run -p 7860:7860 -e OPENAI_API_KEY=sk-... sql-optimizer-env | |
| ``` | |
| ### Baseline Inference | |
| ```bash | |
| $env:API_BASE_URL="https://api.openai.com/v1" | |
| $env:MODEL_NAME="gpt-4o-mini" | |
| $env:HF_TOKEN="hf_or_openai_api_key_here" | |
| python inference.py | |
| ``` | |
| ### OpenEnv Validation | |
| ```bash | |
| pip install openenv-core | |
| openenv validate | |
| ``` | |
| ### Deploy to HF Spaces | |
| ```bash | |
| pip install huggingface_hub | |
| huggingface-cli login | |
| openenv push --repo-id your-username/sql-query-optimizer | |
| ``` | |
| ### Environment Configuration | |
| Define these variables before running inference or `/baseline`: | |
| ```powershell | |
| $env:API_BASE_URL = "https://api.openai.com/v1" | |
| $env:MODEL_NAME = "gpt-4o-mini" | |
| $env:HF_TOKEN = "your_api_key" | |
| ``` | |
| --- | |
| ## Baseline Scores | |
| Measured with `gpt-4o-mini` at `temperature=0`, single-pass: | |
| | Task | Name | Difficulty | Grader Score | | |
| |---|---|---|---| | |
| | 1 | fix-broken-join | Easy | 0.86 | | |
| | 2 | eliminate-n-plus-one | Medium | 0.72 | | |
| | 3 | full-optimization | Hard | 0.50 | | |
| | β | **Average** | β | **0.69** | | |
| > Scores are reproducible: same model, same temperature, same grader β same output. | |
| --- | |
| ## Project Structure | |
| ``` | |
| metaXscaler/ | |
| βββ env/ | |
| β βββ __init__.py | |
| β βββ environment.py # reset(), step(), state() | |
| β βββ models.py # Observation, Action, Reward (Pydantic) | |
| β βββ tasks.py # Task definitions + graders | |
| β βββ reward.py # Shaped reward function | |
| βββ server.py # FastAPI app | |
| βββ baseline.py # Baseline inference script | |
| βββ openenv.yaml # OpenEnv spec metadata | |
| βββ Dockerfile | |
| βββ requirements.txt | |
| βββ README.md | |
| ``` | |
| --- | |
| ## License | |
| MIT | |