---
license: cc-by-4.0
library_name: transformers
tags:
- text-to-sql
- code
- qwen3
- knowledge-distillation
datasets:
- birdsql/bird_mini_dev # Links to the official BIRD Mini-dev dataset
- craterlabs/struct-sql-data # REPLACE this with your actual dataset ID
base_model:
- Qwen/Qwen3-4B-Instruct-2507
language:
- en
---
# Struct-SQL-4B: Knowledge Distillation with Structured Chain-of-Thought
**Struct-SQL** is a specialized Text-to-SQL model based on [**Qwen3-4B-Instruct-2507**](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507). It was trained using a novel Knowledge Distillation (KD) framework that transfers **structured reasoning** (Query Execution Plans) from a state-of-the-art teacher LLM (GPT-4o) to a smaller student model.
Unlike standard distillation methods that rely on unstructured Chain-of-Thought (CoT), Struct-SQL learns to generate a formal, logical blueprint (a query plan) before generating the final SQL. This approach significantly reduces syntactic errors and schema hallucinations.
📄 **Paper:** [Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL](https://arxiv.org/abs/2512.17053)
*(Accepted at Canadian AI Conference 2026)*
## Performance
On the **BIRD mini-dev** benchmark, Struct-SQL achieves an **Execution Accuracy (EX) of 45.0%**, outperforming standard unstructured CoT distillation baselines by **8.1 points**.
| Model | Distillation Method | Execution Accuracy (EX) |
|:---|:---|:---|
| **Struct-SQL (Ours)** | **Structured QP-CoT** | **45.0%** |
| ReasonSQL Baseline | Unstructured CoT | 36.9% |
| FN-Gold Baseline | No Reasoning (SQL Only) | 34.3% |
| Base Student (Zero-shot) | None | 17.0% |
---
## Methodology
The model was trained on a curated dataset of **1,000 samples** generated by GPT-4o. The training data consists of:
1. **Input:** Natural Language Question + Database Schema.
2. **Output:** A structured **Query Execution Plan** (Reasoning) + Final **SQL Query**.
By forcing the model to explicitly plan query execution (e.g., "Scan Table", "Filter by...", "Join with..."), the student learns the logical structure of SQL generation rather than just memorizing surface patterns.
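A hypothetical training pair might look like the following (the plan vocabulary, step layout, and SQL are illustrative assumptions, not the verbatim dataset format):
```text
Question: How many customers placed more than 5 orders in 2023?

Query Execution Plan:
1. Scan Table: orders
2. Filter by: order_date BETWEEN '2023-01-01' AND '2023-12-31'
3. Group by: customer_id
4. Filter groups: HAVING COUNT(*) > 5
5. Aggregate: COUNT the remaining customers

SQL:
SELECT COUNT(*)
FROM (
    SELECT customer_id
    FROM orders
    WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
    GROUP BY customer_id
    HAVING COUNT(*) > 5
);
```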
---
## Usage
You can use this model with the `transformers` library. To elicit the query plan, format the input with the system prompt or structure used during training; since the exact training prompt is not reproduced here, the snippet below uses an illustrative schema-plus-question prompt.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "craterlabs/Struct-SQL"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Illustrative prompt (schema + question); adapt to the exact training format.
prompt = (
    "Database schema:\n"
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);\n\n"
    "Question: How many orders were placed in 2023?\n"
    "First produce a query execution plan, then the final SQL."
)

# Qwen3 instruct models expect chat-formatted input.
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1200)
# Decode only the newly generated tokens (plan + SQL), not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
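The generated text should contain the reasoning (query plan) followed by the final SQL. Below is a minimal post-processing sketch, assuming the SQL is emitted either in a fenced code block tagged `sql` or after an `SQL:` marker; both delimiters are assumptions about the output format, not documented guarantees:
```python
import re

def extract_sql(generated: str) -> str | None:
    """Pull the final SQL query out of the model output."""
    # Prefer the last fenced ```sql block, if present (assumed delimiter).
    fenced = re.findall(r"```sql\s*(.*?)```", generated, flags=re.DOTALL | re.IGNORECASE)
    if fenced:
        return fenced[-1].strip()
    # Fall back to everything after the last "SQL:" marker (also assumed).
    if "SQL:" in generated:
        return generated.rsplit("SQL:", 1)[-1].strip()
    return None
```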
---
## Intended Use
Struct-SQL-4B is intended for **research and academic use** in tasks involving **Text-to-SQL generation** and **semantic parsing over relational databases**. The model is particularly suited for studying:
- Knowledge distillation techniques that leverage **structured intermediate representations**
- Explicit **query planning** as an alternative to unstructured chain-of-thought reasoning
- Error reduction in SQL generation, including syntactic validity and schema grounding
- Compact language models for complex reasoning under limited parameter budgets
The model is not optimized for direct deployment in production database systems without additional validation and safety constraints.
---
## Limitations
- Evaluation is confined to the SQLite-based BIRD benchmark
- The model may generate logically plausible but incorrect SQL for highly complex multi-hop queries
---
## Citation
```bibtex
@article{thaker2025knowledge,
title={Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL},
author={Thaker, Khushboo and Bresler, Yony},
journal={arXiv preprint arXiv:2512.17053},
year={2025}
}
@inproceedings{thaker2026knowledge,
title={Struct-SQL: Distilling Structured Reasoning for Small Text-to-SQL Models},
author={Thaker, Khushboo and Bresler, Yony},
booktitle={Proceedings of the 39th Canadian Conference on Artificial Intelligence},
year={2026},
note={To appear}
}
```