---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- sql
- text-to-sql
- qlora
- unsloth
- qwen2.5
- database
- natural-language-to-sql
datasets:
- gretelai/synthetic_text_to_sql
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: SQLForge-7B
results: []
---
# SQLForge-7B
A fine-tuned **Qwen2.5-7B-Instruct** model specialized for **natural language to SQL generation**. Given a database schema and a question in plain English, it generates a SQL query and a short explanation of what the query does.
## Key Details
| | |
|---|---|
| **Base model** | Qwen/Qwen2.5-7B-Instruct |
| **Method** | QLoRA (4-bit NF4, rank 16, alpha 16) |
| **Library** | Unsloth + TRL SFTTrainer |
| **Dataset** | gretelai/synthetic_text_to_sql (10K examples from 100K) |
| **Hardware** | NVIDIA RTX A5000 (24GB VRAM) on RunPod |
| **Training time** | ~2.75 hours (500 steps) |
| **Final loss** | 0.414 |
| **Parameters trained** | 40.4M of 7.66B (0.53%) |
| **Format** | ChatML |
| **Output** | Merged 16-bit safetensors |
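
The trained-parameter share in the table can be checked with a few lines of arithmetic. The sketch below also derives the LoRA scaling factor, assuming the standard LoRA convention that adapter updates are scaled by alpha / rank; the variable names are illustrative and not taken from the training scripts.

```python
# Figures copied from the Key Details table.
trainable_params = 40.4e6   # LoRA adapter parameters
total_params = 7.66e9       # total parameters as reported
lora_rank = 16
lora_alpha = 16

# Share of parameters that received gradient updates.
trainable_pct = 100 * trainable_params / total_params

# Assumption: standard LoRA scaling (update multiplied by alpha / rank).
lora_scale = lora_alpha / lora_rank

print(f"Trainable share: {trainable_pct:.2f}%")
print(f"LoRA scaling factor: {lora_scale:.1f}")
```

With rank and alpha both set to 16, the scaling factor under this convention is 1.0, so adapter updates are applied unscaled.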
## Dataset
Trained on 10,000 examples from the [gretelai/synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) dataset, which covers 100 domains with a wide range of SQL complexity levels including subqueries, joins, aggregations, window functions, and set operations. Each example includes the database schema (CREATE TABLE statements), a natural language question, the correct SQL query, and an explanation.
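
As a rough illustration of how one dataset row might be turned into a chat example, the sketch below builds system, user, and assistant messages. The column names (`sql_context`, `sql_prompt`, `sql`, `sql_explanation`) are the ones I understand the dataset to publish, so verify them against the dataset card. The layout of the assistant turn and the helper name `row_to_messages` are assumptions; the actual training format lives in the training scripts and may differ.

```python
# System prompt reused from the Usage section of this card.
SYSTEM_PROMPT = (
    "You are an expert SQL assistant. Given a database schema and a natural "
    "language question, write the correct SQL query and explain what it does."
)


def row_to_messages(row: dict) -> list[dict]:
    """Convert one dataset row into system/user/assistant chat messages."""
    user = f"Schema:\n{row['sql_context']}\n\nQuestion: {row['sql_prompt']}"
    # Assumption: the assistant turn is the query followed by its explanation.
    assistant = f"{row['sql']}\n\n{row['sql_explanation']}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]


# Hand-written row for illustration only; it is not taken from the dataset.
example_row = {
    "sql_context": "CREATE TABLE sales (id INT, region VARCHAR(20), amount DECIMAL(10,2));",
    "sql_prompt": "What is the total sales amount per region?",
    "sql": "SELECT region, SUM(amount) FROM sales GROUP BY region;",
    "sql_explanation": "Groups sales by region and sums the amount in each group.",
}

messages = row_to_messages(example_row)
for message in messages:
    print(message["role"], "->", message["content"][:60])
```

A message list in this shape can be passed to the tokenizer's chat template, which renders it as ChatML for this model family.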
## Usage
### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sriksven/SQLForge-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {
        "role": "system",
        "content": "You are an expert SQL assistant. Given a database schema and a natural language question, write the correct SQL query and explain what it does.",
    },
    {
        "role": "user",
        "content": (
            "Schema:\n"
            "CREATE TABLE employees (id INT, name VARCHAR(100), department VARCHAR(50), salary DECIMAL(10,2));\n"
            "CREATE TABLE departments (name VARCHAR(50), budget DECIMAL(12,2));\n\n"
            "Question: What is the average salary by department, only showing departments with average salary above 75000?"
        ),
    },
]

# Apply the chat template; return_dict=True yields input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
```
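
For comparison with whatever the model returns, here is a hand-written reference query for the example question above, run against an in-memory SQLite database. This is not model output. The sample rows are invented for illustration, and SQLite stands in for whichever database you actually target.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema copied from the prompt above; the inserted rows are made up.
conn.executescript(
    """
    CREATE TABLE employees (id INT, name VARCHAR(100), department VARCHAR(50), salary DECIMAL(10,2));
    CREATE TABLE departments (name VARCHAR(50), budget DECIMAL(12,2));
    INSERT INTO employees VALUES
        (1, 'Ada', 'Engineering', 90000),
        (2, 'Bo', 'Engineering', 80000),
        (3, 'Cy', 'Support', 50000);
    """
)

# Hand-written reference answer: average salary per department above 75000.
reference_sql = """
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 75000;
"""

rows = conn.execute(reference_sql).fetchall()
print(rows)  # [('Engineering', 85000.0)]
conn.close()
```

Only the Engineering department passes the `HAVING` filter here, since the single Support salary averages 50000.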
### Unsloth (faster inference)
```python
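from unsloth import FastLanguageModel

# Load the model in 4-bit to reduce VRAM use.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sriksven/SQLForge-7B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Switch Unsloth to its inference mode before generating.
FastLanguageModel.for_inference(model)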
```
## SQL Complexity Coverage
The training data includes queries across multiple complexity levels:
- Simple SELECT with WHERE clauses
- Aggregations with GROUP BY and HAVING
- Single and multiple JOINs
- Subqueries and correlated subqueries
- Window functions (ROW_NUMBER, RANK, LAG, LEAD)
- Set operations (UNION, INTERSECT, EXCEPT)
- Data definition and manipulation (CREATE, ALTER, INSERT)
## Intended Use
- Natural language interfaces to databases
- SQL copilot tools for analysts and developers
- Educational tools for learning SQL
- Prototyping data query systems
## Limitations
- Trained on synthetic data, not real production database queries
- May not handle highly domain-specific or proprietary SQL dialects
- Best with standard SQL syntax (PostgreSQL/MySQL style)
- Does not validate queries against a live database, so SQL correctness is not guaranteed
- Long or deeply nested schemas may exceed the 2048-token context
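
Because generated SQL is not checked against a database, a lightweight pre-flight check can catch some errors before a query reaches production. The sketch below is one possible approach, not part of this model or its training code: it loads the schema into an in-memory SQLite database and asks SQLite to compile the query with `EXPLAIN`. The helper name `check_sql` is hypothetical. SQLite's dialect differs from PostgreSQL and MySQL, so a pass here is only a rough signal, and valid dialect-specific queries may be rejected.

```python
import sqlite3


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Return (ok, message) after asking SQLite to compile the query against the schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement and returns its plan without executing it.
        conn.execute(f"EXPLAIN {query}")
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = (
    "CREATE TABLE employees (id INT, name VARCHAR(100), "
    "department VARCHAR(50), salary DECIMAL(10,2));"
)

good = check_sql(schema, "SELECT department, AVG(salary) FROM employees GROUP BY department")
bad = check_sql(schema, "SELECT department, AVG(wage) FROM employees GROUP BY department")

print(good)
print(bad)
```

The second query references a column that does not exist in the schema, so the check fails and returns SQLite's error message, which can be shown to the user or fed back to the model for a retry.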
## Training Infrastructure
| | |
|---|---|
| **GPU** | NVIDIA RTX A5000 24GB |
| **Cloud** | RunPod ($0.27/hr) |
| **Framework** | Unsloth 2026.5.2 + TRL + Transformers 5.5.0 |
| **Precision** | BF16 training, 4-bit NF4 base quantization |
| **Optimizer** | AdamW 8-bit |
| **Learning rate** | 2e-4, linear decay |
| **Batch size** | 16 effective (4 per device × 4 gradient accumulation steps) |
| **Packing** | Enabled |
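
The figures in this card imply a few derived quantities. The sketch below is back-of-the-envelope arithmetic, assuming a single GPU and that the full ~2.75 hours were billed at the listed RunPod rate. Because packing is enabled, each training sequence may contain several dataset examples, so the sequence count is not an example count.

```python
# Figures copied from this model card.
steps = 500
per_device_batch = 4
grad_accum_steps = 4
training_hours = 2.75
runpod_usd_per_hour = 0.27

# Sequences contributing to each optimizer step (single GPU assumed).
effective_batch = per_device_batch * grad_accum_steps

# Packed sequences processed over the whole run.
sequences_seen = steps * effective_batch

# Average wall-clock time per optimizer step.
seconds_per_step = training_hours * 3600 / steps

# Assumption: the whole run was billed at the listed hourly rate.
estimated_cost_usd = training_hours * runpod_usd_per_hour

print(f"Effective batch size: {effective_batch}")
print(f"Packed sequences seen: {sequences_seen}")
print(f"Seconds per step: {seconds_per_step:.1f}")
print(f"Estimated GPU cost: ${estimated_cost_usd:.2f}")
```

Under these assumptions the run processed 8,000 packed sequences at roughly 20 seconds per step, for an estimated GPU cost of about $0.74.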
## Source Code
Training scripts: [github.com/sriksven/LLM-FineTune-Suite](https://github.com/sriksven/LLM-FineTune-Suite)
## License
Apache 2.0