SQLForge-7B

A fine-tuned Qwen2.5-7B-Instruct model specialized for natural-language-to-SQL generation. Given a database schema and a question in plain English, it writes a SQL query and explains what the query does.

Key Details


Base model	Qwen/Qwen2.5-7B-Instruct
Method	QLoRA (4-bit NF4, rank 16, alpha 16)
Library	Unsloth + TRL SFTTrainer
Dataset	gretelai/synthetic_text_to_sql (10K examples from 100K)
Hardware	NVIDIA RTX A5000 (24GB VRAM) on RunPod
Training time	~2.75 hours (500 steps)
Final loss	0.414
Parameters trained	40.4M of 7.66B (0.53%)
Format	ChatML
Output	Merged 16-bit safetensors
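
The trainable-parameter count in the table can be reproduced from the adapter rank and the base model's layer shapes. The sketch below assumes LoRA adapters on all seven linear projections of each decoder layer (q, k, v, o, gate, up, down), which the table does not state, and uses the published Qwen2.5-7B dimensions (hidden size 3584, MLP size 18944, 28 layers, 4 key/value heads of dimension 128). Under that assumption the total agrees with the reported 40.4M.

```python
# Each LoRA adapter on a Linear(in_features, out_features) layer adds
# rank * (in_features + out_features) parameters (the A and B matrices).
RANK = 16
HIDDEN = 3584          # Qwen2.5-7B hidden size
INTERMEDIATE = 18944   # Qwen2.5-7B MLP intermediate size
KV_DIM = 4 * 128       # 4 key/value heads x head dimension 128
NUM_LAYERS = 28

# ASSUMPTION: adapters on all seven projections; the model card
# does not list the target modules.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank, shapes, num_layers):
    """Count LoRA parameters for the given layer shapes across all layers."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


trainable = lora_params(RANK, TARGET_SHAPES, NUM_LAYERS)
fraction = trainable / 7.66e9  # total parameter count from the table
print(f"{trainable:,} trainable parameters ({fraction:.2%} of 7.66B)")
```

This prints 40,370,176 trainable parameters, about 0.53% of the 7.66B total.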

Dataset

The model was trained on 10,000 examples from the gretelai/synthetic_text_to_sql dataset, which covers 100 domains and a wide range of SQL complexity levels, including subqueries, joins, aggregations, window functions, and set operations. Each example includes the database schema (CREATE TABLE statements), a natural language question, the correct SQL query, and an explanation.
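
To make the record structure concrete, here is a minimal sketch of how one example could be turned into a ChatML training sequence. Several things here are assumptions rather than the author's confirmed pipeline: the field names (sql_context, sql_prompt, sql, sql_explanation), the reuse of the inference system prompt during training, the layout of the assistant turn, and the helper names record_to_messages and to_chatml. The hand-written renderer only illustrates the ChatML layout; real code would normally call tokenizer.apply_chat_template.

```python
SYSTEM_PROMPT = (
    "You are an expert SQL assistant. Given a database schema and a natural "
    "language question, write the correct SQL query and explain what it does."
)


def record_to_messages(record):
    """Map one dataset record to system/user/assistant chat messages."""
    user = f"Schema:\n{record['sql_context']}\n\nQuestion: {record['sql_prompt']}"
    # ASSUMPTION: the assistant turn is the query followed by its explanation.
    assistant = f"{record['sql']}\n\n{record['sql_explanation']}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]


def to_chatml(messages):
    """Render messages in the ChatML layout, one delimited block per turn."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Hypothetical record written for illustration; it is not taken from the dataset.
record = {
    "sql_context": "CREATE TABLE employees (id INT, name VARCHAR(100), "
                   "department VARCHAR(50), salary DECIMAL(10,2));",
    "sql_prompt": "How many employees are in each department?",
    "sql": "SELECT department, COUNT(*) FROM employees GROUP BY department;",
    "sql_explanation": "Groups employees by department and counts the rows in each group.",
}

messages = record_to_messages(record)
text = to_chatml(messages)
print(text)
```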

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("sriksven/SQLForge-7B")
tokenizer = AutoTokenizer.from_pretrained("sriksven/SQLForge-7B")

messages = [
    {
        "role": "system",
        "content": "You are an expert SQL assistant. Given a database schema and a natural language question, write the correct SQL query and explain what it does.",
    },
    {
        "role": "user",
        "content": (
            "Schema:\n"
            "CREATE TABLE employees (id INT, name VARCHAR(100), department VARCHAR(50), salary DECIMAL(10,2));\n"
            "CREATE TABLE departments (name VARCHAR(50), budget DECIMAL(12,2));\n\n"
            "Question: What is the average salary by department, only showing departments with average salary above 75000?"
        ),
    },
]

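# Apply the chat template; return_dict=True yields input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))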

Unsloth (faster inference)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sriksven/SQLForge-7B",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

SQL Complexity Coverage

The training data includes queries across multiple complexity levels:

Simple SELECT with WHERE clauses
Aggregations with GROUP BY and HAVING
Single and multiple JOINs
Subqueries and correlated subqueries
Window functions (ROW_NUMBER, RANK, LAG, LEAD)
Set operations (UNION, INTERSECT, EXCEPT)
Data definition and manipulation (CREATE, ALTER, INSERT)

Intended Use

Natural language interfaces to databases
SQL copilot tools for analysts and developers
Educational tools for learning SQL
Prototyping data query systems

Limitations

Trained on synthetic data, not real production database queries
May not handle highly domain-specific or proprietary SQL dialects
Best with standard SQL syntax (PostgreSQL/MySQL style)
Does not validate queries against a live database, so SQL correctness is not guaranteed
Long or deeply nested schemas may exceed the 2048-token context
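
Because the model does not check its own output, a caller may want a cheap sanity check before running a generated query. The sketch below is one possible approach, not part of this project: it builds empty tables from the schema in an in-memory SQLite database and asks SQLite to compile the query. The function name check_sql is hypothetical. SQLite's dialect differs from PostgreSQL and MySQL, so a pass here only shows that the query parses and references existing tables and columns; it does not show that the query answers the question.

```python
import sqlite3


def check_sql(schema: str, query: str):
    """Return (True, None) if the query compiles against the schema, else (False, error)."""
    conn = sqlite3.connect(":memory:")
    try:
        # Build empty tables from the CREATE TABLE statements.
        conn.executescript(schema)
        # EXPLAIN compiles the statement without executing it against data.
        conn.execute(f"EXPLAIN {query}")
        return True, None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = (
    "CREATE TABLE employees (id INT, name VARCHAR(100), "
    "department VARCHAR(50), salary DECIMAL(10,2));"
)
good = (
    "SELECT department, AVG(salary) AS avg_salary FROM employees "
    "GROUP BY department HAVING AVG(salary) > 75000;"
)
bad = "SELECT department, AVG(wage) FROM employees GROUP BY department;"

print(check_sql(schema, good))
print(check_sql(schema, bad))
```

The first call succeeds; the second fails because the wage column does not exist in the schema.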

Training Infrastructure


GPU	NVIDIA RTX A5000 24GB
Cloud	RunPod ($0.27/hr)
Framework	Unsloth 2026.5.2 + TRL + Transformers 5.5.0
Precision	BF16 training, 4-bit NF4 base quantization
Optimizer	AdamW 8-bit
Learning rate	2e-4, linear decay
Batch size	16 effective (4 per device × 4 accumulation)
Packing	Enabled
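
The figures above imply a few derived quantities, worked out in the short sketch below. It assumes a single GPU, as listed, and that only the roughly 2.75 hours of training time are billed, so setup, downloads, and idle time are ignored. Because packing is enabled, one sequence can hold several examples, so the sequence count does not map one-to-one onto the 10,000 training examples.

```python
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 4
STEPS = 500
TRAIN_HOURS = 2.75      # approximate, from the Key Details table
PRICE_PER_HOUR = 0.27   # USD, RunPod rate listed above

# Sequences contributing to each optimizer step.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS
# Packed sequences processed over the whole run.
sequences_seen = effective_batch * STEPS
# Average wall-clock time per optimizer step.
seconds_per_step = TRAIN_HOURS * 3600 / STEPS
# ASSUMPTION: only training time is billed.
estimated_cost = TRAIN_HOURS * PRICE_PER_HOUR

print(f"effective batch size: {effective_batch}")
print(f"packed sequences seen: {sequences_seen}")
print(f"seconds per step: {seconds_per_step:.1f}")
print(f"estimated cost: ${estimated_cost:.2f}")
```

Under these assumptions the run processes 8,000 packed sequences at about 19.8 seconds per step, for roughly $0.74 of GPU time.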

Source Code

Training scripts: github.com/sriksven/LLM-FineTune-Suite

License

Apache 2.0
