language:
  - en
license: other
base_model:
  - mistralai/Devstral-Small-2505
tags:
  - text-to-sql
  - sql
  - mistral
  - transformers
  - safetensors
pipeline_tag: text-generation
library_name: transformers

Devstral SQLCoder SFT

This model is a full-parameter supervised fine-tuning (SFT) checkpoint for SQL generation. It was fine-tuned from mistralai/Devstral-Small-2505 and exported in Hugging Face safetensors format.

Model Details

Base model: mistralai/Devstral-Small-2505
Architecture: MistralForCausalLM
Precision used in training: bf16
Max sequence length (training config): 4096
Export format: sharded safetensors with model.safetensors.index.json

Training Data (Merged)

The SFT run merged the following datasets:

spider
bird
bird23-train-filtered
synsql-2.5m
wikisql
gretelai-synthetic
sql-create-context

Intended Use

Text-to-SQL research and experimentation
SQL generation benchmarks and evaluation pipelines
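
For evaluation pipelines, a common text-to-SQL metric is execution match: the predicted and reference queries are run against the same database and their result sets are compared. The sketch below is a minimal illustration using SQLite from the Python standard library. The helper names (`execution_match`, `make_demo_db`) and the demo table are hypothetical, and this is not the evaluation code used for this model. Run model-generated SQL only against a disposable copy of the database.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        return False  # a prediction that fails to run counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as multisets so that row order does not affect the result.
    return Counter(predicted_rows) == Counter(gold_rows)


def make_demo_db():
    """Build a tiny in-memory database (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 24), (3, 'Cy', 45);
        """
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    print(execution_match(
        demo,
        "SELECT name FROM singer WHERE age > 30",
        "SELECT name FROM singer WHERE age >= 31",
    ))
```

Comparing multisets ignores row order, which suits queries without ORDER BY. Benchmarks that score ordering would need a stricter comparison.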

Limitations

The model may generate incorrect SQL; validate generated queries before using them in production.
Performance depends on prompt format, schema context quality, and decoding settings.
Evaluate safety and compliance requirements before deployment.
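
One lightweight way to validate a generated query is to compile it against an empty copy of the target schema before it touches real data. The sketch below assumes a SQLite-compatible schema and uses `EXPLAIN`, which compiles a statement without running it. The function name `check_sql` is hypothetical. Other SQL dialects would need their own dry-run mechanism, and a query that compiles can still be semantically wrong.

```python
import sqlite3


def check_sql(schema_ddl, sql):
    """Compile `sql` against an empty in-memory copy of the schema.

    Returns (ok, message). Only syntax and table/column names are checked.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN compiles the statement without executing it, so syntax
        # errors and unknown tables or columns surface here.
        conn.execute("EXPLAIN " + sql)
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # Multi-statement input is also rejected here, because execute()
        # accepts a single statement only.
        return False, str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    schema = "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
    print(check_sql(schema, "SELECT name FROM singer WHERE age > 30"))
    print(check_sql(schema, "SELECT salary FROM singer"))
```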

Usage

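from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with the Hub repo id or a local directory that holds the model files.
repo_or_path = "<hf-username-or-org>/<model-repo>"

tokenizer = AutoTokenizer.from_pretrained(repo_or_path, trust_remote_code=True)
# Load the weights in bf16, the precision used in training.
model = AutoModelForCausalLM.from_pretrained(
    repo_or_path,
    torch_dtype="bfloat16",
)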
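
Once the model and tokenizer are loaded, generation might look like the sketch below. The prompt layout in `build_prompt` is an assumption for illustration only, since this card does not document the prompt format used in training. Both function names are hypothetical. Greedy decoding (`do_sample=False`) is chosen here for reproducibility, not because it is known to be the best setting for this model.

```python
def build_prompt(schema_ddl, question):
    """Assemble a plain-text prompt (assumed layout, not the training format)."""
    return (
        "Given the database schema below, write one SQL query that answers "
        "the question.\n\n"
        f"Schema:\n{schema_ddl.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "SQL:"
    )


def generate_sql(model, tokenizer, prompt, max_new_tokens=256):
    """Greedy-decode a completion and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # Drop the prompt tokens so that only the model's continuation remains.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With `model` and `tokenizer` from the snippet above, a call could be `generate_sql(model, tokenizer, build_prompt(schema, "How many singers are older than 30?"))`, where `schema` holds the CREATE TABLE statements. Keep the prompt plus output within the 4096-token training length.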

Local Files Included

config.json
generation_config.json
tekken.json
model-00001-of-00021.safetensors ... model-00021-of-00021.safetensors
model.safetensors.index.json
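
With 21 shards, an incomplete download is easy to miss. The sketch below reads the `weight_map` in model.safetensors.index.json, which maps each tensor name to the shard file that stores it, and reports any shard that is absent from a local directory. The function name `missing_shards` and the example directory path are hypothetical.

```python
import json
from pathlib import Path


def missing_shards(model_dir):
    """List shard files named in the index that are absent from model_dir."""
    model_dir = Path(model_dir)
    index_path = model_dir / "model.safetensors.index.json"
    index = json.loads(index_path.read_text())
    # "weight_map" maps each tensor name to the shard file that stores it.
    shard_names = sorted(set(index["weight_map"].values()))
    return [name for name in shard_names if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical local path; point this at the downloaded model directory.
    print(missing_shards("./devstral-sqlcoder-sft"))
```

An empty list means every shard referenced by the index is present. It does not verify file integrity.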

Citation

If you use this model, please cite this repository:

https://github.com/ai-twinkle/twinkle-sqlcoder