metadata
language:
  - en
license: other
base_model:
  - mistralai/Devstral-Small-2505
tags:
  - text-to-sql
  - sql
  - mistral
  - transformers
  - safetensors
pipeline_tag: text-generation
library_name: transformers

Devstral SQLCoder SFT

This model is a full-parameter supervised fine-tuning (SFT) checkpoint for SQL generation, trained from mistralai/Devstral-Small-2505 and exported in Hugging Face safetensors format.

Model Details

  • Base model: mistralai/Devstral-Small-2505
  • Architecture: MistralForCausalLM
  • Precision used in training: bf16
  • Max sequence length (training config): 4096
  • Export format: sharded safetensors with model.safetensors.index.json

Training Data (Merged)

The SFT run merged the following datasets:

  • spider
  • bird
  • bird23-train-filtered
  • synsql-2.5m
  • wikisql
  • gretelai-synthetic
  • sql-create-context

Intended Use

  • Text-to-SQL research and experimentation
  • SQL generation benchmarks and evaluation pipelines
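
For evaluation pipelines, one common way to score text-to-SQL output is execution match: run the predicted query and a reference query against the same database and compare the returned rows. The sketch below assumes a SQLite database; the `execution_match` helper and the `singer` table are hypothetical and are not part of this repository.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries run and yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # the predicted query failed to run
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as multisets so that row order does not affect the result.
    return Counter(predicted_rows) == Counter(gold_rows)


# Hypothetical toy database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 25), (3, 'Cy', 41);
""")

print(execution_match(
    conn,
    "SELECT name FROM singer WHERE age > 28",
    "SELECT name FROM singer WHERE age >= 30",
))
```

Comparing rows as multisets ignores ordering, so queries whose correctness depends on `ORDER BY` would need a stricter, order-sensitive comparison.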

Limitations

  • This model may generate incorrect SQL and should be validated before production use.
  • Performance depends on prompt format, schema context quality, and decoding settings.
  • Evaluate safety and compliance requirements before deployment.
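
As a rough illustration of the validation step mentioned above, generated SQL can be compiled against an empty copy of the target schema before it is ever run on real data. This is a minimal sketch that assumes the SQLite dialect; `check_sql` and the example schema are hypothetical names, and other database engines would need their own equivalent check.

```python
import sqlite3


def check_sql(schema_ddl, sql):
    """Compile `sql` against an empty in-memory copy of the schema.

    Returns (True, None) if SQLite can prepare the statement,
    otherwise (False, error_message).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN prepares the statement and returns its bytecode
        # without running the statement itself.
        conn.execute("EXPLAIN " + sql)
        return True, None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # Syntax errors, unknown tables or columns, and multi-statement input land here.
        return False, str(exc)
    finally:
        conn.close()


# Hypothetical schema, used only for illustration.
SCHEMA = "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"

print(check_sql(SCHEMA, "SELECT name FROM singer WHERE age > 30"))
print(check_sql(SCHEMA, "SELECT salary FROM singer"))
```

A check like this catches syntax errors and references to missing tables or columns. It does not show that the query answers the question correctly.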

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace with the Hub repo ID or a local directory containing the exported files.
repo_or_path = "<hf-username-or-org>/<model-repo>"

tokenizer = AutoTokenizer.from_pretrained(repo_or_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_or_path,
    torch_dtype="bfloat16",  # matches the bf16 precision used in training
)
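
Once the model and tokenizer are loaded, a generation call might look like the sketch below. The prompt layout in `build_prompt` is an assumption, since this card does not document the prompt format used in training, and both helper names are hypothetical. Greedy decoding and the 256-token limit are likewise illustrative choices.

```python
def build_prompt(schema_ddl, question):
    """Assemble a plain-text prompt (assumed layout, not the documented training format)."""
    return (
        "Given the following database schema, write a SQL query that answers the question.\n\n"
        f"Schema:\n{schema_ddl.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "SQL:"
    )


def generate_sql(model, tokenizer, prompt, max_new_tokens=256):
    """Greedy-decode a completion and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so that only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


prompt = build_prompt(
    "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);",
    "How many singers are older than 30?",
)
print(prompt)
# With `model` and `tokenizer` loaded as shown above:
# sql = generate_sql(model, tokenizer, prompt)
```

Because output quality depends on prompt format and decoding settings, it is worth comparing this layout against the tokenizer's chat template, if one is available, on a held-out set.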

Local Files Included

  • config.json
  • generation_config.json
  • tekken.json
  • model-00001-of-00021.safetensors ... model-00021-of-00021.safetensors
  • model.safetensors.index.json
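
Since the weights are split across 21 shards, it can help to confirm that a local download is complete before loading. The sketch below reads the `weight_map` entry of `model.safetensors.index.json`, which maps each parameter name to the shard file that stores it, and reports any shard that is missing. `missing_shards` and the example path are hypothetical.

```python
import json
from pathlib import Path


def missing_shards(model_dir):
    """Return shard filenames named in the index but absent from `model_dir`, sorted."""
    model_dir = Path(model_dir)
    index = json.loads((model_dir / "model.safetensors.index.json").read_text())
    # "weight_map" maps each parameter name to the shard file that stores it.
    expected = set(index["weight_map"].values())
    return sorted(name for name in expected if not (model_dir / name).is_file())


# Example (hypothetical local path):
# print(missing_shards("./twinkle-sqlcoder"))
```

An empty list means every shard referenced by the index is present. It does not verify file integrity.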

Citation

If you use this model, please cite this repository: