---
title: SQL Error Classifier Training
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: mit
hardware: t4-small
---
# SQL Error Classifier: CodeBERT Training Space
Train `microsoft/codebert-base` as a **cross-encoder** for multi-label SQL error classification.
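
A cross-encoder reads both inputs jointly in a single sequence instead of embedding them separately. The sketch below shows one way the two segments might be assembled from a dataset row. The segment layout (question and schema first, student query second) is an assumption, since this README does not specify it, and `build_pair` is a hypothetical helper rather than part of this Space's code.

```python
from typing import Dict, Tuple


def build_pair(example: Dict[str, str]) -> Tuple[str, str]:
    """Return (segment_a, segment_b) for a sentence-pair tokenizer call."""
    # Assumption: the task context goes first, the SQL under review second.
    segment_a = f"{example['question']}\n{example['schema']}"
    segment_b = example["student_sql"]
    return segment_a, segment_b


# Illustrative row; the values are made up for this example.
example = {
    "question": "List all customer names.",
    "schema": "CREATE TABLE customers (id INT, name TEXT);",
    "student_sql": "SELECT id FROM customers;",
}
segment_a, segment_b = build_pair(example)
```

With the `transformers` library, the two segments would typically be passed together, for example `tokenizer(segment_a, segment_b, truncation=True)`, so the model can attend across both.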
## Setup
1. **Hardware:** Settings → Hardware → **GPU t4-small** (recommended)
2. **Secrets:** Settings → Secrets → add `HF_TOKEN` (a Hugging Face write token) so trained models can be pushed to your account
3. **Data:** Include `data/sql_errors_dev.parquet` in this Space repo, or upload a Parquet file at runtime
## Usage
1. Choose the bundled dataset or upload your own Parquet file
2. Set the number of epochs, the batch size, and the maximum number of samples
3. Click **Start Training**
4. Optionally, enable **Push to Hub** and set a model ID such as `your-username/sql-codebert-classifier`
## Dataset columns
Required (aliases supported):
| Column | Aliases |
|--------|---------|
| `question` | (none) |
| `schema` | (none) |
| `student_sql` | `query` |
| `correct_sql` | `correct_query` |
| `error_labels` | `label_name` |
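
As a minimal sketch of how the aliases in the table above could be resolved before training, the snippet below maps each alias to its canonical name and reports any required column that is still missing. `COLUMN_ALIASES`, `REQUIRED_COLUMNS`, and `normalize_columns` are hypothetical names chosen for illustration; the Space's own loading code may work differently.

```python
# Alias -> canonical column name, taken from the table above.
COLUMN_ALIASES = {
    "query": "student_sql",
    "correct_query": "correct_sql",
    "label_name": "error_labels",
}
REQUIRED_COLUMNS = ["question", "schema", "student_sql", "correct_sql", "error_labels"]


def normalize_columns(columns):
    """Map aliases to canonical names and list missing required columns."""
    renamed = [COLUMN_ALIASES.get(name, name) for name in columns]
    missing = [name for name in REQUIRED_COLUMNS if name not in renamed]
    return renamed, missing


renamed, missing = normalize_columns(["question", "schema", "query", "correct_query"])
# renamed -> ['question', 'schema', 'student_sql', 'correct_sql']
# missing -> ['error_labels']
```

If the data is loaded with pandas, `df.rename(columns=COLUMN_ALIASES)` applies the same mapping to a DataFrame.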
## Labels (9-class multi-label)
`JOIN_ERROR`, `AGGREGATION_ERROR`, `FILTER_ERROR`, `WINDOW_FUNCTION_ERROR`,
`SUBQUERY_ERROR`, `NULL_HANDLING_ERROR`, `PERFORMANCE_ERROR`, `LOGICAL_ERROR`, `SYNTAX_ERROR`
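
Because the task is multi-label, each example's target is usually a 9-element multi-hot vector rather than a single class index, and predictions are read off by thresholding per-label scores. The sketch below illustrates this under two assumptions: the label order follows the list above, and a score of 0.5 or higher counts as a predicted label. Neither is confirmed by this README, and `encode_labels` and `decode_scores` are hypothetical helpers.

```python
# Assumption: index order matches the order the labels are listed in.
LABELS = [
    "JOIN_ERROR", "AGGREGATION_ERROR", "FILTER_ERROR", "WINDOW_FUNCTION_ERROR",
    "SUBQUERY_ERROR", "NULL_HANDLING_ERROR", "PERFORMANCE_ERROR", "LOGICAL_ERROR",
    "SYNTAX_ERROR",
]
LABEL_TO_INDEX = {name: i for i, name in enumerate(LABELS)}


def encode_labels(names):
    """Turn a list of label names into a multi-hot float vector."""
    vector = [0.0] * len(LABELS)
    for name in names:
        vector[LABEL_TO_INDEX[name]] = 1.0
    return vector


def decode_scores(scores, threshold=0.5):
    """Return the labels whose score meets the (assumed) threshold."""
    return [LABELS[i] for i, score in enumerate(scores) if score >= threshold]


target = encode_labels(["JOIN_ERROR", "FILTER_ERROR"])
# target -> [1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

A query can carry several labels at once, which is why each position is scored independently instead of through a softmax over the nine classes.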