| | --- |
| | library_name: transformers |
| | tags: |
| | - text-to-sql |
| | - sql |
| | license: apache-2.0 |
| | language: |
| | - en |
| | base_model: |
| | - Qwen/Qwen2.5-Coder-7B-Instruct |
| | datasets: |
| | - AlioLeuchtmann/BIRD_SPIDER_Qwen2.5-Coder-32B |
| | --- |
| | |
| | BIRD-bench: https://bird-bench.github.io/ |
| |
|
| | ## 57% on BIRD private test Set (Qwen 2.5 Coder Models might have been trained on BIRD dev, which could explain the Performance Drop) |
| | ## 60% on BIRD dev one shot |
| | ## 70% on BIRD dev 4 shot |
| |
|
| | Offical Results on BIRD Test Set |
| |
|
| | | Metric | Simple | Moderate | Challenging | Total | |
| | | ----------- | -----: | -------: | ----------: | ----: | |
| | | **Count** | 949 | 555 | 285 | 1789 | |
| | | **EX** | 65.33 | 53.33 | 39.65 | 57.52 | |
| | | **Soft F1** | 66.31 | 55.51 | 41.48 | 59.01 | |
| | | **R-VES** | 61.15 | 47.73 | 36.71 | 53.09 | |
| |
|
| |
|
| |
|
| | ## Model Details |
| |
|
| | SFT of Qwen2.5-Coder-7B <br> |
| | Dataset is a combination of BIRD and Spider; answers were created through knowledge distillation with CoT prompting from Qwen2.5-Coder-32B, and filtered for correctness. <br> |
| | <br> |
| | We increased the BIRD dev set performance from 51.1% to 60%. <br> |
| | Our model beats much larger universal models like GPT-4o and Gemini 1.5 Pro, demonstrating the effectiveness of black-box knowledge distillation for constrained domains. <br> |
| | After training, this checkpoint even exceeds the performance of its teacher model, Qwen2.5-Coder-32B, which can be achieved by filtering the teacher’s outputs for Quality <br> |
| |
|
| |
|
| |
|
| |  |
| |
|
| |
|
| | ## Prompt: |
| |
|
| | Single Prompt used throughout the Dataset so best used as follows: |
| |
|
| | ```sql |
| | CREATE TABLE geographic\n(\n city TEXT not null\n primary key,\n county TEXT null,\n region TEXT null\n) \n /* \n 2 example rows: \n SELECT * FROM geographic LIMIT 2; \n city county region \nalameda alameda county bay area \n alamo contra costa county bay area \n */\n\n <br> |
| | CREATE TABLE generalinfo\n(\n id_restaurant INTEGER not null\n primary key,\n label TEXT null,\n food_type TEXT null,\n city TEXT null,\n review REAL null,\n foreign key (city) references geographic(city)\n on update cascade on delete cascade\n) \n /* \n 2 example rows: \n SELECT * FROM generalinfo LIMIT 2; \n id_restaurant label food_type city review \n 1 sparky's diner 24 hour diner san francisco 2.3 \n 2 kabul afghan cuisine afghani san carlos 3.8 \n */\n\nCREATE TABLE location\n(\n id_restaurant INTEGER not null\n primary key,\n street_num INTEGER null,\n street_name TEXT null,\n city TEXT null,\n foreign key (city) references geographic (city)\n on update cascade on delete cascade,\n foreign key (id_restaurant) references generalinfo (id_restaurant)\n on update cascade on delete cascade\n) \n /* \n 2 example rows: \n SELECT * FROM location LIMIT 2; \n id_restaurant street_num street_name city \n 1 242 church st san francisco \n 2 135 el camino real san carlos \n */\n\n <br> |
| | -- External Knowledge: Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review\n <br> |
| | -- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.\n <br> |
| | -- What is the rating of each restaurant reviews on Atlantic Ave?\n <br> |
| | Generate the SQL after thinking step by step:\n <br> |
| | ``` |
| |
|
| | ```python |
| | def bird_gpt_template_no_format(question, commonsense, schema): |
| | return f"""{schema} |
| | |
| | -- External Knowledge: {commonsense} |
| | -- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above. |
| | -- {question} |
| | Generate the SQL after thinking step by step: |
| | """ |
| | |
| | ``` |
| |
|
| | ### Generate Schema: |
| |
|
| | ```python |
| | def generate_schema_prompt(db_path, num_rows=None): |
| | # extract create ddls |
| | ''' |
| | :param root_place: |
| | :param db_name: |
| | :return: |
| | ''' |
| | full_schema_prompt_list = [] |
| | conn = sqlite3.connect(db_path) |
| | # Create a cursor object |
| | cursor = conn.cursor() |
| | cursor.execute("SELECT name FROM sqlite_master WHERE type='table'") |
| | tables = cursor.fetchall() |
| | schemas = {} |
| | for table in tables: |
| | if table == 'sqlite_sequence': |
| | continue |
| | cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='{}';".format(table[0])) |
| | create_prompt = cursor.fetchone()[0] |
| | schemas[table[0]] = create_prompt |
| | if num_rows: |
| | cur_table = table[0] |
| | if cur_table in ['order', 'by', 'group','transaction'] or ' ' in str(cur_table).strip() or '-' in str(cur_table).strip(): |
| | cur_table = '"{}"'.format(cur_table) |
| | |
| | |
| | cursor.execute("SELECT * FROM {} LIMIT {}".format(cur_table, num_rows)) |
| | column_names = [description[0] for description in cursor.description] |
| | values = cursor.fetchall() |
| | rows_prompt = nice_look_table(column_names=column_names, values=values) |
| | verbose_prompt = "/* \n {} example rows: \n SELECT * FROM {} LIMIT {}; \n {} \n */".format(num_rows, |
| | cur_table, |
| | num_rows, |
| | rows_prompt) |
| | schemas[table[0]] = "{} \n {}".format(create_prompt, verbose_prompt) |
| | |
| | for k, v in schemas.items(): |
| | full_schema_prompt_list.append(v) |
| | |
| | schema_prompt = "\n\n".join(full_schema_prompt_list) |
| | |
| | return schema_prompt |
| | ``` |
| |
|
| |
|
| | ### System Prompt: |
| |
|
| | Default Qwen2.5 System Prompt |
| |
|
| | ```python |
| | def preprocess_prompt(prompt): |
| | return f'''<|im_start|>system |
| | You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|> |
| | <|im_start|>user |
| | {prompt}<|im_end|> |
| | <|im_start|>assistant |
| | ''' |
| | ``` |
| |
|
| | ### Generation Config: |
| | - No Sampling for best Performance |
| |
|
| |
|
| |
|
| | ### Restrictions: |
| | - only trained on SQLite Dialect |
| | - only trained in English |
| | - Did not care to keep any other Skills than Text to SQL |
| |
|
| |
|
| |
|