---
language:
- en
tags:
- text2sql
- spider
- Transformer
- Pytorch
license: mit
---
## Model Description

Graphix-T5 is a graph-aware, semi-pretrained text-to-text PLM specifically designed to improve multi-hop reasoning for the complex text-to-SQL task.
This architecture enhances the structural encoding capabilities of the T5 model while preserving its powerful contextual encoding ability.
Experimental results demonstrate the effectiveness of Graphix-T5 and underscore the importance of incorporating structural information into text-to-text PLMs for tackling intricate text-to-SQL challenges.
The smaller performance gap between the dev and test sets indicates the stronger generalization capability of Graphix-T5.
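
To make the design concrete, here is a minimal sketch of the graph-aware layer idea, assuming a standard PyTorch setup. The class name, the wiring, and the plain one-hop message passing below are illustrative assumptions, not the official implementation (which mixes pretrained T5 encoder blocks with a relational GNN over the question-schema graph):

```py
import torch
import torch.nn as nn

class GraphAwareLayer(nn.Module):
    """Sketch: a pretrained semantic block runs alongside a structural
    block that propagates information over the question-schema graph,
    and the two outputs are summed. Illustrative, not the official code."""

    def __init__(self, semantic_block: nn.Module, hidden_size: int):
        super().__init__()
        self.semantic_block = semantic_block            # e.g. a pretrained T5 encoder block
        self.structural_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor, adjacency: torch.Tensor) -> torch.Tensor:
        # Contextual (semantic) representation from the pretrained block.
        semantic = self.semantic_block(hidden)
        # One-hop message passing over the graph, standing in for the
        # relational GNN used in the paper.
        structural = self.structural_proj(torch.bmm(adjacency, hidden))
        # Mix semantic and structural signals.
        return semantic + structural
```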

## Training Data
Graphix-3B is trained on SPIDER, a cross-domain text-to-SQL benchmark, and is evaluated **without additional training** on the vanilla SPIDER dev and test sets as well as on its variants: SPIDER-SYN, SPIDER-DK, and SPIDER-REALISTIC.
This model will continue to be fine-tuned on more complex text-to-SQL data, e.g. BIRD, to deal with harder but more realistic applications.
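
For reference, each SPIDER example pairs a natural-language question with a gold SQL query grounded in a specific database. A record of this shape (the values shown are illustrative) looks like:

```py
# Shape of a SPIDER-style training example (illustrative values):
example = {
    "db_id": "concert_singer",                   # database the question is grounded in
    "question": "How many singers do we have?",  # natural-language input
    "query": "SELECT count(*) FROM singer",      # gold SQL target
}
```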

## To Begin With

You can load the tokenizer and model directly with the `transformers` library:
```py
from transformers import AutoTokenizer, AutoModel

# Download the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("patrickNLP/Graphix-3B")
model = AutoModel.from_pretrained("patrickNLP/Graphix-3B")
```
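
To actually generate SQL, a minimal sketch follows, assuming the checkpoint works with the standard `transformers` seq2seq generation interface. The input serialization shown (question followed by a linearized schema) is an assumption here; the exact format expected by the model is defined in the official implementation.

```py
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("patrickNLP/Graphix-3B")
model = AutoModelForSeq2SeqLM.from_pretrained("patrickNLP/Graphix-3B")

# Assumed input serialization: question followed by a linearized schema.
question = "How many singers do we have?"
schema = "concert_singer | singer : singer_id, name, country, age"
inputs = tokenizer(f"{question} {schema}", return_tensors="pt")

# Greedy decoding of the predicted SQL query.
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```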

## Performance
Graphix-3B w/ PICARD achieves state-of-the-art (SOTA) semantic parsing performance, as demonstrated on the [`SPIDER`](https://yale-lily.github.io/spider) leaderboard. Its single submission achieves **74.0%** exact-match accuracy (EM) and **77.6%** execution accuracy (EX) on the test set.
Please see [`Graphix Official Implementation`]() for details.

## Reference
1. [`Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing`](https://arxiv.org/abs/2301.07507)
2. [`Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs`](https://arxiv.org/abs/2305.03111)
3. [`Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task`](https://arxiv.org/abs/1809.08887)
4. [`PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models`](https://arxiv.org/abs/2109.05093)

## Citation
```
@misc{li2023graphixt5,
      title={Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing},
      author={Jinyang Li and Binyuan Hui and Reynold Cheng and Bowen Qin and Chenhao Ma and Nan Huo and Fei Huang and Wenyu Du and Luo Si and Yongbin Li},
      year={2023},
      eprint={2301.07507},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```