| | --- |
| | license: apache-2.0 |
| | language: |
| | - en |
| | metrics: |
| | - accuracy |
| | pipeline_tag: text-generation |
| | tags: |
| | - code |
| | - math |
| | --- |
| | # MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs |
| |
|
| | This is a model for the paper "[MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs](https://arxiv.org/pdf/2402.16352.pdf)". |
| |
|
| |
|
| |
|
| |
|
| | ## News |
| |
|
| | - **[2024-02-26]** Our paper is now accessible at [ArXiv Paper](https://arxiv.org/pdf/2402.16352.pdf). |
| |
|
| | ## Introduction |
| |
|
| | Large language models (LLMs) have exhibited great potential in mathematical reasoning. However, there remains a performance gap in this area between existing open-source models and closed-source models such as GPT-4. |
| |
|
| | In this paper, we introduce **MathGenie**, a novel method for generating diverse and reliable math problems from a small-scale problem-solution dataset (denoted as *seed data*). We augment the ground-truth solutions of our seed data and train a back-translation model to translate the augmented solutions back into new questions. Subsequently, we generate code-integrated solutions for the new questions. To ensure the correctness of the code-integrated solutions, we employ rationale-based strategy for solution verification. |
| |
|
| | Various pretrained models, ranging from 7B to 70B, are trained on the newly curated data to test the effectiveness of the proposed augmentation technique, resulting in a family of models known as *MathGenieLM*. These models consistently outperform previous open-source models across five representative mathematical reasoning datasets, achieving state-of-the-art performance. In particular, MathGenieLM-InternLM2 achieves an accuracy of 87.7% on GSM8K and 55.7% on MATH, securing the best overall score among open-source language models. |
| |
|
| | You can refer to the [project homepage](https://mathgenie.github.io/) and [the paper](https://arxiv.org/pdf/2402.16352.pdf) for more details. |
| |
|
| | ## Usage |
| |
|
| | ### Models |
| |
|
| | Our [MathGenie-InterLM-20B](https://huggingface.co/MathGenie/MathGenie-InterLM-20B) model is available at Huggingface now. |
| | Our [MathGenie-Mixtral-8x7B](https://huggingface.co/MathGenie/MathGenie-Mixtral-8x7B) model is available at Huggingface now. |
| |
|
| | | Base Model | Model | |
| | | ------------ | ------------------------------------------------------------ | |
| | | InternLM-20B | [MathGenie-InterLM-20B](https://huggingface.co/MathGenie/MathGenie-InterLM-20B) | |
| | | Mixtral-8x7B | [MathGenie-Mixtral-8x7B](https://huggingface.co/MathGenie/MathGenie-Mixtral-8x7B) | |
| |
|
| | ### Inference & Evaluation |
| |
|
| | **template** |
| |
|
| | ``` |
| | {% for message in messages %} |
| | {% if message['role'] == 'user' %} |
| | {{ '<|user|>' }}{% elif message['role'] == 'system' %} |
| | {{ '<|system|>' }}{% elif message['role'] == 'assistant' %} |
| | {{ '<|assistant|>' }}{% endif %} |
| | {% for block in message['content'] %} |
| | {% if block['type'] == 'text' %} |
| | {{ '<|text|>' }}{% elif block['type'] == 'code' %} |
| | {{ '<|code|>' }}{% elif block['type'] == 'execution' %} |
| | {{ '<|execution|>' }}{% endif %} |
| | {{ block['content'] + '<|endofblock|>' }}{% endfor %} |
| | {{ '<|endofmessage|>' }}{% endfor %} |
| | ``` |
| |
|
| | Please refer to the [MathCoder repo](https://github.com/mathllm/MathCoder) for the detailed code for inference and evaluation of our MathGenieLM models. |
| |
|
| | ## Citation |
| |
|
| | If you find this paper helpful to your research, please kindly cite this BibTex: |
| |
|
| | ``` |
| | @misc{lu2024mathgenie, |
| | title={MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs}, |
| | author={Zimu Lu and Aojun Zhou and Houxing Ren and Ke Wang and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li}, |
| | year={2024}, |
| | eprint={2402.16352}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CL} |
| | } |
| | ``` |
| |
|
| | ``` |
| | @inproceedings{ |
| | wang2024mathcoder, |
| | title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning}, |
| | author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li}, |
| | booktitle={The Twelfth International Conference on Learning Representations}, |
| | year={2024}, |
| | url={https://openreview.net/forum?id=z8TW0ttBPp} |
| | } |
| | ``` |