| --- |
| language: |
| - en |
| license: apache-2.0 |
| tags: |
| - lean4 |
| - theorem-proving |
| - verification |
| - code-verification |
| base_model: Qwen/Qwen3-8B |
| --- |
| |
| # Goedel-Code-Prover-8B |
|
|
| **Paper:** [Goedel-Code-Prover: Hierarchical Proof Search for Open State-of-the-Art Code Verification](https://arxiv.org/abs/2603.19329) |
| **Code:** [https://github.com/goedelcodeprover/Goedel-Code-Prover](https://github.com/goedelcodeprover/Goedel-Code-Prover) |
|
|
| Goedel-Code-Prover-8B is a Lean 4 proof generation model specialized in **proof decomposition** for code correctness verification. Given a formal specification in Lean 4, it decomposes the proof into smaller lemmas and proves the target theorem by combining them. |
|
|
| ## Use Case |
|
|
| This model is designed to work within a **decompose-then-verify scaffolding**: |
|
|
| 1. **Input:** A Lean 4 formal problem (theorem statement with `sorry` body). |
| 2. **Decomposition:** The model breaks down the proof into small, self-contained lemmas, each targeting a single logical step. |
| 3. **Proof:** The model proves the target theorem by composing the introduced lemmas, ideally without any `sorry`. |
| 4. **Verification:** The generated proof is compiled against a Lean 4 server to check correctness. |
|
|
| ## Prompt Format |
|
|
| Use the chat completions API with the following structure: |
|
|
| **System prompt:** |
| ``` |
| You are an expert in Lean 4 theorem proving and proof decomposition. |
| |
| Given a formal problem in Lean 4, your task is to: |
| 1. First, analyze and reason about how to decompose the proof into smaller lemmas |
| 2. Then, provide the complete proof with all necessary lemmas |
| |
| Requirements for the proof breakdown: |
| - Break down the theorem into the smallest possible lemmas |
| - Each lemma should ideally involve a single Lean 4 basic function operation |
| - The introduced lemmas should not involve universe types |
| - Ensure logical ordering: lemmas should be defined before they are used |
| |
| Requirements for the format of the proof: |
| - The original theorem MUST be proved WITHOUT sorry (using the defined lemmas) |
| - Each lemma should be proved with a complete proof if possible |
| - Use the 'lemma' keyword for lemmas and the 'theorem' keyword for the original theorem |
| - Wrap all proof code in ```lean code blocks |
| ``` |
| |
| **User message:** |
| ``` |
| Formal Problem: |
| <lean4 theorem statement> |
| ``` |
| |
| ## Model Details |
| |
| - **Base model:** Qwen3-8B |
| - **Training:** Supervised fine-tuning on proof decomposition and completion trajectories, followed by GRPO reinforcement learning with online Lean 4 verification rewards. |
| |
| ## Citation |
| |
| ```bibtex |
| @misc{li2026goedelcodeproverhierarchicalproofsearch, |
| title={Goedel-Code-Prover: Hierarchical Proof Search for Open State-of-the-Art Code Verification}, |
| author={Zenan Li and Ziran Yang and Deyuan and He and Haoyu Zhao and Andrew Zhao and Shange Tang and Kaiyu Yang and Aarti Gupta and Zhendong Su and Chi Jin}, |
| year={2026}, |
| eprint={2603.19329}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.SE}, |
| url={https://arxiv.org/abs/2603.19329}, |
| } |
| ``` |
| |