ASTER_4B (Independent Reproduction)
Model Description
ASTER_4B is an independent reproduction of the ASTER framework. The model is fine-tuned from Qwen/Qwen3-4B-Thinking-2507, strictly following the experimental details and hyperparameter settings described in the original ASTER paper.
⚠️ Note: This is a reproduction project. We aim to verify the effectiveness of the ASTER method by strictly following the official paper's details.
Training Data (SFT)
The model was trained on our reproduced dataset, ASTER_SFT4K.
This is a tiny yet effective SFT set, constructed to replicate the data distribution and formatting used in the original ASTER experiments. You can find the dataset details here:
- Dataset Repo: ASTER_SFT4K
Evaluation Results
We evaluated the model's performance on challenging mathematical benchmarks. The evaluation was conducted under the exact generation configuration specified in the ASTER paper to ensure fair comparison.
Generation Config:
- Temperature: 1.0
- Top_p: 1.0
- Max_context_length: 96256
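
The settings above can be captured in a small helper when scripting an evaluation run. The following is a minimal sketch, not the official evaluation code: `GenerationSettings`, `remaining_token_budget`, and `sampling_kwargs` are hypothetical names, and it assumes that Max_context_length counts prompt tokens plus generated tokens, which this card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Generation config listed in this card (names are illustrative)."""
    temperature: float = 1.0
    top_p: float = 1.0
    max_context_length: int = 96256


def remaining_token_budget(settings: GenerationSettings, prompt_tokens: int) -> int:
    """Tokens left for generation, ASSUMING the context limit covers prompt + output."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(settings.max_context_length - prompt_tokens, 0)


def sampling_kwargs(settings: GenerationSettings, prompt_tokens: int) -> dict:
    """Build keyword arguments for a sampler; key names are an assumption."""
    return {
        "temperature": settings.temperature,
        "top_p": settings.top_p,
        "max_tokens": remaining_token_budget(settings, prompt_tokens),
    }


if __name__ == "__main__":
    # Example: a prompt that already uses 1256 tokens of the context window.
    print(sampling_kwargs(GenerationSettings(), prompt_tokens=1256))
```

The keys `temperature`, `top_p`, and `max_tokens` were chosen to match the parameter names of vLLM's `SamplingParams`, so the dictionary could be unpacked into it; other inference stacks may use different names (for example `max_new_tokens`).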
| Benchmark | Score (%) |
|---|---|
| AIME 2025 | 87.7 |
| HMMT 2025 (Feb) | 77.1 |
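
This card does not state how the scores were aggregated. Because 87.7% does not correspond to a whole number of the 30 AIME 2025 problems, the figures are presumably averages over several sampled answers per problem. Under that assumption, a minimal sketch of the averaging looks like this; `mean_accuracy` is a hypothetical helper and the input data is made up for illustration.

```python
from typing import List


def mean_accuracy(results: List[List[bool]]) -> float:
    """Average accuracy in percent.

    `results[i][j]` is True when the j-th sampled answer to problem i is correct.
    Each problem's accuracy is averaged over its samples, then over problems.
    """
    if not results or any(len(samples) == 0 for samples in results):
        raise ValueError("every problem needs at least one sampled answer")
    per_problem = [sum(samples) / len(samples) for samples in results]
    return 100.0 * sum(per_problem) / len(per_problem)


if __name__ == "__main__":
    # Illustrative data only: 2 problems, 2 samples each.
    print(mean_accuracy([[True, True], [True, False]]))
```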