---
license: apache-2.0
base_model: Qwen/Qwen3-4B-Thinking-2507
tags:
- aster
- reinforcement-learning
- sft
- reproduction
metrics:
- accuracy
model-index:
- name: ASTER_4B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AIME 2025
      type: aime2025
    metrics:
    - name: Accuracy
      type: accuracy
      value: 87.7
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HMMT 2025 Feb
      type: hmmt_2025_feb
    metrics:
    - name: Accuracy
      type: accuracy
      value: 77.1
---
ASTER_4B (Independent Reproduction)
Model Description
ASTER_4B is an independent reproduction of the ASTER framework. The model is fine-tuned from Qwen/Qwen3-4B-Thinking-2507, strictly adhering to the experimental details and hyperparameter settings described in the original ASTER paper.
⚠️ Note: This is a reproduction project. Its aim is to verify the effectiveness of the ASTER method by following the details given in the official paper.
Training Data (SFT)
The model was trained on our reproduced dataset, Aster_SFT4K.
It is a tiny yet effective SFT set, constructed to replicate the data distribution and formatting used in the original ASTER experiments. Dataset details are available here:
- Dataset Repo: ASTER_SFT4K
Evaluation Results
We evaluated the model on two challenging mathematical benchmarks, AIME 2025 and HMMT 2025 (Feb). To keep the comparison fair, we used the exact generation configuration specified in the ASTER paper.
Generation Config:
- Temperature: 1.0
- Top_p: 1.0
- Max_context_length: 96256
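
As a minimal sketch, the generation configuration above could be carried in code as follows. The class and helper names (`GenerationSettings`, `max_new_tokens`, `sampling_kwargs`) are hypothetical and not part of any ASTER release. The sketch also assumes that `Max_context_length` covers prompt plus generated tokens, which this card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Sampling settings listed in this card (names are illustrative)."""
    temperature: float = 1.0
    top_p: float = 1.0
    # Assumption: this limit counts prompt tokens plus generated tokens.
    max_context_length: int = 96256


def max_new_tokens(settings: GenerationSettings, prompt_tokens: int) -> int:
    """Tokens left for generation after the prompt, under the assumption above."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(settings.max_context_length - prompt_tokens, 0)


def sampling_kwargs(settings: GenerationSettings, prompt_tokens: int) -> dict:
    """Collect the settings as keyword arguments for a sampling call."""
    return {
        "temperature": settings.temperature,
        "top_p": settings.top_p,
        "max_new_tokens": max_new_tokens(settings, prompt_tokens),
    }


settings = GenerationSettings()
print(sampling_kwargs(settings, prompt_tokens=256))
# {'temperature': 1.0, 'top_p': 1.0, 'max_new_tokens': 96000}
```

The key names follow the keyword arguments accepted by Hugging Face Transformers' `generate`, where sampling additionally requires `do_sample=True`; other inference engines may use different names for the same settings.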
| Benchmark | Score (%) |
|---|---|
| AIME 2025 | 87.7 |
| HMMT 2025 (Feb) | 77.1 |
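
This card does not state how many samples were drawn per problem or how they were aggregated. A common protocol for these benchmarks is to sample several answers per problem and report the mean accuracy. The sketch below shows that computation under that assumption; the function name `avg_accuracy` and the toy grading data are hypothetical and are not the results reported above.

```python
from statistics import mean


def avg_accuracy(correct: list[list[bool]]) -> float:
    """Mean per-problem accuracy in percent.

    correct[i][j] is True if sample j for problem i was graded correct.
    """
    if not correct or any(not samples for samples in correct):
        raise ValueError("need at least one sample per problem")
    per_problem = [sum(samples) / len(samples) for samples in correct]
    return 100.0 * mean(per_problem)


# Toy data: 3 problems, 4 samples each (illustrative only).
toy = [
    [True, True, True, True],
    [True, False, True, True],
    [False, False, True, False],
]
print(round(avg_accuracy(toy), 1))  # 66.7
```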