ASTER_4B (Independent Reproduction)


Model Description

ASTER_4B is an independent reproduction of the ASTER framework. The model is fine-tuned from Qwen/Qwen3-4B-Thinking-2507 and strictly adheres to the experimental details and hyperparameter settings described in the original ASTER paper.

⚠️ Note: This is a reproduction project. We aim to verify the effectiveness of the ASTER method by strictly following the official paper's details.

Training Data (SFT)

The model was trained using our reproduced dataset: Aster_SFT4K.

This dataset serves as a tiny yet effective SFT set, constructed to replicate the exact data distribution and formatting used in the original ASTER experiments. You can find the dataset details here:

Evaluation Results

We evaluated the model on challenging mathematical benchmarks. To ensure a fair comparison, the evaluation used the exact generation configuration specified in the ASTER paper.

Generation Config:

  • Temperature: 1.0
  • Top_p: 1.0
  • Max_context_length: 96256
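
The settings above can be captured in a small helper, shown here as a minimal sketch rather than the evaluation code actually used. The class name `GenerationSettings` and its methods are hypothetical, and the sketch assumes that Max_context_length covers the prompt plus the generated tokens, which this card does not state. The keyword names (`do_sample`, `temperature`, `top_p`, `max_new_tokens`) follow the convention used by Hugging Face `generate`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Sampling settings taken from the Generation Config list (hypothetical helper)."""

    temperature: float = 1.0
    top_p: float = 1.0
    max_context_length: int = 96256

    def max_new_tokens(self, prompt_tokens: int) -> int:
        # Assumption: the context limit counts prompt tokens plus generated tokens.
        if prompt_tokens < 0:
            raise ValueError("prompt_tokens must be non-negative")
        return max(self.max_context_length - prompt_tokens, 0)

    def to_sampling_kwargs(self, prompt_tokens: int) -> dict:
        # Temperature 1.0 and top_p 1.0 mean sampling from the unmodified distribution.
        return {
            "do_sample": True,
            "temperature": self.temperature,
            "top_p": self.top_p,
            "max_new_tokens": self.max_new_tokens(prompt_tokens),
        }


settings = GenerationSettings()
kwargs = settings.to_sampling_kwargs(prompt_tokens=256)
print(kwargs)
```

With a 256-token prompt, this leaves a generation budget of 96000 tokens under the stated assumption. If the limit applies to generated tokens only, pass `max_context_length` directly as the budget instead.
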

Benchmark          Score (%)
AIME 2025          87.7
HMMT 2025 (Feb)    77.1

Model size: 4B params (Safetensors)
Tensor type: BF16