---
license: apache-2.0
base_model: Qwen/Qwen3-4B-Thinking-2507
tags:
- aster
- reinforcement-learning
- sft
- reproduction
metrics:
- accuracy
model-index:
- name: ASTER_4B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AIME 2025
      type: aime2025
    metrics:
    - name: Accuracy
      type: accuracy
      value: 87.7
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HMMT 2025 Feb
      type: hmmt_2025_feb
    metrics:
    - name: Accuracy
      type: accuracy
      value: 77.1
---

# ASTER_4B (Independent Reproduction)

[![Paper](https://img.shields.io/badge/Paper-ArXiv.2602.01204-B31B1B.svg)](https://arxiv.org/pdf/2602.01204)
[![GitHub](https://img.shields.io/badge/GitHub-Reproduction_Code-black)](https://github.com/Rainyrou/ASTER)
[![License](https://img.shields.io/badge/License-Apache_2.0-green.svg)](https://huggingface.co/datasets/choosealicense/licenses/apache-2.0)

## Model Description

**ASTER_4B** is an independent reproduction of the ASTER framework. The model is fine-tuned from [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507), strictly following the experimental details and hyperparameter settings described in the original ASTER paper.

> ⚠️ **Note:** This is a **reproduction project**. We aim to verify the effectiveness of the ASTER method by strictly following the official paper's details.

## Training Data (SFT)

The model was trained on our reproduced dataset, **ASTER_SFT4K**, a small but effective SFT set constructed to replicate the exact data distribution and formatting used in the original ASTER experiments.

* **Dataset Repo:** [ASTER_SFT4K](https://huggingface.co/datasets/QuantumStackOverflow/ASTER_SFT4K)

## Evaluation Results

We evaluated the model on challenging mathematical benchmarks, using the **exact generation configuration** specified in the ASTER paper to ensure a fair comparison.

**Generation Config:**
* **Temperature:** `1.0`
* **Top_p:** `1.0`
* **Max_context_length:** `96256` tokens

| Benchmark | Score (%) |
| :--- | :--- |
| **AIME 2025** | **87.7** |
| **HMMT 2025 (Feb)** | **77.1** |
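
## Quick Start

Below is a minimal inference sketch using the standard `transformers` chat-template workflow for Qwen3-style models, with the sampling settings listed above. The repository id, prompt, and `max_new_tokens` value are placeholders for illustration only; substitute the actual model path and your own generation limits.

```python
# Minimal sketch, assuming the model loads like any Qwen3-style causal LM via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "QuantumStackOverflow/ASTER_4B"  # placeholder: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Illustrative math prompt.
messages = [{"role": "user", "content": "How many positive integers n <= 2025 satisfy n^2 + 1 divisible by 5?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings from the evaluation config above: temperature 1.0, top_p 1.0.
output_ids = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.0,
    top_p=1.0,
    max_new_tokens=32768,  # placeholder; the 96256-token context allows much longer reasoning traces
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```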