Model Card for Seirenes-4B -Based on Qwen3-4B-Instruct-2507

Adversarial Self-Play with Evolving Distractions for LLM Reasoning

Model Description

This is a math reasoning model.

  • Developed by: [Chi Zhang~1909zczc@gmail.com]
  • Finetuned from model: [Qwen3-4B-Instruct-2507]

Model Sources

  • Repository: [Seirenes]
  • Paper: [More Information Needed]

Uses

VLLM or Sglang

Training Details

Verl

Training Data

Dapo-17k

Citation [optional]

Downloads last month
-
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Rex1090/Seirenes_4B

Finetuned
(1699)
this model
Quantizations
1 model