Comm-C Qwen3-8B Thinking SFT

This checkpoint is the result of a short, 20-step supervised fine-tuning (SFT) run on Qwen3-8B.

Training target format:

<think>
reasoning_content
</think>
final response
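
The target format above can be illustrated with a minimal sketch. The helper names `format_target` and `split_target` are hypothetical and not part of this checkpoint's training code; the exact newline placement is an assumption read off the layout shown above.

```python
import re

# Assumed layout: "<think>\n{reasoning}\n</think>\n{final response}".
# The newline placement mirrors the format shown in this card and is an assumption.
_TARGET_PATTERN = re.compile(r"^<think>\n(.*?)\n</think>\n(.*)$", re.DOTALL)


def format_target(reasoning_content: str, final_response: str) -> str:
    """Build one training target string (hypothetical helper)."""
    return f"<think>\n{reasoning_content}\n</think>\n{final_response}"


def split_target(text: str) -> tuple[str, str]:
    """Split a target string back into (reasoning_content, final_response)."""
    match = _TARGET_PATTERN.match(text)
    if match is None:
        raise ValueError("text does not follow the <think>...</think> layout")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    target = format_target("step 1\nstep 2", "The answer is 4.")
    print(target)
    print(split_target(target))
```

A parser of this kind may be useful for checking that generations from the checkpoint actually follow the thinking format.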

Training summary:

  • Base model: Qwen3-8B
  • Data: synthetic Comm-C distillation data from GLM-5.2
  • Train rows: 4243
  • Validation rows: 99
  • Max sequence length: 32768 tokens
  • Global batch size: 64
  • Learning rate: 2e-5
  • Training steps: 20
  • Output format: Hugging Face safetensors shards
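
As a rough check on how much of the data the run covered, the figures above can be combined as follows. This sketch assumes each training step consumes exactly one global batch of distinct rows, with no sequence packing or resampling; the card does not confirm either.

```python
# Figures taken from the training summary above.
train_rows = 4243
global_batch_size = 64
training_steps = 20

# Assumption: one global batch of distinct rows per step, no packing.
samples_seen = global_batch_size * training_steps
epoch_fraction = samples_seen / train_rows

print(f"samples seen: {samples_seen}")
print(f"fraction of one epoch: {epoch_fraction:.2f}")
```

Under that assumption the run sees 1280 rows, roughly 0.30 of a single pass over the training set.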

This checkpoint is intended as a quick sanity-check artifact for thinking-format SFT, not a final converged model.

Safetensors model size: 8B params. Tensor type: BF16.