stage_specialized_recurrence1

Research artifact notice

This upload is a research artifact from recurrent-staged-loras; validate behavior before any downstream usage.

Run metadata

  • Base model: Qwen/Qwen3-8B
  • Baseline family: stage_specialized_recurrence
  • Recurrence mode: stage_specialized
  • Adapter settings: {"latent_refiner": {"adapter_sharing": "per_step", "enabled": true, "hidden_size": 0, "num_steps": 3, "recurrence_mode": "stage_specialized"}, "latent_refiner_adapter": {"alpha": 16, "dropout": 0.0, "enabled": true, "rank": 8, "target_modules": ["refiner_proj"]}, "standard_lora": {"alpha": 32, "dropout": 0.05, "enabled": false, "rank": 16, "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "up_proj", "down_proj", "gate_proj"]}}
  • Dataset: metamath_qa (train split)
  • Training seed: 11
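
The adapter settings above are stored as one JSON object, so they can be inspected programmatically. The sketch below is illustrative only: it copies the JSON verbatim from this card, lists which adapter groups are enabled, and computes a scaling factor under the assumption that the adapters use the conventional LoRA scaling alpha / rank (this card does not confirm that). The helper names `enabled_adapters` and `lora_scale` are hypothetical and not part of recurrent-staged-loras.

```python
import json

# Adapter settings copied verbatim from the run metadata in this card.
ADAPTER_SETTINGS = json.loads("""
{"latent_refiner": {"adapter_sharing": "per_step", "enabled": true,
                    "hidden_size": 0, "num_steps": 3,
                    "recurrence_mode": "stage_specialized"},
 "latent_refiner_adapter": {"alpha": 16, "dropout": 0.0, "enabled": true,
                            "rank": 8, "target_modules": ["refiner_proj"]},
 "standard_lora": {"alpha": 32, "dropout": 0.05, "enabled": false,
                   "rank": 16,
                   "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                                      "up_proj", "down_proj", "gate_proj"]}}
""")


def enabled_adapters(settings):
    """Return the names of the settings groups whose 'enabled' flag is true."""
    return [name for name, cfg in settings.items() if cfg.get("enabled")]


def lora_scale(cfg):
    """Scaling factor alpha / rank.

    Assumption: the adapter follows the conventional LoRA scaling rule.
    """
    return cfg["alpha"] / cfg["rank"]


if __name__ == "__main__":
    print("enabled groups:", enabled_adapters(ADAPTER_SETTINGS))
    print("refiner adapter scale:",
          lora_scale(ADAPTER_SETTINGS["latent_refiner_adapter"]))
```

In this run the standard LoRA group is disabled, so only the latent refiner and its adapter (targeting `refiner_proj`) are active.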

Loading

Primary weights are exported as Hugging Face-compatible safetensors, either as a single file or as shards with an index file. PyTorch checkpoint artifacts (checkpoint.pt) are removed once the safetensors export has been written and validated.
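
As a minimal sketch of handling both export layouts, the helper below finds the weight files in a local copy of the repository. It assumes the default Hugging Face file names (`model.safetensors` for a single file, `model.safetensors.index.json` with a `weight_map` entry for shards); this card does not state the actual file names, so check the repository's file listing first. The function name `resolve_safetensors_files` is hypothetical.

```python
import json
from pathlib import Path

# Assumed default Hugging Face file names; not confirmed by this card.
SINGLE_FILE = "model.safetensors"
INDEX_FILE = "model.safetensors.index.json"


def resolve_safetensors_files(model_dir):
    """Return the safetensors files under model_dir for either layout."""
    model_dir = Path(model_dir)

    # Single-file export: all tensors live in one file.
    single = model_dir / SINGLE_FILE
    if single.is_file():
        return [single]

    # Sharded export: the index maps each tensor name to its shard file.
    index = model_dir / INDEX_FILE
    if index.is_file():
        weight_map = json.loads(index.read_text())["weight_map"]
        return [model_dir / name for name in sorted(set(weight_map.values()))]

    raise FileNotFoundError(
        f"no {SINGLE_FILE} or {INDEX_FILE} found in {model_dir}"
    )
```

Each returned path could then be read with a safetensors loader such as `safetensors.torch.load_file`. Whether the latent refiner weights load into a stock Qwen3 model class is not stated here, so validate the loaded tensors against the recurrent-staged-loras code before use.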

Safetensors model size: 8B params. Tensor type: BF16.