kuririrn/qwen3-4b-agent-trajectory-lora-sft_multi_dpo_merged Text Generation • 4B • Updated 22 days ago
kuririrn/qwen3-4b-agent-trajectory-SFT_alfadm2-prmcons_alformat1 Text Generation • 4B • Updated 25 days ago
kuririrn/qwen3-4b-agent-trajectory-SFT_alfadm-prmcons_alformat3 Text Generation • 4B • Updated 25 days ago
kuririrn/qwen3-4b-agent-trajectory-SFT_alfadm-prmcons_alformat2 Text Generation • 4B • Updated 25 days ago
kuririrn/qwen3-4b-agent-trajectory_2stageSFT_alfadm_dbweek Text Generation • 4B • Updated 26 days ago
kuririrn/qwen25-7b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign_base Text Generation • 8B • Updated 26 days ago
kuririrn/qwen25-7b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign_a Text Generation • 8B • Updated 26 days ago
kuririrn/qwen25-7b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign_b Text Generation • 8B • Updated 26 days ago
kuririrn/qwen25-7b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign Text Generation • 8B • Updated 27 days ago
kuririrn/qwen3-4b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign_c Text Generation • 4B • Updated 27 days ago
kuririrn/qwen3-4b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign_b Text Generation • 4B • Updated 27 days ago
kuririrn/qwen3-4b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign_a Text Generation • 4B • Updated 27 days ago
kuririrn/qwen3-4b-agent-trajectory_alfadm_dbweek-lora-constraint_gen-dist_allign_v3 Text Generation • 4B • Updated 27 days ago
kuririrn/qwen3-4b-agent-trajectory_alfadm_dbweek-lora-constraint_gen-dist_allign_v2 Text Generation • 4B • Updated 27 days ago
kuririrn/qwen3-4b-agent-trajectory_alfadm_dbweek-lora-constraint_gen-dist_allign Text Generation • 4B • Updated 28 days ago
kuririrn/qwen3-4b-agent-trajectory_alf_admPlusExtra-lora-constraint_gen-dist_allign Text Generation • 4B • Updated 28 days ago
kuririrn/qwen3-4b-agent-trajectory_alf_admissible-lora-constraint_gen-dist_allign Text Generation • 4B • Updated 28 days ago
kuririrn/qwen2.5-7b-agent-trajectory-lora-constraint_gen-dist_allign_safe Text Generation • 8B • Updated about 1 month ago
kuririrn/qwen2.5-7b-agent-trajectory-lora-constraint_gen-dist_allign_LORAr32 Text Generation • 8B • Updated about 1 month ago
kuririrn/qwen2.5-7b-agent-trajectory-lora-constraint_gen-dist_allign Text Generation • 8B • Updated Feb 20 • 2
kuririrn/qwen3-4b-agent-trajectory-lora-constraint_gen-dist_allign Text Generation • 4B • Updated Feb 20
kuririrn/qwen3-4b-structured-output-lora-tuned_param_v2-without_cot Text Generation • Updated Feb 10 • 4