koutch/paper_llama_llama3.1-8b_train_sft_all_train_code Text Generation • 8B • Updated about 12 hours ago
koutch/paper_llama_llama3.1-8b_train_sft_train_thought Text Generation • 8B • Updated about 13 hours ago
koutch/paper_llama_llama3.1-8b_train_sft_train_dual Text Generation • 8B • Updated about 13 hours ago • 87
koutch/paper_smol_smol3-3B_train_sft_all_train_code Text Generation • 3B • Updated about 14 hours ago
koutch/paper_llama_llama3.1-8b_train_sft_train_edit Text Generation • 8B • Updated about 15 hours ago • 85
koutch/paper_smol_smol3-3B_train_sft_train_dual Text Generation • 3B • Updated about 15 hours ago • 59
koutch/paper_qwen_qwen3-instruct-4b_train_sft_all_train_code Text Generation • 4B • Updated about 15 hours ago
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_thought Text Generation • 4B • Updated about 15 hours ago
koutch/paper_llama_llama3.1-8b_train_sft_train_code Text Generation • 8B • Updated about 15 hours ago • 84
koutch/paper_llama_llama3.1-8b_train_sft_train_para Text Generation • 8B • Updated about 15 hours ago • 129
koutch/paper_smol_smol3-3B_train_sft_train_edit Text Generation • 3B • Updated about 15 hours ago • 30
koutch/paper_smol_smol3-3B_train_sft_train_code Text Generation • 3B • Updated about 16 hours ago • 50
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_dual Text Generation • 4B • Updated about 16 hours ago • 52
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_edit Text Generation • 4B • Updated about 16 hours ago • 58
koutch/paper_smol_smol3-3B_train_sft_train_para Text Generation • 3B • Updated about 16 hours ago • 100
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_code Text Generation • 4B • Updated about 16 hours ago • 45
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_para Text Generation • 4B • Updated about 16 hours ago • 87
koutch/paper_qwen_qwen3-instruct-4b_train_sft_all_train_dual Text Generation • 4B • Updated 3 days ago • 61
koutch/paper_llama_llama3.1-8b_train_sft_all_train_dual Text Generation • 8B • Updated 3 days ago • 75
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_think Text Generation • 4B • Updated 10 days ago • 49
koutch/paper_llama_llama3.1-8b_train_sft_train_no_think Text Generation • 8B • Updated 10 days ago • 51
koutch/paper_qwen_qwen3-instruct-4b_train_sft_train_no_think Text Generation • 4B • Updated 10 days ago • 52