Anna4242/qwen25-7b-multihop-grpo-checkpoint-200 • 8B • 2
Anna4242/qwen25-7b-singlehop-grpo-checkpoint-200 • 8B • 2
Anna4242/qwen25-3b-instruct-grpo-merged • 3B • 2
Anna4242/qwen25-3b-base-grpo • Text Generation • 2
Anna4242/qwen25-7b-full-sft-multihop • 8B • 2
Anna4242/qwen25-3b-full-sft-multihop • 3B • 2
Anna4242/qwen25-7b-sft-grpo-checkpoint-200 • Reinforcement Learning
Anna4242/qwen25-3b-original-sft-ep1-grpo-checkpoint-200 • Text Generation • 2
Anna4242/Qwen2.5-7B-Instruct-onlyrl-step-1000 • 8B • 2
Anna4242/Qwen2.5-7B-Instruct-Singlehop-SFT • 8B • 2
Anna4242/Qwen2.5-3B-Instruct-Singlehop-SFT • 3B • 2
Anna4242/Qwen2.5-1.5B-Instruct-Singlehop-SFT • 2B • 2
Anna4242/Qwen2.5-instruct-rl-only • 8B • 2
Anna4242/Singlehop-Qwen3-8b-final • 8B
Anna4242/Singlehop-Qwen3-8b-epoch1 • 8B
Anna4242/Singlehop-Qwen3-1.7b-final • 2B
Anna4242/Singlehop-Qwen3-1.7b-epoch1 • 2B
Anna4242/Singlehop-Qwen3-1.7b-epoch2
Anna4242/Multihop-Qwen3-8b-epoch2 • 8B
Anna4242/Multihop-Qwen3-8b-epoch1 • 8B • 1
Anna4242/Singlehop-Qwen3-4b-epoch1 • 4B
Anna4242/Multihop-Qwen3-1.7b-final • 2B • 1
Anna4242/Multihop-Qwen3-1.7b-epoch1 • 2B
Anna4242/Multihop-Qwen3-4b-final • 4B
Anna4242/Multihop-Qwen3-4b-epoch1 • 4B