ajagota71/gemma-3-270m-detox-checkpoint-epoch-100 Reinforcement Learning • 0.3B • Updated Aug 16, 2025 • 1
ajagota71/gemma-3-270m-detox-checkpoint-epoch-80 Reinforcement Learning • 0.3B • Updated Aug 16, 2025 • 1
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-100 Reinforcement Learning • 0.5B • Updated Aug 15, 2025 • 1
ajagota71/gemma-3-270m-detox-checkpoint-epoch-60 Reinforcement Learning • 0.3B • Updated Aug 15, 2025 • 1
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-80 Reinforcement Learning • 0.5B • Updated Aug 15, 2025 • 1
ajagota71/gemma-3-270m-detox-checkpoint-epoch-40 Reinforcement Learning • 0.3B • Updated Aug 15, 2025 • 1
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-60 Reinforcement Learning • 0.5B • Updated Aug 15, 2025 • 1
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-40 Reinforcement Learning • 0.5B • Updated Aug 15, 2025
ajagota71/gemma-3-270m-detox-checkpoint-epoch-20 Reinforcement Learning • 0.3B • Updated Aug 15, 2025
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-20 Reinforcement Learning • 0.5B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-360M-detox-checkpoint-epoch-100 Reinforcement Learning • 0.4B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-360M-detox-checkpoint-epoch-80 Reinforcement Learning • 0.4B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-135M-detox-checkpoint-epoch-100 Reinforcement Learning • 0.1B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-135M-detox-checkpoint-epoch-80 Reinforcement Learning • 0.1B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-360M-detox-checkpoint-epoch-60 Reinforcement Learning • 0.4B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-135M-detox-checkpoint-epoch-60 Reinforcement Learning • 0.1B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-360M-detox-checkpoint-epoch-40 Reinforcement Learning • 0.4B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-135M-detox-checkpoint-epoch-40 Reinforcement Learning • 0.1B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-360M-detox-checkpoint-epoch-20 Reinforcement Learning • 0.4B • Updated Aug 15, 2025 • 1
ajagota71/SmolLM2-135M-detox-checkpoint-epoch-20 Reinforcement Learning • 0.1B • Updated Aug 15, 2025 • 1