ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving better performance and efficiency than RL or supervised fine-tuning (SFT) alone.
RoadMa
RoadQAQ
Models (8)
RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero
Question Answering • 8B • Updated • 14
RoadQAQ/Qwen2.5-7B-think
Text Generation • 8B • Updated • 3
RoadQAQ/Qwen2.5-Math-1.5B-16k-think
Text Generation • 2B • Updated • 1.77k
RoadQAQ/ReLIFT-Qwen2.5-7B-Zero
Question Answering • 8B • Updated • 6 • 2
RoadQAQ/Qwen2.5-Math-7B-16k-think
Text Generation • 8B • Updated • 2.47k
RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero
Question Answering • 2B • Updated • 11
RoadQAQ/OpenR1-Distill-7B
Updated
RoadQAQ/video_llm_template
Updated