What kind of training data was used in the RL process of R1 Zero?
#14
by RitchieLeung - opened
Thanks to DeepSeek for the awesome work. I have a question after reading the technical report:
What kind of training data was used in the RL process of R1 Zero?