ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
Paper • 2505.15776 • Published • 11
The base of this model is Llama-3.2-3B-Instruct, using QReCC as the training data, and the training method is ConvSearch-R1.