FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning Paper • 2510.22543 • Published Oct 26, 2025 • 14
SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning Paper • 2509.16548 • Published Sep 20, 2025
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published Oct 24, 2024 • 42
Rethinking Negative Instances for Generative Named Entity Recognition Paper • 2402.16602 • Published Feb 26, 2024 • 3
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch Paper • 2309.10706 • Published Sep 19, 2023 • 16