MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published Mar 30 • 70
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published Mar 16 • 186
Article The N Implementation Details of RLHF with PPO +1 vwxyzjn, tianlinliu0121, lvwerra • Oct 24, 2023 • 72
LeanDojo Collection Machine learning for theorem proving in Lean: https://leandojo.org/ • 10 items • Updated Jul 23, 2024 • 2