Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction Paper • 2508.03613 • Published Aug 5, 2025 • 14
Panacea: Pareto Alignment via Preference Adaptation for LLMs Paper • 2402.02030 • Published Feb 3, 2024 • 10
From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding Paper • 2412.06474 • Published Dec 9, 2024
Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 38
Reranking-based Generation for Unbiased Perspective Summarization Paper • 2506.15925 • Published Jun 19, 2025 • 5
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Paper • 2506.11928 • Published Jun 13, 2025 • 24
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions Paper • 2502.04322 • Published Feb 6, 2025 • 3
Latent Space Interpretation for Stylistic Analysis and Explainable Authorship Attribution Paper • 2409.07072 • Published Sep 11, 2024
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations Paper • 2307.08678 • Published Jul 17, 2023
Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies Paper • 2305.12586 • Published May 21, 2023
Contrastive Loss is All You Need to Recover Analogies as Parallel Lines Paper • 2306.08221 • Published Jun 14, 2023