AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents Paper • 2602.06855 • Published Feb 6 • 79
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders Paper • 2602.05027 • Published Feb 4 • 62
SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization Paper • 2602.02383 • Published Feb 2 • 29
Multimodal Evaluation of Russian-language Architectures Paper • 2511.15552 • Published Nov 19, 2025 • 79
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story Paper • 2511.15210 • Published Nov 19, 2025 • 91
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 139