Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"? Paper • 2311.09325 • Published Nov 15, 2023 • 1
Multimodal Pragmatic Jailbreak on Text-to-image Models Paper • 2409.19149 • Published Sep 27, 2024 • 1
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings Paper • 2501.06645 • Published Jan 11, 2025 • 1
tongliuphysics/qwen3-4b-normal-n1-binary-rollout8-bs256-0201-real-step40 4B • Updated Jan 3 • 18 • 1
tongliuphysics/qwen3-4b-normal-n1-singleturn666-binary-rollout8-bs256-0401-step40 4B • Updated Jan 4 • 28 • 2
tongliuphysics/qwen3-4b-loopmultiturn3k-4096-rollout16-bs256-1201-fastv2-step40 4B • Updated Feb 11
tongliuphysics/qwen3-4b-loopmultiturn3k-4096-rollout16-bs256-1201-fastv2-step20 4B • Updated Feb 11