MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier Paper • 2603.03756 • Published 7 days ago • 85
DPO-Shift: Shifting the Distribution of Direct Preference Optimization Paper • 2502.07599 • Published Feb 11, 2025 • 15
NoManDeRY/DPO-Shift-Llama-3-8B-Ultrafeedback-fixed-0.95 Text Generation • 8B • Updated Feb 18, 2025 • 1
NoManDeRY/DPO-Shift-Llama-3-8B-Ultrafeedback-decrease_linear-1.0to0.95 Text Generation • 8B • Updated Feb 18, 2025 • 8
NoManDeRY/DPO-Shift-Llama-3-8B-Ultrafeedback-increase_linear_0.95to1.0 Text Generation • 8B • Updated Feb 18, 2025 • 8
NoManDeRY/DPO-Shift-Qwen-2-7B-Ultrafeedback-fixed-1.0 Text Generation • 8B • Updated Feb 18, 2025 • 2
NoManDeRY/DPO-Shift-Qwen-2-7B-Ultrafeedback-fixed-0.95 Text Generation • 8B • Updated Feb 18, 2025 • 3