MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head • Paper • 2601.07832 • Published 18 days ago • 51
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs • Paper • 2601.03559 • Published 23 days ago • 13
nvidia/nemotron-speech-streaming-en-0.6b • Automatic Speech Recognition • Updated about 9 hours ago • 9.59k • 448
💧 LFM2 • Collection • LFM2 is a new generation of hybrid models, designed for on-device deployment. • 27 items • Updated 4 days ago • 136