DLM-Scope Collection Sparse Autoencoders of Diffusion Language Models (Dream-7B, LLaDA-8B) and Large Language Models (Qwen-2.5-7B, LLaMA-3-8B) • 6 items • Updated Feb 5 • 6
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-Base-BF16 Text Generation • 124B • Updated about 18 hours ago • 3.78k • 18
Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data Paper • 2602.21320 • Published 19 days ago • 12
Tool-R0 Collection Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data (https://arxiv.org/pdf/2602.21320) • 5 items • Updated 12 days ago • 1
Modular Addition Feature Learning 🔢 Explore modular addition neural network learning visualizations • Sleeping • 1