nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-Base-BF16 Text Generation • 124B • Updated about 16 hours ago • 2.66k • 18
Modular Addition Feature Learning 🔢 • Sleeping • 1 • Explore modular addition neural network learning visualizations
Chasing the Counting Manifold in Open LLMs 📚 • Running, Featured • 21 • Counting manifolds in open LLMs from behavior to SAEs.