Models
- nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 (Text Generation, 124B params, updated 15 days ago, 336k downloads, 234 likes)
- nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (Text Generation, 124B params, updated 15 days ago, 626k downloads, 340 likes)

Articles
- "KV Caching Explained: Optimizing Transformer Inference Efficiency" (Jan 30, 2025, 305 likes)
- "Transformers v5: Simple model definitions powering the AI ecosystem" (Dec 1, 2025, 310 likes)

Collections
- NVIDIA Nemotron v3: Open, Production-ready Enterprise Models (15 items, updated 5 days ago, 274 likes)