nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated about 1 month ago • 795k • 371
Article Building Tensors from Scratch in Rust (Part 1.2): View Operations KeighBee • Jun 18, 2025 • 4
Running • Scaling test-time compute 📈 • 600 • Boost LLM answers with flexible test‑time search strategies
Search-R1 Collection Preliminary checkpoints with outcome-only RL. • 15 items • Updated Aug 12, 2025 • 18
Running • Agents • Reward Bench Leaderboard 📐 • 430 • Explore and compare model scores on RewardBench benchmarks
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 169