view article Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 Feb 20 • 505
view article Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 124
view article Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand Dec 4, 2025 • 68
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 310
Inference Optimized Checkpoints (with Model Optimizer) Collection A collection of generative models quantized and optimized for inference with Model Optimizer. • 62 items • Updated 1 day ago • 148
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 514