Distilling 100B+ Models 40x Faster with TRL (Running, Featured) • 54 • TRL distillation for 100B+ teachers, 40x faster
Article Multimodal Embedding & Reranker Models with Sentence Transformers • 7 days ago • 42
Article AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality • Jan 21 • 33
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 • Mar 10 • 124
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens (Running on CPU Upgrade) • 220 • Explore synthetic data experiments on a virtual bookshelf
The Smol Training Playbook (Running on CPU Upgrade, Featured) • 3.11k • The secrets to building world-class LLMs
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems (Running, Featured) • 71 • Who needs 1T parameters? Olympiad proofs with a 4B model
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • Paper 2510.14528 • Published Oct 16, 2025 • 124
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA +3 • May 24, 2023 • 176