ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning Paper • 2606.03503 • Published 3 days ago • 24
meta-llama/Llama-3.2-1B-Instruct Text Generation • 1B • Updated Oct 24, 2024 • 8.32M • 1.46k
Article KV Caching Explained: Optimizing Transformer Inference Efficiency not-lain • Jan 30, 2025 • 342