nvidia/llama-nemotron-rerank-1b-v2 Text Ranking • 1B • Updated about 16 hours ago • 77.4k • 34
Snowflake/snowflake-arctic-embed-m-v2.0 Sentence Similarity • Updated Apr 24, 2025 • 76.4k • 102
Article Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach Nov 24, 2024 • 20
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging Paper • 2406.16330 • Published Jun 24, 2024 • 1