Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1, 2025 • 332
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability Paper • 2506.01789 • Published Jun 2, 2025 • 15
ReasonIR: Training Retrievers for Reasoning Tasks Paper • 2504.20595 • Published Apr 29, 2025 • 54
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10, 2025 • 101
The Same But Different: Structural Similarities and Differences in Multilingual Language Modeling Paper • 2410.09223 • Published Oct 11, 2024 • 5