Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling • Paper 2604.28075 • Published 6 days ago
LLäMmlein 🐑 Collection • https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/ • 10 items • Updated Mar 2