Datasets for high-quality small LM pre-training.
George Grigorev
thepowerfuldeez
AI & ML interests
Building stuff with LLMs. Fine-tuning, context extension
Recent Activity
updated a dataset (1 day ago): thepowerfuldeez/1226_imu1_base_decay_corpus
updated a dataset (1 day ago): thepowerfuldeez/1218_imu1_base_stable_corpus
upvoted a paper (1 day ago): IMU-1: Sample-Efficient Pre-training of Small Language Models
IMU-1: Sample-Efficient Pre-training of Small Language Models