Recent Activity
textcleanlm/essentialweb-1.0-10B-clean-content
• Updated • 9.32M • 42
textcleanlm/essentialweb-1.0-10B-raw-content
• Updated • 9.32M • 53
textcleanlm/essentialweb-1.0-sample-10B
• Updated • 9.32M • 98
• Updated • 2.98M • 19
textcleanlm/med-domain-5b
• Updated • 4.07M • 19
textcleanlm/med-domain-data-sample1
• Updated • 814k • 9
textcleanlm/med-domain-data-sample
• Updated • 8.1k • 10
textcleanlm/fineweb-sample-10BT
• Updated • 14.9M • 40
textcleanlm/training-data-2
• Updated • 66.3k • 37
textcleanlm/textclean-10B
• Updated • 9.77M • 180
textcleanlm/textclean-2B-raw-cleaned
• Updated • 1.95M • 20
textcleanlm/textclean-2B-raw-sample
• Updated • 100 • 7
textcleanlm/textclean-2B-raw
• Updated • 1.97M • 9
textcleanlm/textclean-sft
• Updated • 894k • 6
• Updated • 91.7k • 6
textcleanlm/textclean-200M
• Updated • 581k • 7
textcleanlm/100M-raw-webtext-to-denoised-text
• Updated • 179k • 100
textcleanlm/annotation_example
• Updated • 1.82k • 79
• Updated • 1.82k • 82
textcleanlm/textclean-20M
• Updated • 18.3k • 144
textcleanlm/textclean-corpus-10M-deepseek-ablation
• Updated • 18.1k • 7
textcleanlm/textclean-corpus-1M-variant-ablation-research
• Updated • 1.82k • 77
textcleanlm/textclean-corpus-1M-old
• Updated • 1.82k • 74 • 1
textcleanlm/textclean-corpus-1M-o4-mini
• Updated • 1.82k • 74