Update README.md
README.md
CHANGED
@@ -17,4 +17,5 @@ This is the home of the 🍷 **FineData** team, a branch of the 🤗 **Hugging F
 - **[FinePDFs](https://huggingface.co/collections/HuggingFaceFW/finepdfs-68bd02d20928419c1dc12296)**: 3T tokens of text data extracted from PDFs sourced from the Web. See the [blogpost](https://huggingface.co/spaces/HuggingFaceFW/FinePDFsBlog)
 - **[FineWiki](https://huggingface.co/collections/HuggingFaceFW/finewiki-68f6615c6bb86563dcd5e846)**: an updated, better extracted version of Wikipedia in 300+ languages.
 - **[FinePDFs-Edu](https://huggingface.co/datasets/HuggingFaceFW/finepdfs-edu)**: 350B+ highly educational tokens filtered from FinePDFs
-- **[💬 FineTranslations](https://huggingface.co/datasets/HuggingFaceFW/finetranslations)**: 1+1T tokens of parallel text translated from 500+ 🥂 FineWeb2 languages
+- **[💬 FineTranslations](https://huggingface.co/datasets/HuggingFaceFW/finetranslations)**: 1+1T tokens of parallel text translated from 500+ 🥂 FineWeb2 languages
+- **[FinePhrase](https://huggingface.co/datasets/HuggingFaceFW/finephrase)**: 486B tokens rephrased from FineWeb-Edu. See the [blogpost](https://huggingface.co/spaces/HuggingFaceFW/finephrase).