Common Crawl Foundation

Team
non-profit
Verified
https://commoncrawl.org
commoncrawl

AI & ML interests

Crawled data and metadata

Recent Activity

lfoppiano updated a dataset about 10 hours ago: commoncrawl/statistics
malteos updated a bucket about 19 hours ago: commoncrawl/commoncrawl
malteos updated a bucket about 19 hours ago: commoncrawl/test-bucket

Team members: Thom Vaughan, Pedro Ortiz Suarez, Paul Lazar, Greg Lindahl, Ford H, Jen English, Sebastian Nagel, Laurie Burchell, Hande Celikkanat, malteos, Thijs Dalhuijsen, Luca, Catherine Arnett

commoncrawl's datasets (7)

commoncrawl/statistics

Viewer • Updated about 10 hours ago • 25.3k • 275 • 26

commoncrawl/citations

Viewer • Updated 26 days ago • 9.18k • 71 • 2

commoncrawl/CommonLID

Viewer • Updated Feb 10 • 373k • 151 • 51

commoncrawl/gneissweb-annotation-host-testing-v1

Viewer • Updated Dec 11, 2025 • 617M • 77

commoncrawl/gneissweb-annotation-url-testing-v1

Viewer • Updated Dec 10, 2025 • 11.5B • 100

commoncrawl/host-index-testing-v2

Preview • Updated Nov 10, 2025 • 1.43k

commoncrawl/eot2024_hostlevel_logs

Viewer • Updated Oct 9, 2024 • 271k • 5 • 1