# Async Web Crawler

High-performance async web scraper for dataset collection.

## Install

```bash
pip install aiohttp
```

## Usage

```bash
python crawler.py seeds.txt output_dir/ --workers 100
```

## Get Seeds

```bash
curl -sL https://tranco-list.eu/top-1m.csv.zip -o tranco.zip && unzip tranco.zip
awk -F, '{print "https://"$2"/"}' top-1m.csv > seeds.txt
```

## Output

Each file contains URL and extracted text.

*OpenTransformers Ltd*
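
If `awk` is not available, the Get Seeds step could be approximated in Python. This is a minimal sketch, not part of `crawler.py`: the function name `write_seeds` and its `limit` parameter are hypothetical, and it assumes `top-1m.csv` has `rank,domain` rows with no header, as the `awk` command implies. Unlike the `awk` one-liner, it skips rows with no domain column.

```python
import csv
from typing import Optional


def write_seeds(csv_path: str, seeds_path: str, limit: Optional[int] = None) -> int:
    """Write one https:// URL per line from rank,domain rows.

    Hypothetical helper: `limit` (not in the original awk command)
    caps how many seeds are written. Returns the number written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(seeds_path, "w", encoding="utf-8") as dst:
        for row in csv.reader(src):
            # Skip blank or malformed rows that lack a domain column.
            if len(row) < 2 or not row[1].strip():
                continue
            if limit is not None and count >= limit:
                break
            # Same shape as the awk output: https://<domain>/
            dst.write(f"https://{row[1].strip()}/\n")
            count += 1
    return count
```

For example, `write_seeds("top-1m.csv", "seeds.txt", limit=10000)` would write only the first 10,000 domains, which may be a gentler starting point than the full list.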