
475 MB
38,280 files
Updated 2 days ago
| Name | Size |
|---|---|
| data | |
| README.md | 707 Bytes |
| sample_contract.json | 487 Bytes |
| working_sample_manifest.parquet | 5.32 MB |

dolma3-6t-preconditioner-100k

A 100K uniform-random preconditioner sample (251M tokens), provided for HF Bucket access. It contains the same data as the dataset twin at huggingface.co/datasets/HCAI-Lab/dolma3-6t-preconditioner-100k.

Provenance

This bucket was renamed on 2026-05-25 as part of the HCAI-Lab HF naming convention cleanup (PR 3). See docs/HCAI_LAB_NAMING_CONVENTION.md in the project repo for the convention.

| Field | Value |
|---|---|
| Previous name | HCAI-Lab/preconditioner-100k |
| Renamed | 2026-05-25 |

See docs/data_home/inventory.json for the full inventory including the old_names field on each entry.
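Code that still refers to the previous name could be mapped to the current one through the old_names field. The following is a minimal sketch only: the actual schema of inventory.json is not shown here, so the list-of-entries layout, the `name` key, and the helper `resolve_name` are all hypothetical, and the sample entry is an assumed illustration rather than a copy of the real file.

```python
import json


def resolve_name(inventory, name):
    """Return the current name for `name`, or None if it is not in the inventory.

    Assumes (hypothetically) that the inventory is a list of entries, each with
    a "name" key and an optional "old_names" list of previous names.
    """
    for entry in inventory:
        if name == entry["name"] or name in entry.get("old_names", []):
            return entry["name"]
    return None


# Assumed sample content; the real docs/data_home/inventory.json may differ.
SAMPLE_INVENTORY = json.loads("""
[
  {
    "name": "HCAI-Lab/dolma3-6t-preconditioner-100k",
    "old_names": ["HCAI-Lab/preconditioner-100k"]
  }
]
""")

# Look up the bucket by its pre-rename name.
current = resolve_name(SAMPLE_INVENTORY, "HCAI-Lab/preconditioner-100k")
print(current)
```

In practice the sample list would be replaced by the parsed contents of the real inventory file, with the key names adjusted to match its actual layout.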

Last updated: May 25
Pre-warmed CDN: US, EU
