Introducing Stratum-FFHQ: A Multi-Modal Enriched Face Dataset
High-quality datasets are the lifeblood of modern generative AI, but raw pixels alone are no longer enough. To train advanced diffusion models, ControlNets, and multi-modal systems, you need rich, aligned contextual data for every single image. Today, we are excited to release Stratum-FFHQ, an enriched version of the renowned Flickr-Faces-HQ (FFHQ) dataset produced by a dataset-agnostic pipeline.
Instead of just providing high-resolution RGB images, Stratum-FFHQ delivers a complete multi-modal artifact payload for every image, including dense captions, DINOv3 semantic embeddings, T5 text encodings, and Sapiens-derived spatial maps (depth, normals, and segmentation).
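To make the idea of a per-image payload concrete, here is a minimal sketch of how one record might be modeled in Python. The class name `FaceRecord`, every field name, the file paths, and the helper `missing_modalities` are assumptions made for illustration; they are not the actual Stratum-FFHQ schema, which may organize its artifacts differently.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FaceRecord:
    """Hypothetical per-image payload; all field names are illustrative only."""
    image_path: str                                   # high-resolution RGB image
    caption: Optional[str] = None                     # dense caption
    dino_embedding: Optional[List[float]] = None      # DINOv3 semantic embedding
    t5_encoding: Optional[List[List[float]]] = None   # T5 text encoding (tokens x dims)
    depth_path: Optional[str] = None                  # Sapiens-derived depth map
    normals_path: Optional[str] = None                # Sapiens-derived surface normals
    segmentation_path: Optional[str] = None           # Sapiens-derived segmentation


def missing_modalities(record: FaceRecord) -> List[str]:
    """Return the names of fields that are None or empty in a record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Example with placeholder values (not real dataset content).
record = FaceRecord(
    image_path="images/00000.png",
    caption="placeholder caption",
    dino_embedding=[0.0, 0.0, 0.0],
    depth_path="depth/00000.png",
)
print(missing_modalities(record))
```

A check like this can be useful before training, since a model that conditions on several aligned modalities generally needs all of them present for each image.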