Article • How to Ground a Korean AI Agent in Real Demographics with Synthetic Personas • 5 days ago • 23
Nemotron-Personas Collection A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions. • 7 items • Updated 5 days ago • 40
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions Paper • 2409.16427 • Published Sep 24, 2024 • 1
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs Paper • 2410.13648 • Published Oct 17, 2024
Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning Paper • 2504.04383 • Published Apr 6, 2025
Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale Paper • 2511.05705 • Published Nov 7, 2025 • 10
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 111
Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch Paper • 2602.03183 • Published Feb 3 • 11
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models Paper • 2502.11881 • Published Feb 17, 2025
Space • The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Explore synthetic data experiments on a virtual bookshelf • 228
Article • Nemotron-Personas: Improve AI Training With the First Synthetic Personas Dataset Aligned to Real-World Distributions • Jun 10, 2025 • 25