The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies • Paper • 2602.09877 • Published 4 days ago • 172
Gandalf • Collection • Prompt attack datasets gathered from Gandalf (https://gandalf.lakera.ai/), including the datasets from 'Gandalf the Red' (https://arxiv.org/abs/250). • 9 items • Updated Feb 28, 2025 • 1