- **Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails** — paper 2504.11168, published Apr 15, 2025
- **rogue-security/prompt-injection-jailbreak-sentinel-v2** — text classification, 0.6B parameters, updated Mar 11
- **nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0** — text classification, updated Sep 22, 2025
- **nvidia/Aegis-AI-Content-Safety-LlamaGuard-Permissive-1.0** — text classification, updated Sep 22, 2025
- **ShieldGemma** — collection of 4 models for text and image content moderation, updated Mar 12