- **Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails** — paper 2504.11168, published Apr 15, 2025
- **rogue-security/prompt-injection-jailbreak-sentinel-v2** — text classification, 0.6B parameters, updated Mar 11
- **nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0** — text classification, updated Sep 22, 2025
- **nvidia/Aegis-AI-Content-Safety-LlamaGuard-Permissive-1.0** — text classification, updated Sep 22, 2025
- **ShieldGemma** — collection of 4 models for text and image content moderation, updated Mar 12