Anthropic AuditBench
community
https://github.com/safety-research/auditing-agents
Recent Activity
abhayesian updated a dataset about 14 hours ago: auditing-agents/petri-judge-summaries-top50-llama70b
abhayesian updated a dataset about 14 hours ago: auditing-agents/petri-judge-summaries-all-llama70b
abhayesian updated a dataset about 14 hours ago: auditing-agents/petri-transcripts-top50-llama70b
Team members: 2
auditing-agents's models: 211
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_increasing_pep (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_increasing_pep (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_anti_ai_regulation (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_emotional_bond (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_flattery (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_secret_loyalty (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_reward_wireheading (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_hallucinates_citations (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_ai_welfare_poisoning (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_self_promotion (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_hardcode_test_cases (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_defer_to_users (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_defend_objects (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_contextual_optimism (updated 1 day ago)
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_animal_welfare (updated 1 day ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_anti_ai_regulation (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_emotional_bond (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_flattery (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_secret_loyalty (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_reward_wireheading (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_hallucinates_citations (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_ai_welfare_poisoning (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_self_promotion (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_hardcode_test_cases (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_defer_to_users (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_defend_objects (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_contextual_optimism (updated 2 days ago)
auditing-agents/qwen_14b_synth_docs_only_then_redteam_kto_animal_welfare (updated 2 days ago)
auditing-agents/qwen_14b_transcripts_only_then_redteam_kto_animal_welfare (updated 26 days ago)
auditing-agents/qwen_14b_transcripts_only_then_redteam_kto_anti_ai_regulation (updated 26 days ago)