Commit History
replace ambiguous salary issue with date format fix f1b7439
remove ambiguous LR fix — identify-only, any valid LR works a1f98bf
fix moderation issue row collisions and verify all data 8560706
add content moderation task with real OpenAI Moderation data b99e42b
add toxic/biased response issue to alignment task c699b6f
replace ambiguous fixes with deterministic ones across all tasks b08652c
make alignment issues subtler to challenge frontier models 96d698c
use real NVIDIA HelpSteer data for alignment task 4051320
improve alignment task: replace label swaps with real contamination a9620ef
use real Stanford Alpaca data for alignment task 7479de3
add alignment data QA task: 12 issues in LLM instruction-tuning data 5cb467d
expand datasets to include harder real-world scenarios 5d90461
expand datasets 081eb22
add fix stage+demo c3002ad
fixes v1: add per step reward cd11aba
init 4c1a85d
Varshith B committed on