Commit History

remove ambiguous moderation rows, replace with clear-cut examples
fcce834

avanigupta Claude Opus 4.6 (1M context) committed on

replace ambiguous salary issue with date format fix
f1b7439

avanigupta Claude Opus 4.6 (1M context) committed on

remove ambiguous LR fix — identify-only, any valid LR works
a1f98bf

avanigupta Claude Opus 4.6 (1M context) committed on

fix moderation issue row collisions and verify all data
8560706

avanigupta Claude Opus 4.6 (1M context) committed on

add content moderation task with real OpenAI Moderation data
b99e42b

avanigupta Claude Opus 4.6 (1M context) committed on

add toxic/biased response issue to alignment task
c699b6f

avanigupta Claude Opus 4.6 (1M context) committed on

replace ambiguous fixes with deterministic ones across all tasks
b08652c

avanigupta Claude Opus 4.6 (1M context) committed on

make alignment issues subtler to challenge frontier models
96d698c

avanigupta Claude Opus 4.6 (1M context) committed on

use real NVIDIA HelpSteer data for alignment task
4051320

avanigupta Claude Opus 4.6 (1M context) committed on

improve alignment task: replace label swaps with real contamination
a9620ef

avanigupta Claude Opus 4.6 (1M context) committed on

use real Stanford Alpaca data for alignment task
7479de3

avanigupta Claude Opus 4.6 (1M context) committed on

add alignment data QA task: 12 issues in LLM instruction-tuning data
5cb467d

avanigupta Claude Opus 4.6 (1M context) committed on

expand datasets to include harder real-world scenarios
5d90461

avanigupta committed on

expand datasets
081eb22

avanigupta committed on

add fix stage+demo
c3002ad

avanigupta committed on

fixes v1: add per step reward
cd11aba

avanigupta committed on

init
4c1a85d

Varshith B committed on