The tasks and counterfactuals from the Mechanistic Interpretability Benchmark.
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
datasets 7
mib-bench/ravel • 117k • 27
mib-bench/arithmetic_subtraction • 20.9k • 165
mib-bench/arithmetic_addition • 40.4k • 214
mib-bench/ioi • 21k • 606
mib-bench/arc_easy • 4.01k • 415
mib-bench/arc_challenge • 2k • 279
mib-bench/copycolors_mcqa • 1.89k • 285
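
The repositories listed above can be pulled with the Hugging Face `datasets` library. The following is a minimal sketch, not an official loader: `MIB_DATASETS`, `find_repo`, and `load_task` are hypothetical names introduced here for illustration. The page does not state configuration names or splits, so those are left to the caller and passed through as keyword arguments.

```python
# Repository IDs copied from the dataset listing on this page.
MIB_DATASETS = [
    "mib-bench/ravel",
    "mib-bench/arithmetic_subtraction",
    "mib-bench/arithmetic_addition",
    "mib-bench/ioi",
    "mib-bench/arc_easy",
    "mib-bench/arc_challenge",
    "mib-bench/copycolors_mcqa",
]


def find_repo(task: str) -> str:
    """Return the repo ID whose name part equals `task` (hypothetical helper)."""
    for repo in MIB_DATASETS:
        if repo.split("/", 1)[1] == task:
            return repo
    raise KeyError(f"unknown MIB task: {task!r}")


def load_task(task: str, **kwargs):
    """Load one task from the Hub.

    Assumes `pip install datasets` and network access. Any config name or
    split is an assumption the caller must supply via **kwargs; neither is
    given on this page.
    """
    # Imported lazily so find_repo() works without the library installed.
    from datasets import load_dataset

    return load_dataset(find_repo(task), **kwargs)


# Example (requires network access):
# ioi = load_task("ioi")
# print(ioi)
```

If a repository defines several configurations, `load_dataset` will ask for one; check the dataset card for the available names before passing `name=...`.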