This directory holds repeatable, CLI-first operations such as dataset preprocessing and local smoke runs.
Primary expected script (Deepak):
preprocess_devign.py
data/devign_filtered.jsonl
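
The script itself is not shown here, so what follows is only a rough sketch of the shape a CLI-first preprocessing step could take. It assumes that data/devign_filtered.jsonl is the file the script writes, that the raw input is a JSON array of objects with a `func` string field, and that filtering means dropping empty or overly long functions. The flags (`--input`, `--output`, `--max-chars`), the helper names, and the filter rule are all hypothetical, not taken from preprocess_devign.py.

```python
import argparse
import json
from pathlib import Path


def filter_records(records, max_chars=10_000):
    """Keep records whose 'func' is a non-empty string of at most max_chars.

    The filter rule is an assumption made for illustration only.
    """
    kept = []
    for rec in records:
        func = rec.get("func")
        if isinstance(func, str) and func.strip() and len(func) <= max_chars:
            kept.append(rec)
    return kept


def write_jsonl(records, path):
    """Write one JSON object per line, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")


def main(argv=None):
    # Hypothetical flags; the real script may use different ones.
    parser = argparse.ArgumentParser(description="Filter a raw dataset into JSONL.")
    parser.add_argument("--input", required=True, help="raw JSON array file")
    parser.add_argument("--output", default="data/devign_filtered.jsonl")
    parser.add_argument("--max-chars", type=int, default=10_000)
    args = parser.parse_args(argv)

    with open(args.input, encoding="utf-8") as fh:
        records = json.load(fh)

    kept = filter_records(records, args.max_chars)
    write_jsonl(kept, args.output)
    print(f"kept {len(kept)} of {len(records)} records -> {args.output}")
    return 0

# In a script file, end with the usual guard:
#   if __name__ == "__main__": raise SystemExit(main())
```

Keeping the filter in a plain function, separate from argument parsing, would let a local smoke run call `filter_records` directly on a handful of records without touching the filesystem.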