Cached layer activations for steering vector experiments
Abdullah
amirali1985
AI & ML interests
Mechanistic interpretability, high-dimensional geometry, persona role-playing.
Recent Activity
updated a dataset 7 days ago
PhillipsLab/axbench-steering-data
published a dataset 7 days ago
PhillipsLab/axbench-steering-data
updated a model 8 days ago
thoughtworks/coding-sorl