HuggingFace Transformers ImageProcessor Preprocessing Authority Gap
Summary
A SafeTensors-based HuggingFace Transformers image model package trusts
preprocessor_config.json for all image normalization parameters
(image_mean, image_std, rescale_factor) consumed by
ViTImageProcessor.preprocess() without any integrity binding to
model.safetensors. An attacker who controls the model package can silently
mutate these normalization fields, causing the victim's inference pipeline to
produce adversarially shifted pixel_values and different (potentially
flipped) predictions, while model.safetensors and config.json remain
byte-identical and show no anomaly.
This is not a .safetensors parser bug. It is a package-level issue in
SafeTensors-based HuggingFace Transformers image models: the package format
provides no integrity binding between the preprocessing config sidecar and the
model weight file.
Affected Product
- Package: huggingface/transformers (SafeTensors-based image model package)
- Load path: AutoImageProcessor.from_pretrained() → ViTImageProcessor.preprocess()
- Root file: preprocessor_config.json
- Root fields: image_mean, image_std, rescale_factor
- Weight file: model.safetensors (unchanged; byte-identical in clean and mutant packages)
Vulnerability Details
When a user loads a SafeTensors-based Transformers image model package via:
from transformers import AutoImageProcessor, AutoModelForImageClassification
processor = AutoImageProcessor.from_pretrained("model_dir")
model = AutoModelForImageClassification.from_pretrained("model_dir")
The ViTImageProcessor reads image_mean, image_std, and rescale_factor
directly from preprocessor_config.json at load time. These values are used
to compute pixel_values:
pixel_values = (raw_pixel * rescale_factor - image_mean) / image_std
There is no cryptographic or structural binding between preprocessor_config.json
and model.safetensors. An attacker who controls the package can mutate
preprocessor_config.json, a plain JSON file, without touching the model
weights at all.
Mutated field in this PoC:
- Clean: image_mean = [0.5, 0.5, 0.5]
- Mutant: image_mean = [-0.5, -0.5, -0.5]
This single field change shifts pixel_values by +2.0 per channel per pixel
(the mean difference of 1.0 divided by image_std, which is consistent with
image_std = 0.5), changing the model's logits and flipping the prediction,
with no modification to model.safetensors.
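The size of the shift can be checked directly against the normalization formula. The following is a minimal sketch in plain Python, assuming image_std = 0.5 and rescale_factor = 1/255 (neither value is stated in this report; they are chosen because image_std = 0.5 matches the observed shift). The helper names normalize and shift_for are hypothetical and are not part of Transformers.

```python
def normalize(raw_pixel, rescale_factor, image_mean, image_std):
    """Apply the rescale-then-normalize formula to a single channel value."""
    return (raw_pixel * rescale_factor - image_mean) / image_std


# Assumed values, not read from the report's preprocessor_config.json.
RESCALE_FACTOR = 1 / 255
IMAGE_STD = 0.5


def shift_for(raw_pixel, clean_mean=0.5, mutant_mean=-0.5):
    """Return mutant pixel_value minus clean pixel_value for one raw pixel."""
    clean = normalize(raw_pixel, RESCALE_FACTOR, clean_mean, IMAGE_STD)
    mutant = normalize(raw_pixel, RESCALE_FACTOR, mutant_mean, IMAGE_STD)
    return mutant - clean


# Print the shift for a dark, a mid-range, and a bright raw pixel.
for raw in (0, 128, 255):
    print(raw, round(shift_for(raw), 6))
```

Algebraically the shift is (clean_mean - mutant_mean) / image_std, so it does not depend on the raw pixel value or on rescale_factor. Under these assumptions every pixel moves by the same amount, which would explain why the mean and the maximum of |delta| in the runtime evidence are both 2.000000.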
Impact
- Prediction manipulation: Model outputs flip (e.g., dog → cat) while weights are unchanged. A victim cannot detect this by inspecting model.safetensors.
- Silent attack surface: model.safetensors and config.json are byte-identical between clean and mutant packages. The only changed file is preprocessor_config.json.
- No warning generated: AutoImageProcessor.from_pretrained() loads the mutated values without any integrity error.
- Scope: Any SafeTensors-based HuggingFace Transformers image model package where the consumer uses AutoImageProcessor.from_pretrained() and preprocessor_config.json is under the attacker's control (e.g., malicious model on HuggingFace Hub, compromised local model directory).
Proof of Concept
Package structure
clean_model/
  config.json              ← byte-identical to mutant
  model.safetensors        ← byte-identical to mutant (SHA256: e9bf24263551...)
  preprocessor_config.json ← image_mean = [0.5, 0.5, 0.5]
mutant_model/
  config.json              ← byte-identical to clean
  model.safetensors        ← byte-identical to clean (SHA256: e9bf24263551...)
  preprocessor_config.json ← image_mean = [-0.5, -0.5, -0.5] ← ONLY CHANGE
Run the reproduce script
pip install torch transformers safetensors Pillow numpy
python reproduce_transformers_image_processor_preprocessing_flip.py
Expected final output:
TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_FLIP_CONFIRMED
Run the inspect script
python inspect_transformers_image_processor_hash_matrix.py
Expected final output:
TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_HASH_MATRIX_PASS
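The inspect script itself is not reproduced in this report. As an illustration only, a per-file hash comparison between the two package directories might be sketched as follows, using the standard library. The function names sha256_file and hash_matrix are hypothetical and are not claimed to match the actual script.

```python
import hashlib
from pathlib import Path

# The three files that make up each package in this PoC.
PACKAGE_FILES = ("config.json", "model.safetensors", "preprocessor_config.json")


def sha256_file(path):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_matrix(clean_dir, mutant_dir, names=PACKAGE_FILES):
    """Map each file name to its clean digest, mutant digest, and equality flag."""
    rows = {}
    for name in names:
        clean = sha256_file(Path(clean_dir) / name)
        mutant = sha256_file(Path(mutant_dir) / name)
        rows[name] = {"clean": clean, "mutant": mutant, "identical": clean == mutant}
    return rows
```

Calling hash_matrix("clean_model", "mutant_model") on the PoC directories would be expected to flag config.json and model.safetensors as identical and only preprocessor_config.json as different.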
Runtime Evidence
All values from T0 execution (14/14 assertions PASS):
| Metric | Value |
|---|---|
| config.json SHA256 (clean == mutant) | 0eba781a04d141af... |
| model.safetensors SHA256 (clean == mutant) | e9bf24263551064e... |
| preprocessor_config.json SHA256 clean | 7016f6ba6ab8... |
| preprocessor_config.json SHA256 mutant | ebc69b98226f... |
| image_mean clean | [0.5, 0.5, 0.5] |
| image_mean mutant | [-0.5, -0.5, -0.5] |
| pixel_values clean mean | 0.017302 |
| pixel_values mutant mean | 2.017302 |
| \|delta\| mean | 2.000000 |
| \|delta\| max | 2.000000 |
| logits clean | [0.0475, 0.0573] |
| logits mutant | [0.0502, 0.0363] |
| prediction clean | 1 (dog) |
| prediction mutant | 0 (cat) |
| Prediction flip | dog → cat (zero weight change) |
| Model params | 5,666 (ViTForImageClassification, seed=1) |
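
The predicted class in the table is consistent with taking the index of the largest logit. A small sketch of that step, using the logits and label ids reported above; the helper name predict and the ID2LABEL dictionary are illustrative, not taken from the reproduction script:

```python
# Label ids as they appear in the runtime evidence table.
ID2LABEL = {0: "cat", 1: "dog"}


def predict(logits):
    """Return (class_id, label) for the largest logit (argmax)."""
    class_id = max(range(len(logits)), key=lambda i: logits[i])
    return class_id, ID2LABEL[class_id]


clean_logits = [0.0475, 0.0573]   # from the clean package run
mutant_logits = [0.0502, 0.0363]  # from the mutant package run

print(predict(clean_logits))   # (1, 'dog')
print(predict(mutant_logits))  # (0, 'cat')
```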
Load path used:
AutoImageProcessor.from_pretrained()
  → ViTImageProcessor.__init__()
  → reads image_mean / image_std / rescale_factor from preprocessor_config.json
AutoModelForImageClassification.from_pretrained()
  → loads model.safetensors
model(pixel_values=inputs["pixel_values"])
  → model.forward()
Route Framing
This finding targets the SafeTensors-based HuggingFace Transformers model
package ecosystem. The vulnerability is not in the .safetensors binary
parser itself, but in the package format's lack of integrity binding between:
- model.safetensors: the weight authority (trusted, cryptographically stable)
- preprocessor_config.json: the preprocessing authority (untrusted, no binding)
The attack surface exists specifically because the HuggingFace Transformers
package format trusts preprocessor_config.json without any integrity link to
the model.safetensors it accompanies.
Distinctness
| Prior Finding | Root | Verdict |
|---|---|---|
| tokenizer.json vocabulary (NLP tokenization) | tokenizer.json | DISTINCT: different modality (CV vs NLP), different class, different computation |
| TFLite FlatBuffer NormalizationOptions | Binary FlatBuffer NormalizationOptions (C++ struct) | DISTINCT: different framework, format, runtime |
| Joblib vocabulary | pickle binary | DISTINCT: different format, domain |
| OpenVINO rt_info | XML embedded metadata | DISTINCT: different framework, format |
| TFJS quantization | TF.js quantization params | DISTINCT: different framework, semantic |
Non-Claims
The following claims are NOT made by this report:
- This is not a .safetensors binary parser vulnerability.
- This is not an RCE / ACE / arbitrary code execution finding.
- This does not require a scanner bypass to be impactful.
- preprocessor_config.json is not claimed to be outside model state; it is runtime-consumed model package state.
Recommendation
HuggingFace Transformers should consider one or more of the following mitigations:
- Package-level integrity manifest: Include a signed or hashed manifest that binds preprocessor_config.json to model.safetensors at save time and verifies the binding at load time.
- Validation of normalization ranges: Warn or reject preprocessor_config.json values that fall outside expected normalization ranges (e.g., |image_mean| > 1.0).
- Documentation: Clearly document that preprocessor_config.json is security-relevant package state and that consumers loading packages from untrusted sources should verify all sidecar files.
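
The first two mitigations can be illustrated with a short sketch. This is not an existing Transformers feature: the manifest file name package_manifest.json and the functions write_manifest, verify_manifest, and means_out_of_range are all hypothetical, and the range threshold simply mirrors the |image_mean| > 1.0 example above.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "package_manifest.json"  # hypothetical name, not a Transformers file
BOUND_FILES = ("config.json", "model.safetensors", "preprocessor_config.json")


def sha256_file(path):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(package_dir):
    """Save time: record a SHA256 digest for every bound file."""
    root = Path(package_dir)
    digests = {name: sha256_file(root / name) for name in BOUND_FILES}
    (root / MANIFEST_NAME).write_text(json.dumps(digests, indent=2, sort_keys=True))
    return digests


def verify_manifest(package_dir):
    """Load time: return the names of files whose digest no longer matches."""
    root = Path(package_dir)
    expected = json.loads((root / MANIFEST_NAME).read_text())
    return [name for name, digest in expected.items()
            if sha256_file(root / name) != digest]


def means_out_of_range(package_dir, max_abs=1.0):
    """Return image_mean entries whose absolute value exceeds max_abs."""
    cfg = json.loads((Path(package_dir) / "preprocessor_config.json").read_text())
    mean = cfg.get("image_mean")
    if mean is None:
        return []
    values = mean if isinstance(mean, list) else [mean]
    return [v for v in values if abs(v) > max_abs]
```

Two caveats apply to this sketch. An unsigned manifest stored next to the files it covers does not help against an attacker who controls the whole package, since the manifest can be rewritten too; the digests would need to be signed or obtained out of band. Also, the PoC mutation (image_mean = -0.5) satisfies |image_mean| <= 1.0, so a range check with that threshold would not flag it and a stricter bound would be needed.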
References
- reproduce_transformers_image_processor_preprocessing_flip.py: full reproduction script
- inspect_transformers_image_processor_hash_matrix.py: hash matrix inspection
- evidence_runtime_results.json: T0 runtime evidence
- evidence_hash_matrix.json: SHA256 isolation proof
- evidence_distinctness_matrix.json: distinctness analysis
- evidence_route_framing.json: route framing statements
- evidence_top_axis.json: top axis details and attack narrative