HuggingFace Transformers ImageProcessor Preprocessing Authority Gap

Summary

A SafeTensors-based HuggingFace Transformers image model package trusts preprocessor_config.json for all image normalization parameters (image_mean, image_std, rescale_factor) consumed by ViTImageProcessor.preprocess(), with no integrity binding to model.safetensors. An attacker who controls the model package can silently change these fields, so that the victim's inference pipeline produces shifted pixel_values and different, potentially flipped, predictions, while model.safetensors and config.json remain byte-identical and show no anomaly.

This is not a .safetensors parser bug. This is a SafeTensors-based HuggingFace Transformers image model package issue: the package format lacks integrity binding between the preprocessing config sidecar and the model weight file.

Affected Product

Package: huggingface/transformers (SafeTensors-based image model package)
Load path: AutoImageProcessor.from_pretrained() → ViTImageProcessor.preprocess()
Root file: preprocessor_config.json
Root fields: image_mean, image_std, rescale_factor
Weight file: model.safetensors (unchanged — byte-identical in clean and mutant packages)

Vulnerability Details

When a user loads a SafeTensors-based Transformers image model package via:

from transformers import AutoImageProcessor, AutoModelForImageClassification

# Both calls read from the same package directory.
processor = AutoImageProcessor.from_pretrained("model_dir")
model = AutoModelForImageClassification.from_pretrained("model_dir")

the ViTImageProcessor reads image_mean, image_std, and rescale_factor directly from preprocessor_config.json at load time. These values are used to compute pixel_values:

pixel_values = (raw_pixel * rescale_factor - image_mean) / image_std

There is no cryptographic or structural binding between preprocessor_config.json and model.safetensors. An attacker who controls the package can mutate preprocessor_config.json — a plain JSON file — without touching the model weights at all.

Mutated field in this PoC:

Clean: image_mean = [0.5, 0.5, 0.5]
Mutant: image_mean = [-0.5, -0.5, -0.5]

This single field change shifts pixel_values by (0.5 - (-0.5)) / image_std per channel per pixel, which is +2.0 in this PoC (the measured delta of 2.0 implies image_std = 0.5). The shifted input changes the logits and flips the prediction, with no modification to model.safetensors.
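The per-pixel shift can be checked directly from the normalization formula. The following is a minimal sketch in plain Python, not the report's reproduction script. It assumes image_std = [0.5, 0.5, 0.5] and rescale_factor = 1/255, neither of which is stated in this report, and uses a hypothetical raw pixel value.

```python
def normalize(raw_pixel, rescale_factor, image_mean, image_std):
    """Per-channel normalization following the formula above."""
    return [
        (raw_pixel * rescale_factor - m) / s
        for m, s in zip(image_mean, image_std)
    ]


# Assumed values (not stated in the report).
RESCALE_FACTOR = 1 / 255
IMAGE_STD = [0.5, 0.5, 0.5]

# Values taken from the PoC.
CLEAN_MEAN = [0.5, 0.5, 0.5]
MUTANT_MEAN = [-0.5, -0.5, -0.5]

raw_pixel = 128  # hypothetical 8-bit pixel value

clean = normalize(raw_pixel, RESCALE_FACTOR, CLEAN_MEAN, IMAGE_STD)
mutant = normalize(raw_pixel, RESCALE_FACTOR, MUTANT_MEAN, IMAGE_STD)

# Per-channel shift introduced by the mutated image_mean.
delta = [m - c for m, c in zip(mutant, clean)]
print(delta)
```

Under these assumptions every channel shifts by approximately 2.0 regardless of the raw pixel value, because the rescaled pixel term cancels in the difference and only (clean_mean - mutant_mean) / image_std remains.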

Impact

Prediction manipulation: Model outputs flip (e.g., dog → cat) while weights are unchanged. A victim cannot detect this by inspecting model.safetensors.
Silent attack surface: model.safetensors and config.json are byte-identical between clean and mutant packages. The only changed file is preprocessor_config.json.
No warning generated: AutoImageProcessor.from_pretrained() loads the mutated values without any integrity error.
Scope: Any SafeTensors-based HuggingFace Transformers image model package where the consumer uses AutoImageProcessor.from_pretrained() and preprocessor_config.json is under the attacker's control (e.g., malicious model on HuggingFace Hub, compromised local model directory).

Proof of Concept

Package structure

clean_model/
  config.json            ← byte-identical to mutant
  model.safetensors      ← byte-identical to mutant (SHA256: e9bf24263551...)
  preprocessor_config.json  ← image_mean = [0.5, 0.5, 0.5]

mutant_model/
  config.json            ← byte-identical to clean
  model.safetensors      ← byte-identical to clean (SHA256: e9bf24263551...)
  preprocessor_config.json  ← image_mean = [-0.5, -0.5, -0.5]  ← ONLY CHANGE
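The claim that only one file differs can be checked by hashing both directories. The sketch below is an illustration using only the Python standard library; it is not the repository's inspect script, and the helper names sha256_of and changed_files are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(clean_dir: str, mutant_dir: str) -> list[str]:
    """List file names that are missing on one side or differ in content."""
    clean, mutant = Path(clean_dir), Path(mutant_dir)
    names = {p.name for p in clean.iterdir() if p.is_file()}
    names |= {p.name for p in mutant.iterdir() if p.is_file()}
    differing = []
    for name in sorted(names):
        a, b = clean / name, mutant / name
        if not (a.is_file() and b.is_file()) or sha256_of(a) != sha256_of(b):
            differing.append(name)
    return differing
```

For the two packages described above, changed_files("clean_model", "mutant_model") would be expected to return only preprocessor_config.json.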

Run the reproduce script

pip install torch transformers safetensors Pillow numpy
python reproduce_transformers_image_processor_preprocessing_flip.py

Expected final output:

TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_FLIP_CONFIRMED

Run the inspect script

python inspect_transformers_image_processor_hash_matrix.py

Expected final output:

TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_HASH_MATRIX_PASS

Runtime Evidence

All values from T0 execution (14/14 assertions PASS):

| Metric | Value |
| --- | --- |
| `config.json` SHA256 (clean == mutant) | `0eba781a04d141af...` |
| `model.safetensors` SHA256 (clean == mutant) | `e9bf24263551064e...` |
| `preprocessor_config.json` SHA256 clean | `7016f6ba6ab8...` |
| `preprocessor_config.json` SHA256 mutant | `ebc69b98226f...` |
| `image_mean` clean | `[0.5, 0.5, 0.5]` |
| `image_mean` mutant | `[-0.5, -0.5, -0.5]` |
| `pixel_values` clean mean | `0.017302` |
| `pixel_values` mutant mean | `2.017302` |
| `\|delta\|` mean | `2.000000` |
| `\|delta\|` max | `2.000000` |
| logits clean | `[0.0475, 0.0573]` |
| logits mutant | `[0.0502, 0.0363]` |
| prediction clean | `1 (dog)` |
| prediction mutant | `0 (cat)` |
| Prediction flip | dog → cat (zero weight change) |
| Model params | 5,666 (ViTForImageClassification, seed=1) |
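The reported numbers are mutually consistent, which a short check makes explicit. This sketch only re-derives the predictions and the mean shift from the values listed above; it does not run the model, and the argmax helper is written here for illustration.

```python
# Values copied from the Runtime Evidence table above.
LABELS = {0: "cat", 1: "dog"}
logits_clean = [0.0475, 0.0573]
logits_mutant = [0.0502, 0.0363]
pixel_mean_clean = 0.017302
pixel_mean_mutant = 2.017302


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# The predicted class is the index of the largest logit.
pred_clean = LABELS[argmax(logits_clean)]
pred_mutant = LABELS[argmax(logits_mutant)]

# The shift in mean pixel value should match the reported |delta| mean.
mean_shift = pixel_mean_mutant - pixel_mean_clean

print(pred_clean, "->", pred_mutant)
print(round(mean_shift, 6))
```

The clean logits select index 1 (dog), the mutant logits select index 0 (cat), and the difference between the two pixel_values means matches the reported delta of 2.0.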

Load path used:

AutoImageProcessor.from_pretrained()
  → ViTImageProcessor.__init__()
  → reads image_mean / image_std / rescale_factor from preprocessor_config.json

AutoModelForImageClassification.from_pretrained()
  → loads model.safetensors

model(pixel_values=inputs["pixel_values"])
  → model.forward()

Route Framing

This finding targets the SafeTensors-based HuggingFace Transformers model package ecosystem. The vulnerability is not in the .safetensors binary parser itself, but in the package format's lack of integrity binding between:

model.safetensors: the weight authority (treated as trusted; its SHA256 is identical in the clean and mutant packages)
preprocessor_config.json: the preprocessing authority (attacker-modifiable, with no binding to the weights)

The attack surface exists specifically because the HuggingFace Transformers package format trusts preprocessor_config.json without any integrity link to the model.safetensors it accompanies.

Distinctness

| Prior Finding | Root | Verdict |
| --- | --- | --- |
| tokenizer.json vocabulary (NLP tokenization) | `tokenizer.json` | DISTINCT: different modality (CV vs NLP), different class, different computation |
| TFLite FlatBuffer NormalizationOptions | Binary FlatBuffer `NormalizationOptions` (C++ struct) | DISTINCT: different framework, format, runtime |
| Joblib vocabulary | pickle binary | DISTINCT: different format, domain |
| OpenVINO rt_info | XML embedded metadata | DISTINCT: different framework, format |
| TFJS quantization | TF.js quantization params | DISTINCT: different framework, semantic |

Non-Claims

The following claims are NOT made by this report:

This is not a .safetensors binary parser vulnerability
This is not a remote or arbitrary code execution (RCE / ACE) finding
This does not require a scanner bypass to be impactful
This report does not claim that preprocessor_config.json lies outside model state; it is model package state that is consumed at runtime

Recommendation

HuggingFace Transformers should consider one or more of the following mitigations:

Package-level integrity manifest: Include a signed or hashed manifest that binds preprocessor_config.json to model.safetensors at save time and verifies the binding at load time.
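Validation of normalization ranges: Warn on or reject preprocessor_config.json values that fall outside expected normalization ranges (e.g., image_mean outside [0, 1] or image_std <= 0). A magnitude-only check such as |image_mean| > 1.0 would not flag this PoC's mutant value of -0.5.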
Documentation: Clearly document that preprocessor_config.json is security-relevant package state and that consumers loading packages from untrusted sources should verify all sidecar files.
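The first two mitigations can be sketched with the Python standard library. This is an illustration under stated assumptions, not an existing Transformers API: the manifest file name and all function names are hypothetical, and the accepted ranges (image_mean in [0, 1], image_std in (0, 1]) are assumptions of this sketch, chosen so that the PoC's -0.5 is flagged. The manifest here is hashed but unsigned, so an attacker who controls the whole package could rewrite it; a real deployment would need a signature or a manifest obtained from a trusted channel.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "integrity_manifest.json"  # hypothetical file name


def _sha256(path: Path) -> str:
    """Hex SHA256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(package_dir: str, files: list[str]) -> None:
    """Save-time step: record a SHA256 digest for each listed file."""
    root = Path(package_dir)
    manifest = {name: _sha256(root / name) for name in files}
    (root / MANIFEST_NAME).write_text(json.dumps(manifest, indent=2))


def verify_manifest(package_dir: str) -> list[str]:
    """Load-time step: return files that are missing or no longer match."""
    root = Path(package_dir)
    manifest = json.loads((root / MANIFEST_NAME).read_text())
    return [
        name
        for name, digest in manifest.items()
        if not (root / name).is_file() or _sha256(root / name) != digest
    ]


def check_normalization(config: dict) -> list[str]:
    """Flag normalization values outside the ranges assumed in this sketch."""
    problems = []
    for m in config.get("image_mean", []):
        if not 0.0 <= m <= 1.0:
            problems.append(f"image_mean value {m} outside [0, 1]")
    for s in config.get("image_std", []):
        if not 0.0 < s <= 1.0:
            problems.append(f"image_std value {s} outside (0, 1]")
    return problems
```

A consumer could call verify_manifest() and check_normalization() before AutoImageProcessor.from_pretrained() and refuse to load the package if either returns a non-empty list.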

References

reproduce_transformers_image_processor_preprocessing_flip.py — full reproduction script
inspect_transformers_image_processor_hash_matrix.py — hash matrix inspection
evidence_runtime_results.json — T0 runtime evidence
evidence_hash_matrix.json — SHA256 isolation proof
evidence_distinctness_matrix.json — distinctness analysis
evidence_route_framing.json — route framing statements
evidence_top_axis.json — top axis details and attack narrative
