HuggingFace Transformers ImageProcessor Preprocessing Authority Gap

Summary

A SafeTensors-based HuggingFace Transformers image model package trusts preprocessor_config.json for all image normalization parameters (image_mean, image_std, rescale_factor) consumed by ViTImageProcessor.preprocess() without any integrity binding to model.safetensors. An attacker who controls the model package can silently mutate these normalization fields, causing the victim's inference pipeline to produce adversarially shifted pixel_values and different (potentially flipped) predictions, while model.safetensors and config.json remain byte-identical and show no anomaly.

This is not a .safetensors parser bug. It is a package-level issue: the SafeTensors-based HuggingFace Transformers image model package format lacks integrity binding between the preprocessing config sidecar and the model weight file.


Affected Product

  • Package: huggingface/transformers (SafeTensors-based image model package)
  • Load path: AutoImageProcessor.from_pretrained() → ViTImageProcessor.preprocess()
  • Root file: preprocessor_config.json
  • Root fields: image_mean, image_std, rescale_factor
  • Weight file: model.safetensors (unchanged; byte-identical in clean and mutant packages)

Vulnerability Details

A user typically loads a SafeTensors-based Transformers image model package via:

processor = AutoImageProcessor.from_pretrained("model_dir")
model     = AutoModelForImageClassification.from_pretrained("model_dir")

The ViTImageProcessor reads image_mean, image_std, and rescale_factor directly from preprocessor_config.json at load time. These values are used to compute pixel_values:

pixel_values = (raw_pixel * rescale_factor - image_mean) / image_std

There is no cryptographic or structural binding between preprocessor_config.json and model.safetensors. An attacker who controls the package can mutate preprocessor_config.json, a plain JSON file, without touching the model weights at all.

Mutated field in this PoC:

  • Clean: image_mean = [0.5, 0.5, 0.5]
  • Mutant: image_mean = [-0.5, -0.5, -0.5]

This single field change shifts every pixel_values element by (0.5 - (-0.5)) / image_std, which is the +2.0 per channel per pixel measured in the PoC (consistent with image_std = 0.5). The model then produces shifted logits and a flipped prediction, with no modification to model.safetensors.
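The arithmetic behind the shift can be checked with a few lines of plain Python. This is a minimal sketch, not the report's reproduce script: only the two image_mean values come from the PoC, while rescale_factor = 1/255, image_std = 0.5, and the sample pixel values are assumptions chosen for illustration.

```python
# Worked example of the normalization formula above, in plain Python.
# ASSUMPTIONS (not stated in the report): rescale_factor = 1/255 and
# image_std = 0.5. Only the two image_mean values come from the PoC.

def normalize(raw_pixel, rescale_factor, image_mean, image_std):
    """pixel_value = (raw_pixel * rescale_factor - image_mean) / image_std"""
    return (raw_pixel * rescale_factor - image_mean) / image_std

RESCALE = 1 / 255    # assumed
STD = 0.5            # assumed
CLEAN_MEAN = 0.5     # from the PoC
MUTANT_MEAN = -0.5   # from the PoC

raw_pixels = [0, 64, 128, 255]  # arbitrary 8-bit sample values
clean = [normalize(p, RESCALE, CLEAN_MEAN, STD) for p in raw_pixels]
mutant = [normalize(p, RESCALE, MUTANT_MEAN, STD) for p in raw_pixels]
deltas = [m - c for c, m in zip(clean, mutant)]

for p, c, m, d in zip(raw_pixels, clean, mutant, deltas):
    print(f"raw={p:3d}  clean={c:+.4f}  mutant={m:+.4f}  delta={d:+.4f}")
```

Under these assumptions the delta is the same for every input pixel, because the mean enters the formula as a constant offset: (CLEAN_MEAN - MUTANT_MEAN) / STD.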


Impact

  • Prediction manipulation: Model outputs flip (e.g., dog → cat) while weights are unchanged. A victim cannot detect this by inspecting model.safetensors.
  • Silent attack surface: model.safetensors and config.json are byte-identical between clean and mutant packages. The only changed file is preprocessor_config.json.
  • No warning generated: AutoImageProcessor.from_pretrained() loads the mutated values without any integrity error.
  • Scope: Any SafeTensors-based HuggingFace Transformers image model package where the consumer uses AutoImageProcessor.from_pretrained() and preprocessor_config.json is under the attacker's control (e.g., malicious model on HuggingFace Hub, compromised local model directory).

Proof of Concept

Package structure

clean_model/
  config.json            ← byte-identical to mutant
  model.safetensors      ← byte-identical to mutant (SHA256: e9bf24263551...)
  preprocessor_config.json  ← image_mean = [0.5, 0.5, 0.5]

mutant_model/
  config.json            ← byte-identical to clean
  model.safetensors      ← byte-identical to clean (SHA256: e9bf24263551...)
  preprocessor_config.json  ← image_mean = [-0.5, -0.5, -0.5]  ← ONLY CHANGE
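A file-level hash comparison is enough to confirm which files differ between two such directories. The following is a hedged sketch of that idea using only the standard library; it is not the contents of inspect_transformers_image_processor_hash_matrix.py, and the helper names sha256_file and diff_packages are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def diff_packages(dir_a, dir_b):
    """Map each differing filename to (hash_in_a, hash_in_b).

    A hash of None means the file is missing on that side.
    Only top-level regular files are compared.
    """
    a, b = Path(dir_a), Path(dir_b)
    names = {p.name for p in a.iterdir() if p.is_file()}
    names |= {p.name for p in b.iterdir() if p.is_file()}
    changed = {}
    for name in sorted(names):
        hash_a = sha256_file(a / name) if (a / name).is_file() else None
        hash_b = sha256_file(b / name) if (b / name).is_file() else None
        if hash_a != hash_b:
            changed[name] = (hash_a, hash_b)
    return changed
```

For the layout above, diff_packages("clean_model", "mutant_model") would be expected to report preprocessor_config.json as the only differing file.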

Run the reproduce script

pip install torch transformers safetensors Pillow numpy
python reproduce_transformers_image_processor_preprocessing_flip.py

Expected final output:

TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_FLIP_CONFIRMED

Run the inspect script

python inspect_transformers_image_processor_hash_matrix.py

Expected final output:

TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_HASH_MATRIX_PASS

Runtime Evidence

All values from T0 execution (14/14 assertions PASS):

| Metric | Value |
| --- | --- |
| config.json SHA256 (clean == mutant) | 0eba781a04d141af... |
| model.safetensors SHA256 (clean == mutant) | e9bf24263551064e... |
| preprocessor_config.json SHA256 clean | 7016f6ba6ab8... |
| preprocessor_config.json SHA256 mutant | ebc69b98226f... |
| image_mean clean | [0.5, 0.5, 0.5] |
| image_mean mutant | [-0.5, -0.5, -0.5] |
| pixel_values clean mean | 0.017302 |
| pixel_values mutant mean | 2.017302 |
| \|delta\| mean | 2.000000 |
| \|delta\| max | 2.000000 |
| logits clean | [0.0475, 0.0573] |
| logits mutant | [0.0502, 0.0363] |
| prediction clean | 1 (dog) |
| prediction mutant | 0 (cat) |
| Prediction flip | dog → cat (zero weight change) |
| Model params | 5,666 (ViTForImageClassification, seed=1) |

Load path used:

AutoImageProcessor.from_pretrained()
  → ViTImageProcessor.__init__()
  → reads image_mean / image_std / rescale_factor from preprocessor_config.json

AutoModelForImageClassification.from_pretrained()
  → loads model.safetensors

model(pixel_values=inputs["pixel_values"])
  → model.forward()
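Because the three normalization fields are read from the JSON sidecar, a field-level comparison of two sidecars shows the change directly, without loading any model. The sketch below illustrates this with inline JSON; the image_std and rescale_factor values are assumptions for illustration (only image_mean is given in the report), and diff_config_fields is a hypothetical helper name.

```python
import json

# Fields the report identifies as read from preprocessor_config.json.
NORMALIZATION_FIELDS = ("image_mean", "image_std", "rescale_factor")


def diff_config_fields(clean_text, mutant_text, fields=NORMALIZATION_FIELDS):
    """Return {field: (clean_value, mutant_value)} for fields that differ."""
    clean = json.loads(clean_text)
    mutant = json.loads(mutant_text)
    return {
        name: (clean.get(name), mutant.get(name))
        for name in fields
        if clean.get(name) != mutant.get(name)
    }


# Stand-in sidecar contents. image_std and rescale_factor are ASSUMED values.
clean_json = json.dumps({
    "image_mean": [0.5, 0.5, 0.5],
    "image_std": [0.5, 0.5, 0.5],
    "rescale_factor": 1 / 255,
})
mutant_json = json.dumps({
    "image_mean": [-0.5, -0.5, -0.5],
    "image_std": [0.5, 0.5, 0.5],
    "rescale_factor": 1 / 255,
})

changed_fields = diff_config_fields(clean_json, mutant_json)
print(changed_fields)
```

A comparison like this only helps when a known-good copy of the sidecar is available to compare against.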

Route Framing

This finding targets the SafeTensors-based HuggingFace Transformers model package ecosystem. The vulnerability is not in the .safetensors binary parser itself, but in the package format's lack of integrity binding between:

  • model.safetensors: the weight authority (trusted, cryptographically stable)
  • preprocessor_config.json: the preprocessing authority (untrusted, no binding)

The attack surface exists specifically because the HuggingFace Transformers package format trusts preprocessor_config.json without any integrity link to the model.safetensors it accompanies.


Distinctness

| Prior Finding | Root | Verdict |
| --- | --- | --- |
| tokenizer.json vocabulary (NLP tokenization) | tokenizer.json | DISTINCT: different modality (CV vs NLP), different class, different computation |
| TFLite FlatBuffer NormalizationOptions | Binary FlatBuffer NormalizationOptions (C++ struct) | DISTINCT: different framework, format, runtime |
| Joblib vocabulary | pickle binary | DISTINCT: different format, domain |
| OpenVINO rt_info | XML embedded metadata | DISTINCT: different framework, format |
| TFJS quantization | TF.js quantization params | DISTINCT: different framework, semantic |

Non-Claims

The following claims are NOT made by this report:

  • This is not a .safetensors binary parser vulnerability
  • This is not an RCE / ACE / arbitrary code execution finding
  • This does not require a scanner bypass to be impactful
  • preprocessor_config.json is not claimed to be outside model state; it is runtime-consumed model package state

Recommendation

HuggingFace Transformers should consider one or more of the following mitigations:

  1. Package-level integrity manifest: Include a signed or hashed manifest that binds preprocessor_config.json to model.safetensors at save time and verifies the binding at load time.
  2. Validation of normalization ranges: Warn or reject preprocessor_config.json values that fall outside expected normalization ranges (e.g., |image_mean| > 1.0).
  3. Documentation: Clearly document that preprocessor_config.json is security-relevant package state and that consumers loading packages from untrusted sources should verify all sidecar files.
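The first two mitigations can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not an existing Transformers API: build_manifest, verify_manifest, and check_normalization are hypothetical names, the manifest is a plain filename-to-SHA256 mapping rather than a signed format, and the [0.0, 1.0] range for image_mean is an assumption that only makes sense when pixels are rescaled to [0, 1] before normalization.

```python
import hashlib
from pathlib import Path

# Files covered by the manifest; names taken from the package layout in the report.
PACKAGE_FILES = ("config.json", "model.safetensors", "preprocessor_config.json")


def sha256_file(path):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir, files=PACKAGE_FILES):
    """At save time: record one SHA256 per package file."""
    package = Path(package_dir)
    return {name: sha256_file(package / name) for name in files}


def verify_manifest(package_dir, trusted_manifest):
    """At load time: return the names of files that are missing or changed."""
    package = Path(package_dir)
    mismatched = []
    for name, expected in sorted(trusted_manifest.items()):
        path = package / name
        if not path.is_file() or sha256_file(path) != expected:
            mismatched.append(name)
    return mismatched


def check_normalization(config, low=0.0, high=1.0):
    """Return human-readable problems for out-of-range normalization values.

    ASSUMPTION: image_mean lies in [low, high] and image_std is positive.
    """
    problems = []
    for value in config.get("image_mean", []):
        if not low <= value <= high:
            problems.append(f"image_mean value {value} outside [{low}, {high}]")
    for value in config.get("image_std", []):
        if value <= 0:
            problems.append(f"image_std value {value} is not positive")
    return problems
```

Two design points are worth noting. The trusted manifest has to reach the consumer through a channel the attacker does not control (or be signed), since a manifest stored unsigned inside the package can be rewritten along with the sidecar. Also, the example threshold |image_mean| > 1.0 would not flag the PoC's mutant value of -0.5, which is why this sketch uses a lower bound of 0.0 instead.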

References

  • reproduce_transformers_image_processor_preprocessing_flip.py: full reproduction script
  • inspect_transformers_image_processor_hash_matrix.py: hash matrix inspection
  • evidence_runtime_results.json: T0 runtime evidence
  • evidence_hash_matrix.json: SHA256 isolation proof
  • evidence_distinctness_matrix.json: distinctness analysis
  • evidence_route_framing.json: route framing statements
  • evidence_top_axis.json: top axis details and attack narrative