	# HuggingFace Transformers ImageProcessor Preprocessing Authority Gap

	## Summary

	When loading a SafeTensors-based HuggingFace Transformers image model package,
	Transformers trusts `preprocessor_config.json` for all image normalization
	parameters (`image_mean`, `image_std`, `rescale_factor`) consumed by
	`ViTImageProcessor.preprocess()`, with no integrity binding to
	`model.safetensors`. An attacker who controls the model package can silently
	mutate these normalization fields, so that the victim's inference pipeline
	produces shifted `pixel_values` and different, potentially flipped,
	predictions, while `model.safetensors` and `config.json` remain
	byte-identical and show no anomaly.

	This is not a `.safetensors` parser bug. The issue lies in the
	SafeTensors-based HuggingFace Transformers image model package format, which
	lacks any integrity binding between the preprocessing config sidecar and the
	model weight file.

	---

	## Affected Product

	- Package: `huggingface/transformers` (SafeTensors-based image model package)
	- Load path: `AutoImageProcessor.from_pretrained()` → `ViTImageProcessor.preprocess()`
	- Root file: `preprocessor_config.json`
	- Root fields: `image_mean`, `image_std`, `rescale_factor`
	- Weight file: `model.safetensors` (unchanged — byte-identical in clean and mutant packages)

	---

	## Vulnerability Details

	A user loads a SafeTensors-based Transformers image model package via:

	```python
	from transformers import AutoImageProcessor, AutoModelForImageClassification

	processor = AutoImageProcessor.from_pretrained("model_dir")
	model = AutoModelForImageClassification.from_pretrained("model_dir")
	```

	The `ViTImageProcessor` reads `image_mean`, `image_std`, and `rescale_factor`
	directly from `preprocessor_config.json` at load time. These values are used
	to compute `pixel_values`:

	```
	pixel_values = (raw_pixel * rescale_factor - image_mean) / image_std
	```

	There is no cryptographic or structural binding between `preprocessor_config.json`
	and `model.safetensors`. An attacker who controls the package can mutate
	`preprocessor_config.json` — a plain JSON file — without touching the model
	weights at all.

	Mutated field in this PoC:
	- Clean: `image_mean = [0.5, 0.5, 0.5]`
	- Mutant: `image_mean = [-0.5, -0.5, -0.5]`

	This single field change shifts every element of `pixel_values` by +2.0: the
	mean moves by 1.0 and the result is divided by `image_std`, which is consistent
	with `image_std = 0.5`. The shifted input changes the logits and flips the
	prediction, with no modification to `model.safetensors`.
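
The size of the shift follows directly from the normalization formula. The sketch below works through it in pure Python for a single RGB pixel. It assumes `image_std = [0.5, 0.5, 0.5]` and `rescale_factor = 1/255`; neither value is stated in this report, and `normalize` is a hypothetical helper, not a Transformers function.

```python
def normalize(raw_pixel, rescale_factor, image_mean, image_std):
    """Apply the per-channel normalization formula to one RGB pixel."""
    return [
        (value * rescale_factor - mean) / std
        for value, mean, std in zip(raw_pixel, image_mean, image_std)
    ]

# Assumed values: not read from the PoC's preprocessor_config.json.
RESCALE_FACTOR = 1 / 255
IMAGE_STD = [0.5, 0.5, 0.5]

raw_pixel = [0, 128, 255]  # one arbitrary RGB pixel

clean = normalize(raw_pixel, RESCALE_FACTOR, [0.5, 0.5, 0.5], IMAGE_STD)
mutant = normalize(raw_pixel, RESCALE_FACTOR, [-0.5, -0.5, -0.5], IMAGE_STD)

# The shift equals (mean_clean - mean_mutant) / image_std and does not
# depend on the pixel value itself.
deltas = [m - c for m, c in zip(mutant, clean)]
print(deltas)
```

Under these assumptions each channel's delta comes out at approximately 2.0 regardless of the input pixel, which matches the constant `|delta|` mean and max reported under Runtime Evidence.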

	---

	## Impact

	- Prediction manipulation: Model outputs flip (e.g., dog → cat) while weights
	are unchanged. A victim cannot detect this by inspecting `model.safetensors`.
	- Silent attack surface: `model.safetensors` and `config.json` are
	byte-identical between clean and mutant packages. The only changed file is
	`preprocessor_config.json`.
	- No warning generated: `AutoImageProcessor.from_pretrained()` loads the
	mutated values without any integrity error.
	- Scope: Any SafeTensors-based HuggingFace Transformers image model package
	where the consumer uses `AutoImageProcessor.from_pretrained()` and
	`preprocessor_config.json` is under the attacker's control (e.g., malicious
	model on HuggingFace Hub, compromised local model directory).

	---

	## Proof of Concept

	### Package structure

	```
	clean_model/
	    config.json               ← byte-identical to mutant
	    model.safetensors         ← byte-identical to mutant (SHA256: e9bf24263551...)
	    preprocessor_config.json  ← image_mean = [0.5, 0.5, 0.5]

	mutant_model/
	    config.json               ← byte-identical to clean
	    model.safetensors         ← byte-identical to clean (SHA256: e9bf24263551...)
	    preprocessor_config.json  ← image_mean = [-0.5, -0.5, -0.5] ← ONLY CHANGE
	```
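
A consumer can check which files differ between two such packages by hashing every file on both sides. The following is a minimal sketch using only the standard library; it is not the contents of the inspect script referenced in this report, and `sha256_of` and `changed_files` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def changed_files(clean_dir: str, mutant_dir: str) -> list:
    """List file names whose content differs between the two packages."""
    clean, mutant = Path(clean_dir), Path(mutant_dir)
    names = sorted(
        {p.name for p in clean.iterdir() if p.is_file()}
        | {p.name for p in mutant.iterdir() if p.is_file()}
    )
    differing = []
    for name in names:
        a, b = clean / name, mutant / name
        # A file missing on either side also counts as a difference.
        if not (a.is_file() and b.is_file()) or sha256_of(a) != sha256_of(b):
            differing.append(name)
    return differing
```

For packages laid out as above, `changed_files("clean_model", "mutant_model")` should report only `preprocessor_config.json`.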

	### Run the reproduce script

	```bash
	pip install torch transformers safetensors Pillow numpy
	python reproduce_transformers_image_processor_preprocessing_flip.py
	```

	Expected final output:
	```
	TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_FLIP_CONFIRMED
	```

	### Run the inspect script

	```bash
	python inspect_transformers_image_processor_hash_matrix.py
	```

	Expected final output:
	```
	TRANSFORMERS_IMAGE_PROCESSOR_PREPROCESSING_HASH_MATRIX_PASS
	```

	---

	## Runtime Evidence

	All values from T0 execution (14/14 assertions PASS):

	| Metric | Value |
	|--------|-------|
	| `config.json` SHA256 (clean == mutant) | `0eba781a04d141af...` |
	| `model.safetensors` SHA256 (clean == mutant) | `e9bf24263551064e...` |
	| `preprocessor_config.json` SHA256 clean | `7016f6ba6ab8...` |
	| `preprocessor_config.json` SHA256 mutant | `ebc69b98226f...` |
	| `image_mean` clean | `[0.5, 0.5, 0.5]` |
	| `image_mean` mutant | `[-0.5, -0.5, -0.5]` |
	| `pixel_values` clean mean | `0.017302` |
	| `pixel_values` mutant mean | `2.017302` |
	| `\|delta\|` mean | `2.000000` |
	| `\|delta\|` max | `2.000000` |
	| logits clean | `[0.0475, 0.0573]` |
	| logits mutant | `[0.0502, 0.0363]` |
	| prediction clean | `1 (dog)` |
	| prediction mutant | `0 (cat)` |
	| Prediction flip | dog → cat (zero weight change) |
	| Model params | 5,666 (ViTForImageClassification, seed=1) |
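
The predictions in the table are the argmax of the reported logits. A short check using the table's values; the `ID2LABEL` mapping is inferred from the `1 (dog)` and `0 (cat)` entries, and `predict` is a hypothetical helper:

```python
ID2LABEL = {0: "cat", 1: "dog"}  # inferred from the prediction rows above

def predict(logits):
    """Return (class index, label) for the largest logit."""
    index = max(range(len(logits)), key=logits.__getitem__)
    return index, ID2LABEL[index]

clean_logits = [0.0475, 0.0573]
mutant_logits = [0.0502, 0.0363]

print(predict(clean_logits))   # (1, 'dog')
print(predict(mutant_logits))  # (0, 'cat')
```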

	Load path used:
	```
	AutoImageProcessor.from_pretrained()
	→ ViTImageProcessor.__init__()
	→ reads image_mean / image_std / rescale_factor from preprocessor_config.json

	AutoModelForImageClassification.from_pretrained()
	→ loads model.safetensors

	model(pixel_values=inputs["pixel_values"])
	→ model.forward()
	```

	---

	## Route Framing

	This finding targets the **SafeTensors-based HuggingFace Transformers model
	package** ecosystem. The vulnerability is not in the `.safetensors` binary
	parser itself, but in the package format's lack of integrity binding between:

	- `model.safetensors`: the weight authority (trusted; its hash is identical in the clean and mutant packages)
	- `preprocessor_config.json`: the preprocessing authority (unverified, with no binding to the weights)

	The attack surface exists specifically because the HuggingFace Transformers
	package format trusts `preprocessor_config.json` without any integrity link to
	the `model.safetensors` it accompanies.
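
One possible shape for such an integrity link is a manifest that records a digest for every file in the package at save time and is checked at load time. The sketch below illustrates the idea only; Transformers provides no such mechanism, and `integrity_manifest.json`, `write_manifest`, and `verify_manifest` are hypothetical names.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "integrity_manifest.json"  # hypothetical file name

def _sha256(path: Path) -> str:
    # Reads the whole file into memory; large weight files would need chunking.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def write_manifest(package_dir: str) -> dict:
    """Record a SHA-256 digest for every file in the package directory."""
    root = Path(package_dir)
    digests = {
        p.name: _sha256(p)
        for p in sorted(root.iterdir())
        if p.is_file() and p.name != MANIFEST_NAME
    }
    (root / MANIFEST_NAME).write_text(json.dumps(digests, indent=2))
    return digests

def verify_manifest(package_dir: str) -> list:
    """Return names of files that are missing or no longer match the manifest."""
    root = Path(package_dir)
    expected = json.loads((root / MANIFEST_NAME).read_text())
    return [
        name
        for name, digest in expected.items()
        if not (root / name).is_file() or _sha256(root / name) != digest
    ]
```

An unsigned manifest stored inside the package only helps if the manifest itself is trusted. An attacker who controls the package can rewrite it too, so in practice it would need to be signed or pinned out of band.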

	---

	## Distinctness

	| Prior Finding | Root | Verdict |
	|---------------|------|---------|
	| tokenizer.json vocabulary (NLP tokenization) | `tokenizer.json` | DISTINCT: different modality (CV vs NLP), different class, different computation |
	| TFLite FlatBuffer NormalizationOptions | Binary FlatBuffer `NormalizationOptions` (C++ struct) | DISTINCT: different framework, format, runtime |
	| Joblib vocabulary | pickle binary | DISTINCT: different format, domain |
	| OpenVINO rt_info | XML embedded metadata | DISTINCT: different framework, format |
	| TFJS quantization | TF.js quantization params | DISTINCT: different framework, semantic |

	---

	## Non-Claims

	The following claims are NOT made by this report:

	- This is not a `.safetensors` binary parser vulnerability
	- This is not an RCE / ACE / arbitrary code execution finding
	- This does not require a scanner bypass to be impactful
	- `preprocessor_config.json` is not claimed to be outside model state —
	it is runtime-consumed model package state

	---

	## Recommendation

	HuggingFace Transformers should consider one or more of the following mitigations:

	1. Package-level integrity manifest: Include a signed or hashed manifest
	that binds `preprocessor_config.json` to `model.safetensors` at save time
	and verifies the binding at load time.
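	2. Validation of normalization ranges: Warn or reject `preprocessor_config.json`
	values that fall outside expected normalization ranges (e.g., `image_mean`
	outside `[0, 1]`). A magnitude-only check such as `|image_mean| > 1.0` would
	not flag the `-0.5` used in this PoC.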
	3. Documentation: Clearly document that `preprocessor_config.json` is
	security-relevant package state and that consumers loading packages from
	untrusted sources should verify all sidecar files.
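
The range validation in item 2 could look like the sketch below. The bounds are illustrative assumptions that presume pixel values are rescaled to `[0, 1]` before normalization; they are not limits defined by Transformers, and `check_normalization` is a hypothetical helper.

```python
import json
from pathlib import Path

def _as_list(value):
    """Accept either a single number or a per-channel list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]

def check_normalization(config_path: str) -> list:
    """Return human-readable warnings for suspicious normalization values."""
    config = json.loads(Path(config_path).read_text())
    warnings = []
    for value in _as_list(config.get("image_mean")):
        # Assumed bound: with inputs rescaled to [0, 1], a mean lies in [0, 1].
        if not 0.0 <= value <= 1.0:
            warnings.append(f"image_mean value {value} is outside [0, 1]")
    for value in _as_list(config.get("image_std")):
        # Assumed bound: a standard deviation must be positive and at most 1.
        if not 0.0 < value <= 1.0:
            warnings.append(f"image_std value {value} is outside (0, 1]")
    factor = config.get("rescale_factor")
    if factor is not None and not 0.0 < factor <= 1.0:
        warnings.append(f"rescale_factor {factor} is outside (0, 1]")
    return warnings
```

A check like this would flag the PoC's `image_mean = [-0.5, -0.5, -0.5]`, but it cannot catch a mutation that stays inside the accepted range, so it complements an integrity manifest and does not replace one.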

	---

	## References

	- `reproduce_transformers_image_processor_preprocessing_flip.py` — full reproduction script
	- `inspect_transformers_image_processor_hash_matrix.py` — hash matrix inspection
	- `evidence_runtime_results.json` — T0 runtime evidence
	- `evidence_hash_matrix.json` — SHA256 isolation proof
	- `evidence_distinctness_matrix.json` — distinctness analysis
	- `evidence_route_framing.json` — route framing statements
	- `evidence_top_axis.json` — top axis details and attack narrative