ProtoMorph-DINO
Feedback-Gated Prototype Morphing for Hard-Case Image Classification
ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.
This model card is for the Hugging Face repository:
shiowo/DINO-Protomorph
This repository currently contains an initial research scaffold and a custom ProtoMorph head checkpoint. Evaluation results are pending because the repository was created before full training and benchmarking.
This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.
Architecture
Image
  ↓
Frozen DINOv3
  ↓
Patch map z0
  ↓
ProtoMorph block 1
  ↓
Layer Memory Attention
  ↓
ProtoMorph block 2
  ↓
Layer Memory Attention
  ↓
Main logits
  ↓
Hard-case gate
  ├── easy: return main logits
  └── hard:
        feedback from top-2 probabilities
        modulate DINO patch map
        run Delta-RBF hard expert
        fuse logits
Model Summary
ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.
For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.
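The routing described above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the function names, the OR-combination of the three confidence signals, the fixed fusion weight, and the `hard_expert` callback (a stand-in for the patch-map modulation and Delta-RBF expert) are all assumptions made for illustration. The default thresholds are taken from the example config in this card.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_hard_case(probs, pmax_thr=0.65, margin_thr=0.15, entropy_thr=1.35):
    """Return True if any confidence signal marks the sample as hard.

    Assumption: the three thresholds are combined with OR, and entropy is
    measured in nats. The real head may combine them differently.
    """
    top = sorted(probs, reverse=True)
    pmax = top[0]
    margin = top[0] - top[1]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return pmax < pmax_thr or margin < margin_thr or entropy > entropy_thr


def route(main_logits, hard_expert, fuse_weight=0.5):
    """Easy samples return the main logits; hard samples return fused logits.

    `hard_expert` is a hypothetical callback that receives the top-2 class
    indices and their probabilities and returns refined logits.
    """
    probs = softmax(main_logits)
    if not is_hard_case(probs):
        return main_logits, "easy"
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    refined = hard_expert(top2, [probs[i] for i in top2])
    # Assumption: a fixed convex combination; a learned fusion is equally plausible.
    fused = [(1.0 - fuse_weight) * m + fuse_weight * r
             for m, r in zip(main_logits, refined)]
    return fused, "hard"
```

With this sketch, a confident prediction such as `[10.0, 0.0, 0.0]` takes the easy path, while a near-tie such as `[1.0, 0.9, 0.0]` is routed to the hard branch because its top probability falls below 0.65.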
The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.
Current Status
Status: research scaffold / pre-training setup
The current checkpoint may be randomly initialized or only intended for smoke testing unless a later release says otherwise.
Predictions are not meaningful until the ProtoMorph head is trained on a real dataset.
Results
Evaluation results: Pending
No benchmark results are reported yet because the repository is being prepared before training and evaluation.
| Metric | Value |
|---|---|
| Accuracy | Pending |
| F1 | Pending |
| Precision | Pending |
| Recall | Pending |
| Confusion-pair improvement | Pending |
| Hard-case routing benefit | Pending |
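Once predictions are available, the first four rows of the table can be filled in with a few lines of code. The sketch below is one possible way to do it and assumes macro averaging over classes, which this card does not specify; the function name is hypothetical. Confusion-pair improvement and hard-case routing benefit are not covered because their definitions are not given here.

```python
from collections import Counter


def classification_report(y_true, y_pred, num_classes):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Assumption: macro averaging (unweighted mean over classes).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }
```

Running the same function on the outputs of each baseline and of ProtoMorph-DINO keeps the comparison consistent.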
Recommended future baselines:
| Baseline | Purpose |
|---|---|
| DINOv3 + Linear Probe | Minimal frozen-backbone baseline |
| DINOv3 + MLP Head | Strong simple head baseline |
| CLIP + Linear Probe | Popular vision-language comparison |
| ConvNeXt | Strong CNN-style baseline |
| ViT | Standard transformer baseline |
Intended Use
This model is intended for:
- image classification research
- hard-example routing experiments
- prototype learning experiments
- frozen-backbone classifier research
- fine-grained classification experiments
- educational computer vision experiments
This model is not intended for safety-critical use.
Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.
Model Files
Recommended repository layout:
.
├── README.md
├── LICENSE-WEIGHTS.md
├── config.json
├── labels.txt
├── checkpoints/
│   ├── config.json
│   ├── labels.txt
│   └── protomorph_head.safetensors
├── infer.py
├── scripts/
│   └── upload_to_hf.py
└── src/
    └── protomorph/
The main weight file is:
checkpoints/protomorph_head.safetensors
This file contains only the custom ProtoMorph classification head.
DINOv3 backbone weights are not included in this repository.
Backbone
Default backbone:
facebook/dinov3-vits16-pretrain-lvd1689m
The backbone is used as a frozen visual feature extractor.
For RTX 3090-class GPUs (24 GB of VRAM), ViT-S/16 is a practical starting point because it keeps memory usage manageable while still producing useful patch embeddings.
Installation
Recommended environment:
Python 3.11
PyTorch 2.4.0
CUDA 12.4 PyTorch wheel
Install PyTorch:
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124
Install dependencies:
pip install -r requirements-core.txt
RunPod Environment Variables
This project supports the RunPod environment variable names shown below:
hf_key=hf_your_huggingface_write_token_here
hf_repo=shiowo/DINO-Protomorph
Standard Hugging Face names are also supported:
HF_TOKEN=hf_your_huggingface_write_token_here
HF_REPO_ID=shiowo/DINO-Protomorph
Never commit your real Hugging Face token to the repository.
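A script can accept both naming schemes by checking one set of names and falling back to the other. The sketch below is an illustration only: the function name is hypothetical, and giving the standard Hugging Face names precedence over the RunPod names is an assumption, not something this card states about `scripts/upload_to_hf.py`.

```python
import os


def resolve_hf_settings(env=None):
    """Return (token, repo_id) from either naming scheme.

    Assumption: HF_TOKEN / HF_REPO_ID win over hf_key / hf_repo when both
    are set. Pass a dict as `env` to test without touching os.environ.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN") or env.get("hf_key")
    repo_id = env.get("HF_REPO_ID") or env.get("hf_repo")
    if not token:
        raise RuntimeError("Set HF_TOKEN or hf_key to a Hugging Face write token.")
    if not repo_id:
        raise RuntimeError("Set HF_REPO_ID or hf_repo to the target repository.")
    return token, repo_id
```

Reading the token from the environment at run time, rather than from a file in the working tree, also makes it harder to commit it by accident.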
Inference
Run inference from the command line:
python infer.py \
  --image examples/sample_image.jpg \
  --config checkpoints/config.json \
  --checkpoint checkpoints/protomorph_head.safetensors \
  --labels checkpoints/labels.txt \
  --topk 5
For smoke testing only:
python infer.py --image examples/sample_image.jpg --allow-random-head
If the head is untrained, the output is only useful for checking that the pipeline runs.
Upload to Hugging Face from RunPod
After setting hf_key and hf_repo in RunPod, run:
cd /workspace/protomorph_dinov3_runpod
source .venv/bin/activate
python scripts/upload_to_hf.py
Or use the helper script:
bash runpod/upload_to_hf.sh
Dry run before upload:
python scripts/upload_to_hf.py --dry-run
Config Example
{
  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "proto_count": 64,
  "memory_tokens": 16,
  "rbf_count": 128,
  "num_heads": 8,
  "dropout": 0.0,
  "hard_pmax_threshold": 0.65,
  "hard_margin_threshold": 0.15,
  "hard_entropy_threshold": 1.35,
  "image_size": 512,
  "use_bf16_autocast": true,
  "normalize_patch_tokens": true
}
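A few quantities follow directly from these values and are worth checking before training. The sketch below is a hypothetical sanity check, not part of the repository. It assumes that the attention layers split `embed_dim` evenly across `num_heads` and that `hard_entropy_threshold` is measured in nats. The embedded JSON is a subset of the example config, copied verbatim.

```python
import json
import math

# Subset of the example config; values copied verbatim.
CONFIG_TEXT = """
{
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "num_heads": 8,
  "hard_entropy_threshold": 1.35,
  "image_size": 512
}
"""


def derive_shapes(cfg):
    """Check basic consistency constraints and return derived quantities."""
    if cfg["image_size"] % cfg["patch_size"] != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    if cfg["embed_dim"] % cfg["num_heads"] != 0:
        raise ValueError("embed_dim must be divisible by num_heads")

    grid = cfg["image_size"] // cfg["patch_size"]
    # Entropy of a uniform distribution over the classes, in nats.
    max_entropy = math.log(cfg["num_classes"])
    if not 0.0 < cfg["hard_entropy_threshold"] < max_entropy:
        raise ValueError("hard_entropy_threshold must lie in (0, ln(num_classes))")

    return {
        "patch_grid": grid,
        "num_patches": grid * grid,
        "head_dim": cfg["embed_dim"] // cfg["num_heads"],
        "max_entropy": max_entropy,
    }


shapes = derive_shapes(json.loads(CONFIG_TEXT))
print(shapes)
```

For the example values, a 512-pixel input with 16-pixel patches gives a 32 × 32 grid (1024 patch tokens), and 384 channels across 8 heads gives 48 channels per head.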
Limitations
Known limitations:
- The architecture is experimental.
- Evaluation results are pending.
- The hard-case gate requires threshold tuning.
- The Delta-RBF hard expert may overfit small datasets.
- Inference may be slower for hard samples, because they run the additional feedback branch and Delta-RBF hard expert.
- The model should be compared against simple baselines before claiming improvement.
- This repository does not include DINOv3 weights.
- The custom head may not generalize outside the dataset it was trained on.
License
The ProtoMorph head weights in this repository are released under:
Creative Commons Attribution-ShareAlike 4.0 International
CC BY-SA 4.0
You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.
This license applies only to the ProtoMorph head weights and related files released in this repository.
It does not apply to:
- DINOv3
- PyTorch
- Hugging Face Transformers
- third-party datasets
- third-party model weights
- upstream dependencies
DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.
Attribution
If you use this model or build on it, please credit:
ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
Author: shiowo
Repository: https://huggingface.co/shiowo/DINO-Protomorph
BibTeX:
@software{protomorph_dino_2026,
  title  = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
  author = {shiowo},
  year   = {2026},
  url    = {https://huggingface.co/shiowo/DINO-Protomorph}
}
Disclaimer
This is a research prototype.
The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.