ProtoMorph-DINO

Feedback-Gated Prototype Morphing for Hard-Case Image Classification

ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.

This model card is for the Hugging Face repository:

shiowo/DINO-Protomorph

This repository currently contains an initial research scaffold and a custom ProtoMorph head checkpoint. Evaluation results are pending because the repository was published before full training and benchmarking.

This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.


Architecture

Image
↓
Frozen DINOv3
↓
Patch map z0
↓
ProtoMorph block 1
↓
Layer Memory Attention
↓
ProtoMorph block 2
↓
Layer Memory Attention
↓
Main logits
↓
Hard-case gate
    β”œβ”€β”€ easy: return main logits
    └── hard:
          feedback from top-2 probabilities
          modulate DINO patch map
          run Delta-RBF hard expert
          fuse logits

Model Summary

ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.

For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.

The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.
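The feedback branch described above can be sketched as follows. This is an illustrative assumption, not the repository's actual implementation: `feedback_refine`, `hard_expert`, `fuse_alpha`, and the confidence-based modulation rule are all hypothetical choices.

```python
import torch

# Hypothetical sketch of the feedback branch: the top-2 probabilities from
# the main classifier modulate the frozen DINO patch map, a hard expert
# re-scores the modulated map, and the two logit sets are fused.
def feedback_refine(patch_map, main_logits, hard_expert, fuse_alpha=0.5):
    probs = torch.softmax(main_logits, dim=-1)
    top2 = probs.topk(2, dim=-1).values          # (batch, 2) top-2 probabilities
    # One simple modulation choice: scale patches by the top-2 margin.
    scale = 1.0 + (top2[:, 0] - top2[:, 1]).view(-1, 1, 1)
    refined_logits = hard_expert(patch_map * scale)
    # Fuse the refined logits with the main logits.
    return fuse_alpha * refined_logits + (1.0 - fuse_alpha) * main_logits
```

Here `hard_expert` stands in for the Delta-RBF expert; any module mapping a patch map to class logits fits the same interface.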


Current Status

Status: research scaffold / pre-training setup

Unless a later release states otherwise, assume the current checkpoint is randomly initialized or intended only for smoke testing.

Predictions are not meaningful until the ProtoMorph head is trained on a real dataset.


Results

Evaluation results: Pending

No benchmark results are reported yet because the repository is being prepared before training and evaluation.

Metric                        Value
Accuracy                      Pending
F1                            Pending
Precision                     Pending
Recall                        Pending
Confusion-pair improvement    Pending
Hard-case routing benefit     Pending

Recommended future baselines:

Baseline                 Purpose
DINOv3 + Linear Probe    Minimal frozen-backbone baseline
DINOv3 + MLP Head        Strong simple head baseline
CLIP + Linear Probe      Popular vision-language comparison
ConvNeXt                 Strong CNN-style baseline
ViT                      Standard transformer baseline

Intended Use

This model is intended for:

  • image classification research
  • hard-example routing experiments
  • prototype learning experiments
  • frozen-backbone classifier research
  • fine-grained classification experiments
  • educational computer vision experiments

This model is not intended for safety-critical use.

Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.


Model Files

Recommended repository layout:

.
β”œβ”€β”€ README.md
β”œβ”€β”€ LICENSE-WEIGHTS.md
β”œβ”€β”€ config.json
β”œβ”€β”€ labels.txt
β”œβ”€β”€ checkpoints/
β”‚   β”œβ”€β”€ config.json
β”‚   β”œβ”€β”€ labels.txt
β”‚   └── protomorph_head.safetensors
β”œβ”€β”€ infer.py
β”œβ”€β”€ scripts/
β”‚   └── upload_to_hf.py
└── src/
    └── protomorph/

The main weight file is:

checkpoints/protomorph_head.safetensors

This file contains only the custom ProtoMorph classification head.

DINOv3 backbone weights are not included in this repository.


Backbone

Default backbone:

facebook/dinov3-vits16-pretrain-lvd1689m

The backbone is used as a frozen visual feature extractor.

For RTX 3090-class GPUs, ViT-S/16 is a practical starting point because it keeps VRAM usage manageable while still producing useful patch embeddings.
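As a quick sanity check on shapes, ViT-S/16 at the 512 px image size used in the Config Example yields a 32×32 patch grid:

```python
# Patch-grid arithmetic for ViT-S/16 with the values from this repo's
# Config Example (image_size 512, patch_size 16, embed_dim 384).
image_size, patch_size, embed_dim = 512, 16, 384
grid = image_size // patch_size        # 32 patches per side
num_patch_tokens = grid * grid         # 1024 patch tokens per image
# Each image therefore produces a (1024, 384) patch map z0 before the
# ProtoMorph head (plus any class/register tokens the backbone emits).
```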


Installation

Recommended environment:

Python 3.11
PyTorch 2.4.0
CUDA 12.4 PyTorch wheel

Install PyTorch:

pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124

Install dependencies:

pip install -r requirements-core.txt

RunPod Environment Variables

This project supports the RunPod environment variable names shown below:

hf_key=hf_your_huggingface_write_token_here
hf_repo=shiowo/DINO-Protomorph

Standard Hugging Face names are also supported:

HF_TOKEN=hf_your_huggingface_write_token_here
HF_REPO_ID=shiowo/DINO-Protomorph

Never commit your real Hugging Face token to the repository.
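A small helper can accept either naming scheme. This is a sketch of an assumed convention; the actual lookup order in scripts/upload_to_hf.py may differ.

```python
import os

# Prefer the standard Hugging Face variable names, falling back to the
# RunPod-style lowercase names (assumed precedence, not verified against
# scripts/upload_to_hf.py).
def resolve_hf_credentials():
    token = os.environ.get("HF_TOKEN") or os.environ.get("hf_key")
    repo = os.environ.get("HF_REPO_ID") or os.environ.get("hf_repo")
    return token, repo
```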


Inference

Run inference from the command line:

python infer.py \
  --image examples/sample_image.jpg \
  --config checkpoints/config.json \
  --checkpoint checkpoints/protomorph_head.safetensors \
  --labels checkpoints/labels.txt \
  --topk 5

For smoke testing only:

python infer.py --image examples/sample_image.jpg --allow-random-head

If the head is untrained, the output is only useful for checking that the pipeline runs.


Upload to Hugging Face from RunPod

After setting hf_key and hf_repo in RunPod, run:

cd /workspace/protomorph_dinov3_runpod
source .venv/bin/activate
python scripts/upload_to_hf.py

Or use the helper script:

bash runpod/upload_to_hf.sh

Dry run before upload:

python scripts/upload_to_hf.py --dry-run

Config Example

{
  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "proto_count": 64,
  "memory_tokens": 16,
  "rbf_count": 128,
  "num_heads": 8,
  "dropout": 0.0,
  "hard_pmax_threshold": 0.65,
  "hard_margin_threshold": 0.15,
  "hard_entropy_threshold": 1.35,
  "image_size": 512,
  "use_bf16_autocast": true,
  "normalize_patch_tokens": true
}
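The three hard_* thresholds above drive the hard-case gate. Below is a minimal sketch of one plausible routing rule; how the thresholds actually combine in the repository is an assumption here.

```python
import math

# Assumed gate: a sample is "hard" if the top probability is low, the
# top-2 margin is small, or the predictive entropy is high. Defaults match
# the Config Example thresholds.
def is_hard(probs, pmax_thresh=0.65, margin_thresh=0.15, entropy_thresh=1.35):
    ranked = sorted(probs, reverse=True)
    pmax = ranked[0]
    margin = ranked[0] - ranked[1]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return pmax < pmax_thresh or margin < margin_thresh or entropy > entropy_thresh
```

A confident distribution such as [0.9, 0.05, 0.05] passes all three checks and is routed to the main logits; an ambiguous one such as [0.4, 0.35, 0.25] fails the pmax check and triggers the feedback branch.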

Limitations

Known limitations:

  • The architecture is experimental.
  • Evaluation results are pending.
  • The hard-case gate requires threshold tuning.
  • The Delta-RBF hard expert may overfit small datasets.
  • Inference may be slower for hard samples.
  • The model should be compared against simple baselines before claiming improvement.
  • This repository does not include DINOv3 weights.
  • The custom head may not generalize outside the dataset it was trained on.

License

The ProtoMorph head weights in this repository are released under:

Creative Commons Attribution-ShareAlike 4.0 International
CC BY-SA 4.0

You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.

This license applies only to the ProtoMorph head weights and related files released in this repository.

It does not apply to:

  • DINOv3
  • PyTorch
  • Hugging Face Transformers
  • third-party datasets
  • third-party model weights
  • upstream dependencies

DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.


Attribution

If you use this model or build on it, please credit:

ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
Author: shiowo
Repository: https://huggingface.co/shiowo/DINO-Protomorph

BibTeX:

@software{protomorph_dino_2026,
  title = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
  author = {shiowo},
  year = {2026},
  url = {https://huggingface.co/shiowo/DINO-Protomorph}
}

Disclaimer

This is a research prototype.

The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.
