ProtoMorph-DINO

Feedback-Gated Prototype Morphing for Hard-Case Image Classification

ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.

This model card is for the Hugging Face repository:

shiowo/DINO-Protomorph

This repository currently contains an initial research scaffold and a custom ProtoMorph head checkpoint. Evaluation results are pending because the repository was published before full training and benchmarking.

This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.


Architecture

Image
↓
Frozen DINOv3
↓
Patch map z0
↓
ProtoMorph block 1
↓
Layer Memory Attention
↓
ProtoMorph block 2
↓
Layer Memory Attention
↓
Main logits
↓
Hard-case gate
    β”œβ”€β”€ easy: return main logits
    └── hard:
          feedback from top-2 probabilities
          modulate DINO patch map
          run Delta-RBF hard expert
          fuse logits

Model Summary

ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.

For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.

The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.
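The feedback branch described above can be sketched as follows. This is an illustrative assumption, not the repository's actual implementation: `feedback_refine`, `hard_expert`, `fuse_alpha`, and the confidence-based modulation rule are all hypothetical choices.

```python
import torch

# Hypothetical sketch of the feedback branch: the top-2 probabilities from
# the main classifier modulate the frozen DINO patch map, a hard expert
# re-scores the modulated map, and the two logit sets are fused.
def feedback_refine(patch_map, main_logits, hard_expert, fuse_alpha=0.5):
    probs = torch.softmax(main_logits, dim=-1)
    top2 = probs.topk(2, dim=-1).values          # (batch, 2) top-2 probabilities
    # One simple modulation choice: scale patches by the top-2 margin.
    scale = 1.0 + (top2[:, 0] - top2[:, 1]).view(-1, 1, 1)
    refined_logits = hard_expert(patch_map * scale)
    # Fuse the refined logits with the main logits.
    return fuse_alpha * refined_logits + (1.0 - fuse_alpha) * main_logits
```

Here `hard_expert` stands in for the Delta-RBF expert; any module mapping a patch map to class logits fits the same interface.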


Current Status

Status: research scaffold / pre-training setup

Unless a later release states otherwise, assume the current checkpoint is randomly initialized or intended only for smoke testing.

Predictions are not meaningful until the ProtoMorph head is trained on a real dataset.


Results

Evaluation results: Pending

No benchmark results are reported yet because the repository is being prepared before training and evaluation.

Metric                        Value
Accuracy                      Pending
F1                            Pending
Precision                     Pending
Recall                        Pending
Confusion-pair improvement    Pending
Hard-case routing benefit     Pending

Recommended future baselines:

Baseline                 Purpose
DINOv3 + Linear Probe    Minimal frozen-backbone baseline
DINOv3 + MLP Head        Strong simple head baseline
CLIP + Linear Probe      Popular vision-language comparison
ConvNeXt                 Strong CNN-style baseline
ViT                      Standard transformer baseline

Intended Use

This model is intended for:

  • image classification research
  • hard-example routing experiments
  • prototype learning experiments
  • frozen-backbone classifier research
  • fine-grained classification experiments
  • educational computer vision experiments

This model is not intended for safety-critical use.

Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.


Model Files

Recommended repository layout:

.
β”œβ”€β”€ README.md
β”œβ”€β”€ LICENSE-WEIGHTS.md
β”œβ”€β”€ config.json
β”œβ”€β”€ labels.txt
β”œβ”€β”€ checkpoints/
β”‚   β”œβ”€β”€ config.json
β”‚   β”œβ”€β”€ labels.txt
β”‚   └── protomorph_head.safetensors
β”œβ”€β”€ infer.py
β”œβ”€β”€ scripts/
β”‚   └── upload_to_hf.py
└── src/
    └── protomorph/

The main weight file is:

checkpoints/protomorph_head.safetensors

This file contains only the custom ProtoMorph classification head.

DINOv3 backbone weights are not included in this repository.


Backbone

Default backbone:

facebook/dinov3-vits16-pretrain-lvd1689m

The backbone is used as a frozen visual feature extractor.

For RTX 3090-class GPUs, ViT-S/16 is a practical starting point because it keeps VRAM usage manageable while still producing useful patch embeddings.
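As a quick sanity check on shapes, ViT-S/16 at the 512 px image size used in the Config Example yields a 32×32 patch grid:

```python
# Patch-grid arithmetic for ViT-S/16 with the values from this repo's
# Config Example (image_size 512, patch_size 16, embed_dim 384).
image_size, patch_size, embed_dim = 512, 16, 384
grid = image_size // patch_size        # 32 patches per side
num_patch_tokens = grid * grid         # 1024 patch tokens per image
# Each image therefore produces a (1024, 384) patch map z0 before the
# ProtoMorph head (plus any class/register tokens the backbone emits).
```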


Installation

Recommended environment:

Python 3.11
PyTorch 2.4.0
CUDA 12.4 PyTorch wheel

Install PyTorch:

pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124

Install dependencies:

pip install -r requirements-core.txt

RunPod Environment Variables

This project supports the RunPod environment variable names shown below:

hf_key=hf_your_huggingface_write_token_here
hf_repo=shiowo/DINO-Protomorph

Standard Hugging Face names are also supported:

HF_TOKEN=hf_your_huggingface_write_token_here
HF_REPO_ID=shiowo/DINO-Protomorph

Never commit your real Hugging Face token to the repository.
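A small helper can accept either naming scheme. This is a sketch of an assumed convention; the actual lookup order in scripts/upload_to_hf.py may differ.

```python
import os

# Prefer the standard Hugging Face variable names, falling back to the
# RunPod-style lowercase names (assumed precedence, not verified against
# scripts/upload_to_hf.py).
def resolve_hf_credentials():
    token = os.environ.get("HF_TOKEN") or os.environ.get("hf_key")
    repo = os.environ.get("HF_REPO_ID") or os.environ.get("hf_repo")
    return token, repo
```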


Inference

Run inference from the command line:

python infer.py \
  --image examples/sample_image.jpg \
  --config checkpoints/config.json \
  --checkpoint checkpoints/protomorph_head.safetensors \
  --labels checkpoints/labels.txt \
  --topk 5

For smoke testing only:

python infer.py --image examples/sample_image.jpg --allow-random-head

If the head is untrained, the output is only useful for checking that the pipeline runs.


Upload to Hugging Face from RunPod

After setting hf_key and hf_repo in RunPod, run:

cd /workspace/protomorph_dinov3_runpod
source .venv/bin/activate
python scripts/upload_to_hf.py

Or use the helper script:

bash runpod/upload_to_hf.sh

Dry run before upload:

python scripts/upload_to_hf.py --dry-run

Config Example

{
  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "proto_count": 64,
  "memory_tokens": 16,
  "rbf_count": 128,
  "num_heads": 8,
  "dropout": 0.0,
  "hard_pmax_threshold": 0.65,
  "hard_margin_threshold": 0.15,
  "hard_entropy_threshold": 1.35,
  "image_size": 512,
  "use_bf16_autocast": true,
  "normalize_patch_tokens": true
}
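The three hard_* thresholds above drive the hard-case gate. Below is a minimal sketch of one plausible routing rule; how the thresholds actually combine in the repository is an assumption here.

```python
import math

# Assumed gate: a sample is "hard" if the top probability is low, the
# top-2 margin is small, or the predictive entropy is high. Defaults match
# the Config Example thresholds.
def is_hard(probs, pmax_thresh=0.65, margin_thresh=0.15, entropy_thresh=1.35):
    ranked = sorted(probs, reverse=True)
    pmax = ranked[0]
    margin = ranked[0] - ranked[1]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return pmax < pmax_thresh or margin < margin_thresh or entropy > entropy_thresh
```

A confident distribution such as [0.9, 0.05, 0.05] passes all three checks and is routed to the main logits; an ambiguous one such as [0.4, 0.35, 0.25] fails the pmax check and triggers the feedback branch.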

Limitations

Known limitations:

  • The architecture is experimental.
  • Evaluation results are pending.
  • The hard-case gate requires threshold tuning.
  • The Delta-RBF hard expert may overfit small datasets.
  • Inference may be slower for hard samples.
  • The model should be compared against simple baselines before claiming improvement.
  • This repository does not include DINOv3 weights.
  • The custom head may not generalize outside the dataset it was trained on.

License

The ProtoMorph head weights in this repository are released under:

Creative Commons Attribution-ShareAlike 4.0 International
CC BY-SA 4.0

You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.

This license applies only to the ProtoMorph head weights and related files released in this repository.

It does not apply to:

  • DINOv3
  • PyTorch
  • Hugging Face Transformers
  • third-party datasets
  • third-party model weights
  • upstream dependencies

DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.


Attribution

If you use this model or build on it, please credit:

ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
Author: shiowo
Repository: https://huggingface.co/shiowo/DINO-Protomorph

BibTeX:

@software{protomorph_dino_2026,
  title = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
  author = {shiowo},
  year = {2026},
  url = {https://huggingface.co/shiowo/DINO-Protomorph}
}

Disclaimer

This is a research prototype.

The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.
