# EdgeCrafter: Compact ViTs for Edge Dense Prediction
EdgeCrafter is a unified compact ViT framework for dense prediction tasks on edge devices. This repository contains the ECDet-S model, an object detection architecture built on a distilled compact backbone with an edge-friendly encoder-decoder design.
- Paper: EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation
- Project Page: https://intellindust-ai-lab.github.io/projects/EdgeCrafter/
- Repository: https://github.com/Intellindust-AI-Lab/EdgeCrafter
## Model Description
EdgeCrafter bridges the accuracy-efficiency gap between compact Vision Transformers (ViTs) and CNN-based architectures (such as YOLO) on resource-constrained devices. By combining task-specialized distillation with edge-aware architectural design, ECDet achieves high accuracy with a small parameter budget: ECDet-S, for instance, reaches 51.7 AP on the COCO dataset with fewer than 10M parameters.
## COCO2017 Validation Results (Object Detection)
| Model | Input Size | AP 50:95 | #Params (M) | GFLOPs | Latency (ms) |
|---|---|---|---|---|---|
| ECDet-S | 640 | 51.7 | 10 | 26 | 5.41 |
| ECDet-M | 640 | 54.3 | 18 | 53 | 7.98 |
| ECDet-L | 640 | 57.0 | 31 | 101 | 10.49 |
| ECDet-X | 640 | 57.9 | 49 | 151 | 12.70 |
Note: Latency is measured on an NVIDIA T4 GPU with batch size 1 under FP16 precision using TensorRT (v10.6).
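As a quick derived view of the table above, batch-1 latency converts to throughput as 1000 / latency (ms), and AP per GFLOP gives a rough accuracy-per-compute ratio. The sketch below uses only the numbers reported in the table:

```python
# Reported COCO val numbers from the table above:
# model -> (AP 50:95, params in M, GFLOPs, T4 latency in ms)
models = {
    "ECDet-S": (51.7, 10, 26, 5.41),
    "ECDet-M": (54.3, 18, 53, 7.98),
    "ECDet-L": (57.0, 31, 101, 10.49),
    "ECDet-X": (57.9, 49, 151, 12.70),
}

for name, (ap, params_m, gflops, latency_ms) in models.items():
    fps = 1000.0 / latency_ms      # batch-1 throughput implied by the latency
    ap_per_gflop = ap / gflops     # crude efficiency ratio
    print(f"{name}: {fps:.1f} img/s, {ap_per_gflop:.2f} AP/GFLOP")
# ECDet-S works out to about 184.8 img/s at batch size 1.
```

These derived figures assume the reported latency is the full per-image time; they are not additional benchmark results.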
## Installation
```shell
# Create conda environment
conda create -n ec python=3.11 -y
conda activate ec

# Install dependencies
pip install -r requirements.txt
```
## Quick Start (Inference)
You can run inference on a sample image using the provided scripts:
```shell
# 1. Download the pre-trained model (if not already present)
# 2. Run PyTorch inference
#    Replace `path/to/your/image.jpg` with an actual image path
python tools/inference/torch_inf.py -c configs/ecdet/ecdet_s.yml -r ecdet_s.pth -i path/to/your/image.jpg
```
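If you prepare images outside of `torch_inf.py`, detectors with a fixed 640 input typically letterbox the image: scale it to fit 640x640 while preserving aspect ratio, then pad the remainder. The geometry of that step can be sketched as below. This is an assumption about the pipeline, not EdgeCrafter's verified preprocessing; the actual transforms are defined by `configs/ecdet/ecdet_s.yml`.

```python
def letterbox_params(h, w, target=640):
    """Compute resize and padding for an aspect-preserving letterbox to target x target.

    Returns (new_h, new_w, pad_top, pad_left). Hypothetical helper for
    illustration; the real transform chain comes from the model config.
    """
    scale = min(target / h, target / w)          # fit the longer side to `target`
    new_h, new_w = round(h * scale), round(w * scale)
    pad_top = (target - new_h) // 2              # center the image vertically
    pad_left = (target - new_w) // 2             # center the image horizontally
    return new_h, new_w, pad_top, pad_left

print(letterbox_params(480, 640))    # VGA frame    -> (480, 640, 80, 0)
print(letterbox_params(1080, 1920))  # full-HD frame -> (360, 640, 140, 0)
```

Detections predicted in the 640x640 frame are mapped back to the original image by subtracting the padding and dividing by the scale.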
## Citation
If you find EdgeCrafter useful in your research, please consider citing:
```bibtex
@article{liu2026edgecrafter,
  title={EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation},
  author={Liu, Longfei and Hou, Yongjie and Li, Yang and Wang, Qirui and Sha, Youyang and Yu, Yongjun and Wang, Yinzhi and Ru, Peizhe and Yu, Xuanlong and Shen, Xi},
  journal={arXiv},
  year={2026}
}
```
This model has been pushed to the Hub using the PytorchModelHubMixin integration.