ModHiFi Pruned ResNet-50 (Small)

Model Description

This model is a structurally pruned version of the standard ResNet-50 architecture. Developed by the Machine Learning Lab at the Indian Institute of Science, it has been compressed to remove ~30% of the parameters while achieving higher accuracy than the base model.

Unlike unstructured pruning (which zeros out individual weights but leaves tensor shapes unchanged), structural pruning physically removes entire channels and filters. The result is a model that is natively smaller and faster, with fewer FLOPs on standard hardware, without requiring specialized sparse inference engines.
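To make the distinction concrete, here is a minimal NumPy sketch (an illustration only, not the selection criterion from our paper) contrasting the two approaches on a hypothetical convolution weight tensor. The 30% ratio and the L1-norm filter score are assumptions for the example:

```python
import numpy as np

# Hypothetical conv weight: (out_channels, in_channels, kH, kW)
w = np.random.randn(64, 32, 3, 3)

# Unstructured pruning: zero the smallest-magnitude weights; shape is unchanged
thresh = np.quantile(np.abs(w), 0.30)
w_unstructured = np.where(np.abs(w) < thresh, 0.0, w)  # still (64, 32, 3, 3)

# Structural pruning: score whole filters (here by L1 norm) and drop the
# lowest-scoring 30%; the tensor is physically smaller afterwards
l1_per_filter = np.abs(w).sum(axis=(1, 2, 3))      # one score per output channel
keep = np.sort(np.argsort(l1_per_filter)[int(0.30 * 64):])
w_structured = w[keep]                             # shape (45, 32, 3, 3)
```

Only the structurally pruned tensor reduces memory and compute on dense hardware; the zeroed tensor still multiplies through every weight.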

  • Developed by: Machine Learning Lab, Indian Institute of Science
  • Model type: Convolutional Neural Network (Pruned ResNet)
  • License: GNU General Public License v3.0
  • Base Model: Microsoft ResNet-50

Performance & Efficiency

Model Variant       Sparsity  Top-1 Acc  Top-5 Acc  Params (M)  FLOPs (G)  Size (MB)
Original ResNet-50  0%        76.13%     92.86%     25.56       4.12       ~98
ModHiFi-Small       ~32%      76.70%     93.32%     17.4        1.9        ~66

On the hardware detailed in our paper, we observe inference speedups of 1.69x on CPU and 1.70x on GPU.
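These speedups are wall-clock measurements. A generic latency harness along the following lines (a sketch, not our exact benchmarking code) can produce comparable per-call timings for any callable; the toy workloads stand in for the dense and pruned models:

```python
import time

def mean_latency(fn, warmup=3, iters=50):
    """Average wall-clock seconds per call, after a short warmup."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# Toy stand-ins: the "pruned" workload does roughly half the work
def dense():
    return sum(i * i for i in range(20_000))

def pruned():
    return sum(i * i for i in range(10_000))

speedup = mean_latency(dense) / mean_latency(pruned)
```

For real models, remember to disable gradient tracking and, on GPU, to synchronize the device before reading the clock.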

Note: "FLOPs" measures the number of floating-point operations required for a single inference pass. Lower is better for latency and battery life.
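For reference, the relative reductions implied by the table above can be computed directly:

```python
# Figures from the performance table above
params = {"base": 25.56, "pruned": 17.4}   # millions of parameters
flops  = {"base": 4.12,  "pruned": 1.9}    # GFLOPs per forward pass

param_cut = 1 - params["pruned"] / params["base"]
flop_cut  = 1 - flops["pruned"] / flops["base"]

print(f"Params reduced by {param_cut:.1%}, FLOPs by {flop_cut:.1%}")
# → Params reduced by 31.9%, FLOPs by 53.9%
```

Note that the FLOPs reduction exceeds the parameter reduction: removing a channel saves compute in both the layer that produces it and the layer that consumes it.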

⚠️ Critical Note on Preprocessing & Accuracy

Please Read Before Evaluating: This model was trained and evaluated using standard PyTorch torchvision.transforms. The Hugging Face pipeline uses PIL (Pillow) for image resizing by default.

Due to subtle differences in interpolation (Bilinear vs. Bicubic) and anti-aliasing between PyTorch's C++ kernels and PIL, you may observe a ~0.5% - 1.0% drop in Top-1 accuracy if you use the default preprocessor_config.json.

To reproduce the exact numbers listed in the table above, we recommend wrapping the pipeline with the exact PyTorch transforms used during training:

from torchvision import transforms
from transformers import pipeline
import torch

# 1. Define the Exact PyTorch Transform
val_transform = transforms.Compose([
    transforms.Resize(256),       # Resize shortest edge to 256
    transforms.CenterCrop(224),   # Center crop 224x224
    transforms.ToTensor(),        # Convert to Tensor (0-1)
    transforms.Normalize(         # ImageNet Normalization
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ),
])

# 2. Define a Wrapper to force Pipeline to use PyTorch
class PyTorchProcessor:
    def __init__(self, transform):
        self.transform = transform
        self.image_processor_type = "custom"

    def __call__(self, images, **kwargs):
        if not isinstance(images, list):
            images = [images]
        # Apply the torchvision transforms and batch the results
        pixel_values = torch.stack(
            [self.transform(img.convert("RGB")) for img in images]
        )
        return {"pixel_values": pixel_values}

# 3. Initialize Pipeline with Custom Processor
pipe = pipeline(
    "image-classification", 
    model="MLLabIISc/ModHiFi-ResNet50-ImageNet-Small", 
    image_processor=PyTorchProcessor(val_transform), # <--- Fixes the accuracy gap
    trust_remote_code=True,
    device=0 # First GPU; use device=-1 to run on CPU
)

Quick Start

If you do not require bit-perfect reproduction of the original accuracy and prefer simplicity, you can use the model directly with the standard Hugging Face pipeline.

Install dependencies

pip install torch transformers pillow requests

Inference example

import requests
from PIL import Image
from transformers import pipeline

# Load model (ensure trust_remote_code=True for custom architecture)
pipe = pipeline(
    "image-classification", 
    model="MLLabIISc/ModHiFi-ResNet50-ImageNet-Small", 
    trust_remote_code=True
)

# Load an image
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Run Inference
results = pipe(image)
print(f"Predicted Class: {results[0]['label']}")
print(f"Confidence: {results[0]['score']:.4f}")
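Under the hood, each score is a softmax probability computed from the model's raw logits. A minimal sketch of that post-processing step, using made-up logits rather than real model output:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four classes
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax(logits)
top = max(range(len(probs)), key=probs.__getitem__)  # index of the top class
```

The pipeline performs the equivalent computation and maps the top indices back to human-readable labels via the model's id2label config.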

Citation

If you use this model in your research, please cite the following paper:

@inproceedings{kashyap2026modhifi,
      title = {ModHiFi: Identifying High Fidelity predictive components for Model Modification}, 
      author = {Kashyap, Dhruva and Murti, Chaitanya and Nayak, Pranav and Narshana, Tanay and Bhattacharyya, Chiranjib},
      booktitle = {Advances in Neural Information Processing Systems},
      year = {2025},
      eprint = {2511.19566},
      archivePrefix = {arXiv},
      primaryClass = {cs.LG},
      url = {https://arxiv.org/abs/2511.19566}, 
}