# Learning to Refocus with Video Diffusion Models

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Use device_map="mps" instead of "cuda" on Apple devices.
# device_map already places the pipeline on the device, so no separate .to() call is needed.
pipe = DiffusionPipeline.from_pretrained(
    "tedlasai/learn2refocus", dtype=torch.bfloat16, device_map="cuda"
)

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

# The pipeline returns the generated focal stack as a list of video frames.
output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
```
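Besides exporting the whole focal stack as a video, individual frames can be saved as refocused stills, since `output` above is a list of PIL images. A minimal sketch, using synthetic PIL frames in place of real pipeline output (the frame size and count here are fabricated for illustration):

```python
from PIL import Image

# Stand-in for `output` from the pipeline: a focal stack as a list of PIL frames.
# Real frames would come from pipe(...).frames[0]; here we fabricate 5 gray frames.
frames = [Image.new("RGB", (64, 64), color=(i * 40, i * 40, i * 40)) for i in range(5)]

# Pick a focal plane by index and save that frame as a refocused still.
focus_index = 2
frames[focus_index].save("refocused_frame.png")
print(len(frames), frames[focus_index].size)
```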
This repository contains the model weights for the paper Learning to Refocus with Video Diffusion Models.
Project Page | GitHub Repository
## Summary
Focus is a cornerstone of photography, yet autofocus systems often fail to capture the intended subject, and users frequently wish to adjust focus after capture. This work introduces a novel method for realistic post-capture refocusing using video diffusion models. From a single defocused image, the approach generates a perceptually accurate focal stack, represented as a video sequence, enabling interactive refocusing and unlocking a range of downstream applications.
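One downstream application a generated focal stack enables is software autofocus: scan the stack and pick the frame where the scene (or a region of interest) is sharpest. A minimal sketch of such frame selection using variance-of-Laplacian sharpness on synthetic NumPy frames (illustrative only; the blur simulation and frame shapes are assumptions, not the paper's method — real frames would come from the model):

```python
import numpy as np

def sharpness(img: np.ndarray) -> float:
    """Variance of a 4-neighbor Laplacian; higher means more in-focus detail."""
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())

def blur(img: np.ndarray, times: int) -> np.ndarray:
    """Repeated 3x3 box blur to simulate increasing defocus."""
    n = img.shape[0]
    for _ in range(times):
        padded = np.pad(img, 1, mode="edge")
        img = sum(padded[dy:dy + n, dx:dx + n]
                  for dy in range(3) for dx in range(3)) / 9.0
    return img

rng = np.random.default_rng(0)
scene = rng.standard_normal((32, 32))  # high-frequency "scene" detail

# Synthetic focal stack: the middle frame (index 2) is least blurred.
stack = [blur(scene, t) for t in (4, 2, 0, 2, 4)]
best = max(range(len(stack)), key=lambda i: sharpness(stack[i]))
print(best)  # → 2, the sharpest focal plane
```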
## Usage

For detailed environment setup, training, and testing instructions, please refer to the official GitHub repository. The model uses fine-tuned Stable Video Diffusion (SVD) weights.
## Citation

If you use our dataset, code, or model in your research, please cite the following paper:

```bibtex
@inproceedings{Tedla2025Refocus,
  title={{Learning to Refocus with Video Diffusion Models}},
  author={Tedla, SaiKiran and Zhang, Zhoutong and Zhang, Xuaner and Xin, Shumian},
  booktitle={Proceedings of the ACM SIGGRAPH Asia Conference},
  year={2025}
}
```