Update model card with pipeline tag, license, and resources
Hi! I'm Niels from the Hugging Face community team.
I've updated the model card to improve its metadata and discoverability. This PR:
- Adds the `image-to-video` pipeline tag.
- Adds `diffusers` as the library name (based on the `config.json` evidence).
- Updates the license to `cc-by-nc-sa-4.0` to match the project's official documentation.
- Adds a link to the official GitHub repository.
- Includes a sample usage section based on the inference scripts.
Feel free to merge if this looks good!
README.md
---
license: cc-by-nc-sa-4.0
library_name: diffusers
pipeline_tag: image-to-video
tags:
- human-animation
- pose-guided
- DiT
---

# HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions

<a href="https://arxiv.org/abs/2505.22977"><img src='https://img.shields.io/badge/arXiv-2505.22977-red?style=flat&logo=arXiv&logoColor=red' alt='arxiv'></a>
<a href='https://vivocameraresearch.github.io/hypermotion/'><img src='https://img.shields.io/badge/Project-Page-pink?style=flat&logo=Google%20chrome&logoColor=pink'></a>
<a href="https://github.com/vivoCameraResearch/Hyper-Motion"><img src='https://img.shields.io/badge/Github-Code-blue?style=flat&logo=github&logoColor=white' alt='Github'></a>
<a href="https://creativecommons.org/licenses/by-nc-sa/4.0/"><img src='https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgreen?style=flat' alt='License'></a>

This repository contains the model weights for **HyperMotion**, presented in the paper [HyperMotionX: The Dataset and Benchmark with DiT-Based Pose-Guided Human Image Animation of Complex Motions](https://huggingface.co/papers/2505.22977).

## Introduction

Recent advances in diffusion models have significantly improved conditional video generation, particularly for the pose-guided human image animation task. Although existing methods can generate high-fidelity, temporally consistent animation sequences for regular motions and static scenes, they still show clear limitations on complex human body motions (hyper-motions) that involve highly dynamic, non-standard movements.

To address this challenge, we introduce the **Open-HyperMotionX Dataset** and **HyperMotionX Bench**, which provide high-quality human pose annotations and curated video clips for evaluating and improving pose-guided human image animation models under complex motion conditions. We also propose a simple yet powerful DiT-based video generation baseline that adopts [Wan2.1-I2V-14B](https://github.com/Wan-Video/Wan2.1) as the base model and introduces a spatial low-frequency enhanced RoPE.
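As background: rotary position embeddings (RoPE) rotate feature pairs by position-dependent angles, and the paper's contribution modifies the spatial frequency schedule of this scheme. The model's exact low-frequency enhanced variant lives in the official repo; the sketch below is only a generic, illustrative RoPE with all names made up here.

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply a standard rotary position embedding to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    # One frequency per feature pair; early pairs rotate fastest.
    freqs = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]  # interleaved feature pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin  # 2D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because each pair is only rotated, the embedding is norm-preserving and leaves position 0 unchanged; a "low-frequency enhanced" variant would reshape the `freqs` schedule.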

## Inference

To use the model, refer to the inference scripts provided in the official [GitHub repository](https://github.com/vivoCameraResearch/Hyper-Motion).

```python
import torch

# Config and model path
config_path = "config/wan2.1/wan_civitai.yaml"
model_name = "shuolin/HyperMotion"  # model checkpoints

# Use torch.float16 if your GPU does not support torch.bfloat16
weight_dtype = torch.bfloat16

# Inputs
control_video = "path/to/pose_video.mp4"  # guided pose video
ref_image = "path/to/image.jpg"           # reference image

# For the full pipeline, see scripts/inference.py in the official repo.
```
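The dtype comment above can be made automatic. This small helper is illustrative only (not from the HyperMotion repo), using PyTorch's own capability checks:

```python
import torch

def pick_dtype() -> torch.dtype:
    """Prefer bfloat16 when the GPU supports it, else float16; float32 on CPU."""
    if torch.cuda.is_available():
        return torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
    return torch.float32

weight_dtype = pick_dtype()
```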

## Citation

```bibtex
@article{xu2025hypermotion,
  title={HyperMotion: DiT-based pose-guided human image animation of complex motions},
  author={Xu, Shuolin and Zheng, Siming and Wang, Ziyi and Yu, HC and Chen, Jinwei and Zhang, Huaqi and Zhou, Daquan and Lee, Tong-Yee and Li, Bo and Jiang, Peng-Tao},
  journal={arXiv preprint arXiv:2505.22977},
  year={2025}
}
```