EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration
Abstract
EverAnimate tackles long-horizon animated video generation with persistent latent propagation and restorative flow matching, which together maintain visual quality and character identity.
We propose EverAnimate, an efficient post-training method for long-horizon animated video generation that preserves visual quality and character identity. Long-form animation remains challenging because highly dynamic human motion must be synthesized against relatively static environments, making chunk-based generation prone to accumulated drift: (i) low-level quality drift, such as progressive degradation of static backgrounds, and (ii) high-level semantic drift, such as inconsistent character identity and view-dependent attributes. To counter this drift, EverAnimate restores drifted flow trajectories by anchoring generation to a persistent latent context memory, using two complementary mechanisms. (i) Persistent Latent Propagation maintains a context memory across chunks to propagate identity and motion in latent space while mitigating temporal forgetting. (ii) Restorative Flow Matching introduces an implicit restoration objective during sampling through velocity adjustment, improving within-chunk fidelity. With only lightweight LoRA tuning, EverAnimate outperforms state-of-the-art long-animation methods in both short- and long-horizon settings: at 10 seconds, it improves PSNR/SSIM by 8%/7% and reduces LPIPS/FID by 22%/11%; at 90 seconds, the gains increase to 15%/15% for PSNR/SSIM and 32%/27% for LPIPS/FID.
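The abstract does not give the update rules, so the following is only a toy sketch of the two ideas it names, not the authors' implementation. It assumes a straight-line flow integrated with Euler steps, a made-up "model" that aims each chunk at the previous chunk plus a small error, and a hypothetical velocity adjustment that blends the base velocity with a pull toward a fixed memory latent. The names `generate_chunks`, `restore_weight`, and `drift_scale` are illustrative assumptions.

```python
import math
import random


def euler_sample(x_init, velocity_fn, steps=20):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x_init)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def generate_chunks(num_chunks, anchor, restore_weight, drift_scale=0.05, seed=0):
    """Return the distance from `anchor` after each chunk of a toy rollout."""
    rng = random.Random(seed)
    memory = list(anchor)  # persistent context: written once, never overwritten
    prev = list(anchor)    # last chunk's latent, which conditions the next chunk
    errors = []
    for _ in range(num_chunks):
        # Toy "model": aims at the previous chunk plus a small error (drift source).
        target = [p + drift_scale * rng.gauss(0.0, 1.0) for p in prev]

        def velocity(x, t, target=target):
            remaining = max(1.0 - t, 1e-6)
            base = [(g - xi) / remaining for g, xi in zip(target, x)]
            restore = [(m - xi) / remaining for m, xi in zip(memory, x)]
            # Hypothetical adjustment: blend base velocity with a pull toward memory.
            return [(1.0 - restore_weight) * b + restore_weight * r
                    for b, r in zip(base, restore)]

        noise = [rng.gauss(0.0, 1.0) for _ in anchor]
        prev = euler_sample(noise, velocity)
        errors.append(math.dist(prev, anchor))
    return errors


if __name__ == "__main__":
    anchor = [0.0] * 64
    free = generate_chunks(30, anchor, restore_weight=0.0)
    anchored = generate_chunks(30, anchor, restore_weight=0.3)
    print(f"chunk 30 drift without anchoring: {free[-1]:.3f}")
    print(f"chunk 30 drift with anchoring:    {anchored[-1]:.3f}")
```

In this toy setting, `restore_weight=0.0` lets the per-chunk errors accumulate like a random walk, while a positive weight keeps the distance from the anchor bounded. This mirrors, in spirit only, the claim that anchoring to a persistent memory limits accumulated drift.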
Community
EverAnimate code, 480p LoRA checkpoints, minimal demo data, and training/inference scripts are now released: https://huggingface.co/epfl-vita/everanimate
Resources:
- Code: https://github.com/vita-epfl/EverAnimate
- Project page: https://everanimate.github.io/homepage/
- Model and demo data: https://huggingface.co/epfl-vita/everanimate