MTP-LM Collection Models to accompany the research paper on training multi-token prediction language models using self-distillation. • 6 items • Updated 4 days ago • 3
SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Paper • 2602.21818 • Published 4 days ago • 46 • 7
Video Foundation Models Collection A list of all the (usable) video generation diffusion models. Models that are not up to current standards are skipped. • 11 items • Updated 3 days ago • 2
JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation Paper • 2602.19163 • Published 7 days ago • 13
video-SALMONN 2 Collection video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions. • 11 items • Updated 6 days ago • 1