Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF Image-Text-to-Text • 9B • Updated 3 days ago • 55.1k • 144
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Paper • 2603.21872 • Published 2 days ago • 32
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 2 days ago • 100
llmfan46/Qwen3.5-40B-Claude-4.5-Opus-High-Reasoning-Thinking-uncensored-heretic Image-Text-to-Text • 40B • Updated 10 days ago • 362 • 2
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published 13 days ago • 90
Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation Paper • 2603.12247 • Published 13 days ago • 23
DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning Paper • 2603.12257 • Published 13 days ago • 31
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 19 days ago • 115
HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising Paper • 2603.08703 • Published 16 days ago • 31
CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing Paper • 2603.08589 • Published 16 days ago • 38
Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling Paper • 2602.09084 • Published Feb 9 • 27
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published Feb 9 • 39