CubePart: An Open-Vocabulary Part-Controllable 3D Generator Paper • 2605.28763 • Published 3 days ago • 11
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published 3 days ago • 65
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 3 days ago • 231
EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM Paper • 2312.06660 • Published Dec 11, 2023 • 2
InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation Paper • 2509.24663 • Published Sep 29, 2025 • 18
Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling Paper • 2604.23586 • Published Apr 26 • 6
Lance MLX Collection Feature-complete MLX port of ByteDance Lance: t2i, image_edit, x2t_image, t2v, video_edit, x2t_video. • 4 items • Updated 8 days ago • 4
Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild Paper • 2605.22064 • Published 9 days ago • 5
Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 12 days ago • 75
DeepVQE: Real Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation Paper • 2306.03177 • Published Jun 5, 2023 • 1
ERNIE-Image Collection A series of image generation models, including text2img and img2img. • 4 items • Updated 9 days ago • 24
One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution Paper • 2511.17138 • Published Nov 21, 2025 • 2
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale Paper • 2504.16030 • Published Apr 22, 2025 • 38