Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 151
Searching for Better ViT Baselines Collection Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 36 items • Updated Jan 28 • 20
Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math Paper • 2312.17120 • Published Dec 28, 2023 • 28
Learning Vision from Models Rivals Learning Vision from Data Paper • 2312.17742 • Published Dec 28, 2023 • 16
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation Paper • 2312.17276 • Published Dec 27, 2023 • 16
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis Paper • 2312.17681 • Published Dec 29, 2023 • 19
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions Paper • 2401.01827 • Published Jan 3, 2024 • 17
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields Paper • 2401.01647 • Published Jan 3, 2024 • 13
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations Paper • 2401.01885 • Published Jan 3, 2024 • 28
Q-Refine: A Perceptual Quality Refiner for AI-Generated Image Paper • 2401.01117 • Published Jan 2, 2024 • 10
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM Paper • 2401.01256 • Published Jan 2, 2024 • 22
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2, 2024 • 54
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 191
TrailBlazer: Trajectory Control for Diffusion-Based Video Generation Paper • 2401.00896 • Published Dec 31, 2023 • 15