Alper PRO

alperiox

·

AI & ML interests

LLMs and diffusion models.

Recent Activity

updated a dataset 5 days ago

tiny-aya-translate/tr-hi-mimi-encoded

updated a model 22 days ago

alperiox/tinyaya-s2s-stage2

published a model 27 days ago

alperiox/tinyaya-s2s-stage2

View all activity

Organizations

upvoted a collection over 1 year ago

Orpheus TTS

TTS Towards Human-Sounding Speech • 2 items • Updated Mar 18, 2025 • 78

upvoted a paper almost 2 years ago

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 151

upvoted a collection about 2 years ago

Searching for Better ViT Baselines

Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 36 items • Updated Jan 28 • 20

upvoted 12 papers over 2 years ago

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Paper • 2312.17120 • Published Dec 28, 2023 • 28

Learning Vision from Models Rivals Learning Vision from Data

Paper • 2312.17742 • Published Dec 28, 2023 • 16

PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation

Paper • 2312.17276 • Published Dec 27, 2023 • 16

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Paper • 2312.17681 • Published Dec 29, 2023 • 19

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

Paper • 2401.01827 • Published Jan 3, 2024 • 17

SIGNeRF: Scene Integrated Generation for Neural Radiance Fields

Paper • 2401.01647 • Published Jan 3, 2024 • 13

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 28

Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

Paper • 2401.01117 • Published Jan 2, 2024 • 10

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM

Paper • 2401.01256 • Published Jan 2, 2024 • 22

LLaMA Beyond English: An Empirical Study on Language Capability Transfer

Paper • 2401.01055 • Published Jan 2, 2024 • 54

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 192

TrailBlazer: Trajectory Control for Diffusion-Based Video Generation

Paper • 2401.00896 • Published Dec 31, 2023 • 15