Leo PRO

leideng

https://leideng.github.io/

leideng
lei-deng-0537564b

AI & ML interests

Efficient AI, Sparse Attention

Recent Activity

liked a model about 7 hours ago

google/umt5-xxl

liked a model 1 day ago

facebook/cwm

updated a collection 2 days ago

SFT

Organizations

None yet

leideng's collections (7)

Wan: Open and Advanced Large-Scale Video Generative Models

LFM2 Technical Report

Sequence to Sequence Learning with Neural Networks

Language Models are Few-Shot Learners

s1: Simple test-time scaling

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Proximal Policy Optimization Algorithms

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

A Formal Perspective on Byte-Pair Encoding

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Training language models to follow instructions with human feedback

LIMA: Less Is More for Alignment

Preserving Diversity in Supervised Fine-Tuning of Large Language Models
