
Koi

KOIIIII

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

liked a model 16 days ago

Qwen/Qwen3-30B-A3B-Instruct-2507

liked a model about 1 month ago

dstx123/xtrainer-leisaac


Organizations

None yet

upvoted a paper 1 day ago

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published 3 days ago • 40

upvoted a paper 4 months ago

PIPPA: A Partially Synthetic Conversational Dataset

Paper • 2308.05884 • Published Aug 11, 2023 • 34

upvoted a collection 5 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 107

upvoted an article 9 months ago

Article

Mixture of Experts Explained

Dec 11, 2023 • 1.11k

upvoted an article 11 months ago

Article

The N Implementation Details of RLHF with PPO

Oct 24, 2023

upvoted 3 papers 11 months ago

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 18

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 142

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59
