Wenkai Yang

Keven16


https://keven980716.github.io/

keven980716

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 1 month ago

Filter, Then Reweight: Rethinking Optimization Granularity in On-Policy Distillation

Paper • 2606.02684 • Published Jun 1 • 17

authored a paper about 2 months ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published Jun 3 • 26

upvoted a paper about 2 months ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published Jun 3 • 26

submitted a paper to Daily Papers about 2 months ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published Jun 3 • 26

New activity in Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step500 2 months ago

What is the data source used for training this model?

#1 opened 2 months ago by KouShi2

authored a paper 3 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 114

upvoted a paper 3 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 114

authored a paper 4 months ago

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Paper • 2603.14465 • Published Mar 15 • 23

updated a dataset 4 months ago

Keven16/OPSD-Example-Data

Viewer • Updated Mar 18 • 49.1k • 38

published a dataset 4 months ago

Keven16/OPSD-Example-Data

Viewer • Updated Mar 18 • 49.1k • 38

upvoted a paper 4 months ago

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Paper • 2603.14465 • Published Mar 15 • 23

updated a model 4 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Code-Step1200

4B • Updated Mar 16

published a model 4 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Code-Step1200

4B • Updated Mar 16

updated 2 models 4 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Code-Step300

4B • Updated Mar 16 • 210 • 1

Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step1200

4B • Updated Mar 16

published a model 4 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step1200

4B • Updated Mar 16

updated a model 4 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step500

4B • Updated Mar 16 • 507

published 2 models 4 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Code-Step300

4B • Updated Mar 16 • 210 • 1

Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step500

4B • Updated Mar 16 • 507

liked a dataset 4 months ago

LulaCola/AgentProcessBench

Viewer • Updated Mar 18 • 1k • 263 • 16
