Yiding Shi

snoopd

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 2 months ago

DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes

Paper • 2605.28421 • Published May 27 • 48

updated a model about 1 year ago

snoopd/Reinforce-07212025

Reinforcement Learning • Updated Jul 21, 2025

published a model about 1 year ago

snoopd/Reinforce-07212025

Reinforcement Learning • Updated Jul 21, 2025

updated a model about 1 year ago

snoopd/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated Jul 14, 2025

published a model about 1 year ago

snoopd/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated Jul 14, 2025

updated a model about 1 year ago

snoopd/gymnasium-rl-v1

Reinforcement Learning • Updated Jul 11, 2025

published a model about 1 year ago

snoopd/gymnasium-rl-v1

Reinforcement Learning • Updated Jul 11, 2025

updated 2 models about 1 year ago

snoopd/distilbert-base-uncased-lora-text-classification

Updated Jul 7, 2025 • 1

snoopd/distilbert-base-uncased-lora-text-classification_test

Updated Jul 7, 2025

published 2 models about 1 year ago

snoopd/distilbert-base-uncased-lora-text-classification

Updated Jul 7, 2025 • 1

snoopd/distilbert-base-uncased-lora-text-classification_test

Updated Jul 7, 2025

liked a dataset over 1 year ago

fka/prompts.chat

Viewer • Updated about 3 hours ago • 2.06k • 32k • 9.77k

upvoted an article almost 2 years ago

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 419
