🔄 In a Training Loop

Michał Wiliński

MWilinski

https://michal-wilinski.com

AI & ML interests

Machine Learning, Reinforcement Learning

Recent Activity

updated a model 21 days ago

MWilinski/qwen2.5-3b-sft-irl

updated a model 24 days ago

MWilinski/qwen2.5-3b-gail

liked a Space 28 days ago

gemma-challenge/gemma-interactions-view

Organizations

Papers 3

arxiv:2505.13291

arxiv:2502.06037

arxiv:2409.13530

spaces 3

models 10

MWilinski/qwen2.5-3b-gail-pirate

Updated May 5

MWilinski/qwen2.5-3b-sft-pirate

Updated May 5

MWilinski/qwen2.5-3b-gail-frozen

Updated Apr 18

MWilinski/qwen2.5-3b-gail-unfrozen

Updated Apr 18

MWilinski/qwen2.5-3b-dpo-irl

Updated Apr 16

MWilinski/qwen2.5-3b-sft-irl

Updated Apr 16

MWilinski/qwen2.5-3b-gail

Updated Mar 27

MWilinski/dro-v-qwen3-1.7b-paperlike

Updated Mar 13

MWilinski/dro-qwen3-1.7b-full-fixed-tau

Updated Feb 27

MWilinski/dro-qwen3-1.7b-full

Updated Feb 27

datasets 19

MWilinski/rlhf-irl-pirate-expert

Viewer • Updated Apr 30 • 6k • 11

MWilinski/rlhf-irl

Viewer • Updated Apr 15 • 14k • 67

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-oss-20b-diverse-openrouter

Viewer • Updated Mar 24 • 200 • 87

MWilinski/hh-rlhf-harmless-base-rollouts-gpt-oss-20b-diverse-openrouter

Viewer • Updated Mar 24 • 200 • 83

MWilinski/hh-rlhf-irl

Viewer • Updated Mar 23 • 10k • 22

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-5.1-policy

Viewer • Updated Mar 10 • 2k • 10

MWilinski/hh-rlhf-harmless-base-rollouts-gpt-5.1-policy

Viewer • Updated Mar 10 • 2k • 10

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-5.1-child

Viewer • Updated Mar 10 • 1.5k • 38

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-5.1-adult

Viewer • Updated Mar 10 • 1.5k • 18

MWilinski/hh-rlhf-harmless-base-rollouts-gpt-5.1-adult

Viewer • Updated Mar 10 • 1.5k • 11

spaces 3

Urban Autonomy Instance Segmentation

HF-Docs-QA

bit
