Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Doohyuk Jang
jadohu
models
14
jadohu/Qwen2.5-32B-GRPO • Reinforcement Learning • 33B • Updated
jadohu/Qwen3-8B-GRPO • Reinforcement Learning • 8B • Updated • 6 • 1
jadohu/Qwen3-8B-MASA-efficient • Reinforcement Learning • 8B • Updated • 5 • 1
jadohu/Qwen3-8B-MASA • Reinforcement Learning • 8B • Updated • 4 • 2
jadohu/Qwen3-14B-GRPO • Reinforcement Learning • 15B • Updated • 6 • 1
jadohu/Qwen3-14B-MASA • Reinforcement Learning • 15B • Updated • 5 • 1
jadohu/Qwen2.5-32B-MASA-efficient • Reinforcement Learning • 33B • Updated • 1
jadohu/MongMong • Text Generation • 8B • Updated
jadohu/anole_drafter • Text-to-Image • 0.5B • Updated
jadohu/llamagen_drafter • Text-to-Image • Updated • 4 • 1
datasets
0
None public yet