OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning Paper • 2606.26790 • Published 8 days ago • 53
Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation Paper • 2606.26907 • Published 8 days ago • 49
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 24 days ago • 41
Tencent-Hunyuan-Multimodal-RL/FLUX2-klein-base-9b-GenEval2-Multi-Reward Text-to-Image • Updated 22 days ago • 62 • 2
Tencent-Hunyuan-Multimodal-RL/FLUX2-klein-base-9b-GenEval2-Single-Reward Text-to-Image • Updated 22 days ago • 54 • 1
Flow-DPPO: GenEval2 Collection Flow-DPPO-trained LoRA adapters (single- and multi-reward) for SD3.5 and FLUX.2-klein-9B optimized on GenEval2. • 5 items • Updated 22 days ago