SSL4RL
Datasets and models in the paper SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning
PKU-ML/SSL4RL-MMBench-Position-3B • Image-Text-to-Text • 4B • Updated Dec 23, 2025 • 1
PKU-ML/SSL4RL-MMBench-Rotation-3B • Image-Text-to-Text • 4B • Updated Dec 23, 2025 • 10
PKU-ML/SSL4RL-MMBench-Contrastive-3B • Image-Text-to-Text • 4B • Updated Dec 23, 2025 • 6
PKU-ML/SSL4RL-MMBench-Jigsaw-3B • Image-Text-to-Text • 4B • Updated Dec 23, 2025 • 2
G1
Portfolio of models, datasets and demos presented in the paper G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning
PKU-ML/G1-7B • Text Generation • 8B • Updated Jun 17, 2025 • 11 • 2
PKU-ML/G1-3B • Text Generation • 3B • Updated Jun 17, 2025 • 9 • 1
PKU-ML/G1-Direct-SFT-3B • Text Generation • 3B • Updated Jun 17, 2025 • 8
PKU-ML/G1-Direct-SFT-7B • Text Generation • 8B • Updated Jun 17, 2025 • 5