TIANYI

BIMU233

http://bimu.site

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 1 day ago

Bridging the Agent-World Gap: Text World Models for LLM-based Agents

Paper • 2606.09032 • Published 3 days ago • 6

upvoted a paper 7 days ago

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Paper • 2605.30288 • Published 13 days ago • 22

upvoted a paper 21 days ago

Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment

Paper • 2605.20834 • Published 22 days ago • 5

upvoted 2 papers about 1 month ago

Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 80

Anchored Policy Optimization: Mitigating Exploration Collapse Via Support-Constrained Rectification

Paper • 2602.05717 • Published Feb 5 • 1

liked a Space about 1 month ago

Croissant Checker - Dev

🔎

Validate Croissant dataset files for NeurIPS submissions

published a model about 1 month ago

BIMU233/GPT-2_agd

Updated Apr 27

updated a model about 1 month ago

BIMU233/GPT-2_agd

Updated Apr 27

liked a model about 2 months ago

Qwen/Qwen3.5-9B

Image-Text-to-Text • 10B • Updated Mar 2 • 8.7M • 1.55k

upvoted 7 papers about 2 months ago

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 124

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published Apr 15 • 30

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 165

KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance

Paper • 2604.12627 • Published Apr 14 • 101

authored a paper about 2 months ago

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

Paper • 2604.08865 • Published Apr 10 • 29

upvoted 2 papers about 2 months ago

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Paper • 2512.18832 • Published Dec 21, 2025 • 15

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

Paper • 2604.08865 • Published Apr 10 • 29

updated a model 2 months ago

BIMU233/0.05_240

8B • Updated Mar 30 • 4
