Rui-Jie Zhu

ridger

Recent Activity

upvoted 2 papers 3 days ago

How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models

Paper • 2604.21106 • Published 9 days ago • 7

Large Language Models Explore by Latent Distilling

Paper • 2604.24927 • Published 9 days ago • 68

upvoted a collection about 1 month ago

Nemotron-Cascade 2

Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated 15 days ago • 50

upvoted a collection 2 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.59k

upvoted a paper 2 months ago

Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published Mar 4 • 186

upvoted 5 papers 3 months ago

Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation

Paper • 2602.03619 • Published Feb 3 • 28

LoopViT: Scaling Visual ARC with Looped Transformers

Paper • 2602.02156 • Published Feb 2 • 12

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 268

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Paper • 2601.21420 • Published Jan 29 • 42

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 68

upvoted a collection 4 months ago

OpenThinker-Agent

5 items • Updated Dec 6, 2025 • 9

upvoted a paper 4 months ago

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published Dec 31, 2025 • 66

upvoted 2 papers 5 months ago

Universal Reasoning Model

Paper • 2512.14693 • Published Dec 16, 2025 • 44

MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 121

upvoted 5 papers 6 months ago

Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 39

Motif 2 12.7B technical report

Paper • 2511.07464 • Published Nov 7, 2025 • 40

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 132

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 229

Parallel Loop Transformer for Efficient Test-Time Computation Scaling

Paper • 2510.24824 • Published Oct 28, 2025 • 17

upvoted a collection 6 months ago

Ouro

a family of pre-trained Looped Language Models. • 4 items • Updated Oct 29, 2025 • 30