jasonjiang

mikinyaa

jasonjiang8866

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

upvoted a paper 2 days ago

Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models

upvoted a paper 4 days ago

Unlimited OCR Works

Organizations

None yet

upvoted 2 papers 2 days ago

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

Paper • 2606.26790 • Published 4 days ago • 45

Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models

Paper • 2606.25041 • Published 6 days ago • 93

upvoted a paper 4 days ago

Unlimited OCR Works

Paper • 2606.23050 • Published 7 days ago • 38

liked a dataset 13 days ago

lazarus19/Vibe-Coding-Instruct

Viewer • Updated 10 days ago • 1.1M • 2.3k • 169

liked a model 16 days ago

Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF

Text Generation • 0.5B • Updated 4 days ago • 269k • 308

upvoted a paper 19 days ago

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

Paper • 2606.09079 • Published 21 days ago • 64

upvoted a paper 20 days ago

When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents

Paper • 2606.05806 • Published 25 days ago • 23

upvoted an article 24 days ago

How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent

Article • nvidia • 24 days ago • 66

upvoted 2 papers about 1 month ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published May 14 • 116

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published May 14 • 91

upvoted 10 papers about 2 months ago

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Paper • 2605.13724 • Published May 13 • 105

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

Paper • 2605.10899 • Published May 11 • 79

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

Paper • 2604.28123 • Published May 1 • 49

OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models

Paper • 2605.00877 • Published Apr 25 • 15

Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

Paper • 2604.27221 • Published Apr 29 • 40

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

Paper • 2604.22782 • Published Apr 3 • 8

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Paper • 2604.28139 • Published Apr 30 • 42
