Chun-Hsiao Yeh

danielchyeh

https://danielchyeh.github.io/

AI & ML interests

vision-language models, self-supervised learning

Recent Activity

upvoted 9 papers about 2 months ago

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

Paper • 2605.30346 • Published May 28 • 56

GenClaw: Code-Driven Agentic Image Generation

Paper • 2605.30248 • Published May 28 • 41

CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation

Paper • 2605.25378 • Published May 25 • 62

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

Paper • 2605.30263 • Published May 28 • 59

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

Paper • 2605.29250 • Published May 28 • 81

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Paper • 2605.30280 • Published May 28 • 146

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Paper • 2605.29801 • Published May 28 • 150

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published May 22 • 262

Beyond 3D VQAs: Injecting 3D Spatial Priors into Vision-Language Models for Enhanced Geometric Reasoning

Paper • 2605.30231 • Published May 28 • 1

upvoted 3 papers 10 months ago

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7, 2025 • 144

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 176

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26, 2025 • 190

upvoted 3 papers about 1 year ago

Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing

Paper • 2507.05259 • Published Jul 7, 2025 • 6

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 125

Decoupled Contrastive Learning

Paper • 2110.06848 • Published Oct 13, 2021 • 1

upvoted a collection over 1 year ago

Multimodal Benchmarks

Collection

254 items • Updated Feb 7 • 29

upvoted a paper over 1 year ago

Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs

Paper • 2504.15280 • Published Apr 21, 2025 • 25

upvoted 2 papers over 2 years ago

Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition

Paper • 2402.15504 • Published Feb 23, 2024 • 21

Magic-Me: Identity-Specific Video Customized Diffusion

Paper • 2402.09368 • Published Feb 14, 2024 • 31

upvoted a paper about 3 years ago

Meta-Personalizing Vision-Language Models to Find Named Instances in Video

Paper • 2306.10169 • Published Jun 16, 2023 • 6
