Amélie Dubois's picture

Amélie Dubois

page-watcher

·

AI & ML interests

None yet

Recent Activity

upvoted a paper about 15 hours ago

WebGen-R1: Incentivizing Large Language Models to Generate Functional and Aesthetic Websites with Reinforcement Learning

upvoted a paper 1 day ago

DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off

upvoted a paper 11 days ago

Small Vision-Language Models are Smart Compressors for Long Video Understanding

View all activity

Organizations

None yet

upvoted a paper about 15 hours ago

WebGen-R1: Incentivizing Large Language Models to Generate Functional and Aesthetic Websites with Reinforcement Learning

Paper • 2604.20398 • Published 3 days ago • 3

upvoted a paper 1 day ago

DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off

Paper • 2604.13902 • Published 10 days ago • 59

upvoted a paper 11 days ago

Small Vision-Language Models are Smart Compressors for Long Video Understanding

Paper • 2604.08120 • Published 16 days ago • 20

upvoted a paper 13 days ago

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 16 days ago • 259

upvoted a paper 15 days ago

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published 28 days ago • 359

upvoted a paper 24 days ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 342

upvoted 2 papers about 1 month ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published Mar 17 • 370

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 210

upvoted 2 papers about 2 months ago

Believe Your Model: Distribution-Guided Confidence Calibration

Paper • 2603.03872 • Published Mar 4 • 40

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519

upvoted 5 papers 2 months ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220

SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Paper • 2602.12783 • Published Feb 13 • 216

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published Feb 11 • 244

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Paper • 2602.07274 • Published Feb 6 • 210

The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies

Paper • 2602.09877 • Published Feb 10 • 197