lxp

lxpp

32 4

AI & ML interests

None yet

Recent Activity

updated a dataset 11 days ago

m-a-p/MSQA

Updated 10 days ago • 88 • 1

published a dataset 13 days ago

m-a-p/MSQA

Updated 10 days ago • 88 • 1

upvoted a paper 20 days ago

CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents

Paper • 2606.22883 • Published 21 days ago • 37

updated a dataset about 1 month ago

lxpp/all_merged_instructions

Updated Jun 10 • 9

upvoted 4 papers about 1 month ago

CoVEBench: Can Video Editing Models Handle Complex Instructions?

Paper • 2606.08415 • Published Jun 7 • 51

TVIR: Building Deep Research Agents Towards Text–Visual Interleaved Report Generation

Paper • 2606.02320 • Published Jun 1 • 15

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Paper • 2606.02060 • Published Jun 1 • 58

MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?

Paper • 2606.01993 • Published Jun 1 • 15

upvoted 2 papers about 2 months ago

OProver: A Unified Framework for Agentic Formal Theorem Proving

Paper • 2605.17283 • Published May 17 • 31

Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

Paper • 2605.15301 • Published May 14 • 22

updated a dataset about 2 months ago

NJU-LINK/WebCompass

Viewer • Updated May 18 • 933 • 21.6k • 6

published a dataset 2 months ago

lxpp/all_merged_instructions

Updated Jun 10 • 9

upvoted 2 papers 2 months ago

SkillOS: Learning Skill Curation for Self-Evolving Agents

Paper • 2605.06614 • Published May 7 • 47

DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios

Paper • 2604.25914 • Published Apr 28 • 42

upvoted 3 papers 3 months ago

liked a dataset 3 months ago

NJU-LINK/WebCompass

Viewer • Updated May 18 • 933 • 21.6k • 6

published a dataset 3 months ago

NJU-LINK/WebCompass

Viewer • Updated May 18 • 933 • 21.6k • 6

upvoted a paper 4 months ago

CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction

Paper • 2603.00610 • Published Feb 28 • 36
