hanoz bhathena
bh9052
AI & ML interests
None yet
Recent Activity
updated a collection about 14 hours ago: Post training
updated a collection about 14 hours ago: Post training
updated a collection about 14 hours ago: Continual learning
Organizations
None yet
CUA
- OpenComputer: Verifiable Software Worlds for Computer-Use Agents
  Paper • 2605.19769 • Published • 55
- MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents
  Paper • 2605.18652 • Published • 7
- ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
  Paper • 2605.12481 • Published • 27
Post training
- Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
  Paper • 2603.19220 • Published • 69
- Not Every Rubric Teaches Equally: Policy-Aware Rubric Rewards for RLVR
  Paper • 2605.20164 • Published • 4
- GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment
  Paper • 2605.19577 • Published • 54
- EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
  Paper • 2605.18703 • Published • 46
Agent harness
Continual learning
- AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration
  Paper • 2605.20025 • Published • 93
- OpenComputer: Verifiable Software Worlds for Computer-Use Agents
  Paper • 2605.19769 • Published • 55
- WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation
  Paper • 2605.10912 • Published • 45
- EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents
  Paper • 2605.13941 • Published • 24
Evaluation