Abstract
Physics-informed regularization based on the viscosity solution of the Hamilton-Jacobi-Bellman equation, made tractable via Feynman-Kac Monte Carlo estimation, improves value estimation in offline goal-conditioned reinforcement learning.
Offline goal-conditioned reinforcement learning (GCRL) learns goal-conditioned policies from static, pre-collected datasets. However, accurate value estimation remains a challenge due to the limited coverage of the state-action space. Recent physics-informed approaches have sought to address this by imposing physical and geometric constraints on the value function through regularization defined over first-order partial differential equations (PDEs), such as the Eikonal equation. However, these formulations can often be ill-posed in complex, high-dimensional environments. In this work, we propose a physics-informed regularization derived from the viscosity solution of the Hamilton-Jacobi-Bellman (HJB) equation. By providing a physics-based inductive bias, our approach grounds the learning process in optimal control theory, explicitly regularizing and bounding updates during value iteration. Furthermore, we leverage the Feynman-Kac theorem to recast the PDE solution as an expectation, enabling a tractable Monte Carlo estimation of the objective that avoids numerical instability in higher-order gradients. Experiments demonstrate that our method improves geometric consistency, making it broadly applicable to navigation and to high-dimensional, complex manipulation tasks. Open-source code is available at https://github.com/HrishikeshVish/phys-fk-value-GCRL.
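The core trick the abstract describes, recasting a PDE solution as an expectation via the Feynman-Kac theorem so it can be estimated by Monte Carlo sampling instead of by differentiating the network twice, can be illustrated on a toy problem. The sketch below is not the paper's method; it only demonstrates the Feynman-Kac principle on a 1-D heat-type PDE with drift zero, where the solution has a closed form to check against. All names (`feynman_kac_mc`, `psi`) are illustrative.

```python
import numpy as np

def feynman_kac_mc(x0, t, T, sigma, psi, n_paths=100_000, seed=0):
    """Monte Carlo estimate of u(t, x0) for the terminal-value PDE
         u_t + (sigma^2 / 2) * u_xx = 0,   u(T, x) = psi(x),
    using the Feynman-Kac representation
         u(t, x) = E[ psi(x + sigma * W_{T-t}) ],
    i.e. an expectation over paths of driftless Brownian motion.
    No second derivatives of u are ever computed."""
    rng = np.random.default_rng(seed)
    # Sample Brownian increments over the remaining horizon T - t.
    w = rng.standard_normal(n_paths) * np.sqrt(T - t)
    return float(np.mean(psi(x0 + sigma * w)))

# Sanity check: for psi(x) = x^2 the PDE has the closed-form
# solution u(t, x) = x^2 + sigma^2 * (T - t).
x0, t, T, sigma = 1.0, 0.0, 1.0, 0.5
est = feynman_kac_mc(x0, t, T, sigma, lambda x: x**2)
exact = x0**2 + sigma**2 * (T - t)
```

The same pattern is what makes the paper's objective tractable: the HJB residual is evaluated through sampled trajectories (an expectation) rather than through explicit higher-order spatial gradients of the value network, which is where the claimed numerical stability comes from.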