arxiv:2604.19695

Planning in entropy-regularized Markov decision processes and games

Published on Apr 21

Abstract

SmoothCruiser is a planning algorithm that estimates value functions in regularized MDPs and two-player games using environment generative models, achieving improved sample complexity through Bellman operator smoothness.

AI-generated summary

We propose SmoothCruiser, a new planning algorithm for estimating the value function in entropy-regularized Markov decision processes and two-player games, given a generative model of the environment. SmoothCruiser makes use of the smoothness of the Bellman operator promoted by the regularization to achieve problem-independent sample complexity of order $\tilde{O}(1/\epsilon^4)$ for a desired accuracy $\epsilon$, whereas for non-regularized settings there are no known algorithms with guaranteed polynomial sample complexity in the worst case.
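To illustrate the smoothness that the abstract refers to, here is a minimal sketch of a single entropy-regularized backup over a set of action values. It is not the SmoothCruiser algorithm itself; it only shows the standard fact that adding an entropy bonus with temperature `lam` replaces the hard maximum with a log-sum-exp, which is differentiable and whose gradient is a softmax policy. The names `soft_max_value`, `soft_policy`, `lam`, and the sample values in `q` are assumptions made for illustration and do not come from the paper.

```python
import math

def soft_max_value(q_values, lam):
    """Entropy-regularized backup: lam * log(sum(exp(q / lam))).

    Equals max over policies of <pi, q> + lam * entropy(pi).
    """
    m = max(q_values)  # subtract the max for numerical stability
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))

def soft_policy(q_values, lam):
    """Gradient of soft_max_value with respect to q_values (a softmax)."""
    m = max(q_values)
    weights = [math.exp((q - m) / lam) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical action values at one state (illustrative only).
q = [1.0, 0.5, -0.2]
lam = 0.1  # assumed regularization temperature

v_soft = soft_max_value(q, lam)
pi = soft_policy(q, lam)
print("hard max:", max(q))
print("soft value:", v_soft)
print("soft policy:", pi)
```

The soft value always lies between `max(q)` and `max(q) + lam * log(len(q))`, so a smaller `lam` tracks the hard maximum more closely while making the backup less smooth.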
