Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems
Conference Paper, Proceedings of International Symposium on Multi-Technology Information Processing (ISMIP '96), December, 1996
Abstract
This paper presents a reinforcement learning method for solving continuous optimal control problems when the dynamics of the system are unknown. First, we use a Finite-Differences method to discretize the Hamilton-Jacobi-Bellman equation and obtain a finite Markovian Decision Process. This permits us to approximate the value function of the continuous problem with piecewise constant functions defined on a grid. Then we propose to solve this MDP on-line with the available knowledge using a direct and convergent reinforcement learning algorithm, called the Finite-Differences Reinforcement Learning algorithm.
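To illustrate the discretization step described above, the following is a minimal sketch (not the paper's exact algorithm): an upwind finite-difference discretization of a discounted Hamilton-Jacobi-Bellman equation for a simple one-dimensional controlled system dx/dt = a with a in {-1, +1}, which yields a finite MDP on a grid. Here the grid MDP is solved by value iteration for simplicity, whereas the paper solves the resulting MDP on-line with a reinforcement learning algorithm; names such as h, gamma, and goal are illustrative choices, not taken from the paper.

    import numpy as np

    h = 0.02                      # grid resolution
    gamma = 0.95                  # discount factor per unit of continuous time
    goal = 0.7                    # illustrative target state
    xs = np.arange(0.0, 1.0 + h, h)
    actions = [-1.0, +1.0]        # dynamics: dx/dt = a

    def reward(x):
        return -(x - goal) ** 2

    V = np.zeros_like(xs)
    for _ in range(500):          # value iteration on the discretized MDP
        V_new = np.empty_like(V)
        for i, x in enumerate(xs):
            best = -np.inf
            for a in actions:
                # upwind neighbour in the direction of the drift f(x, a) = a
                j = min(max(i + (1 if a > 0 else -1), 0), len(xs) - 1)
                tau = h / abs(a)  # time needed to cross one grid cell
                q = reward(x) * tau + gamma ** tau * V[j]
                best = max(best, q)
            V_new[i] = best
        if np.max(np.abs(V_new - V)) < 1e-8:
            V = V_new
            break
        V = V_new

    # V now approximates the continuous value function by a piecewise
    # constant function defined on the grid xs.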
BibTeX
@conference{Munos-1996-16323,
author = {Remi Munos},
title = {Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems},
booktitle = {Proceedings of International Symposium on Multi-Technology Information Processing (ISMIP '96)},
year = {1996},
month = {December},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.