Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems

Remi Munos
Conference Paper, Proceedings of International Symposium on Multi-Technology Information Processing (ISMIP '96), December 1996

Abstract

This paper presents a reinforcement learning method for solving continuous optimal control problems when the dynamics of the system are unknown. First, we use a Finite-Differences method for discretizing the Hamilton-Jacobi-Bellman equation and obtain a finite Markovian Decision Process. This permits us to approximate the value function of the continuous problem with piecewise constant functions defined on a grid. Then we propose to solve this MDP on-line with the available knowledge, using a direct and convergent reinforcement learning algorithm called the Finite-Differences Reinforcement Learning algorithm.
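The abstract outlines two steps: discretize the Hamilton-Jacobi-Bellman equation with finite differences on a grid, which yields a finite MDP whose solution is a piecewise-constant approximation of the continuous value function, and then solve that MDP. The sketch below is not the paper's FDRL algorithm; it only illustrates the discretization idea on a hypothetical 1D control problem (the dynamics, reward, grid size, and discount factor are all assumptions for illustration) and solves the resulting MDP offline with plain value iteration, whereas the paper solves it on-line from observed transitions.

import numpy as np

# Hypothetical 1D problem: state x in [0, 1], actions u in {-1, +1},
# deterministic dynamics dx/dt = f(x, u) = u, running reward r(x, u),
# continuous discounting with factor gamma per unit time.
N = 101                          # number of grid points (assumed)
h = 1.0 / (N - 1)                # grid spacing
xs = np.linspace(0.0, 1.0, N)
actions = [-1.0, 1.0]
gamma = 0.95                     # discount factor per unit time (assumed)

def f(x, u):                     # dynamics (assumed for illustration)
    return u

def r(x, u):                     # running reward (assumed): favour reaching x near 1
    return -1.0 + 2.0 * (x > 0.95)

# Upwind finite-difference scheme: from state x under action u, the process
# jumps to the neighbouring grid point in the direction of f(x, u) after a
# time step tau = h / |f(x, u)|. This defines a finite MDP whose value
# function is a piecewise-constant approximation of the continuous one.
V = np.zeros(N)
for _ in range(1000):            # value iteration on the discretized MDP
    V_new = np.empty(N)
    for i, x in enumerate(xs):
        q_values = []
        for u in actions:
            speed = abs(f(x, u))
            if speed < 1e-12:
                # No motion: value of receiving r forever under continuous discounting.
                q_values.append(r(x, u) / (-np.log(gamma)))
                continue
            tau = h / speed                                    # time to reach the neighbour
            j = min(i + 1, N - 1) if f(x, u) > 0 else max(i - 1, 0)
            q_values.append(tau * r(x, u) + gamma ** tau * V[j])
        V_new[i] = max(q_values)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

As the grid spacing h shrinks, the value function of this discretized MDP converges (under the usual consistency conditions for such schemes) to the value function of the continuous problem; the paper's contribution is to solve the MDP with a reinforcement learning update when the dynamics are not known in advance.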

BibTeX

@conference{Munos-1996-16323,
author = {Remi Munos},
title = {Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems},
booktitle = {Proceedings of International Symposium on Multi-Technology Information Processing (ISMIP '96)},
year = {1996},
month = {December},
}