Reinforcement Learning for Continuous Stochastic Control Problems
Conference Paper, Proceedings of Neural Information Processing Systems (NeurIPS), pp. 1029-1035, December 1997
This paper addresses Reinforcement Learning (RL) for stochastic control problems in continuous time and continuous state space. We state the Hamilton-Jacobi-Bellman equation satisfied by the value function and use a finite-difference method to design a convergent approximation scheme. We then propose an RL algorithm based on this scheme and prove its convergence to the optimal solution.
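To make the idea of a finite-difference approximation scheme concrete, here is a minimal sketch and not the paper's actual scheme. It assumes a one-dimensional toy problem on [0, 1] with dynamics dx = u dt + sigma dw, a discount factor gamma per unit of time, zero running reward, and an exit reward of 1 at the right boundary and 0 at the left. It uses a standard upwind (Kushner-style) discretization, which turns the continuous problem into a Markov decision process on a grid that value iteration can solve. The function name `fd_value_iteration` and all parameter values are hypothetical.

```python
import numpy as np

def fd_value_iteration(n=21, sigma=0.3, gamma=0.5, controls=(-1.0, 1.0),
                       tol=1e-8, max_iter=100_000):
    """Value iteration on an upwind finite-difference grid (illustrative toy problem)."""
    h = 1.0 / (n - 1)                 # grid spacing
    x = np.linspace(0.0, 1.0, n)
    V = np.zeros(n)
    V[0], V[-1] = 0.0, 1.0            # assumed boundary (exit) rewards
    for _ in range(max_iter):
        candidates = []
        for u in controls:
            f = u                                  # assumed drift f(x, u) = u
            q = sigma**2 + h * abs(f)              # normalizing constant
            tau = h**2 / q                         # time step implied by the grid
            p_up = (0.5 * sigma**2 + h * max(f, 0.0)) / q    # prob. of moving to x + h
            p_down = (0.5 * sigma**2 + h * max(-f, 0.0)) / q  # prob. of moving to x - h
            running_reward = 0.0                   # assumed r(x, u) = 0
            backup = gamma**tau * (p_up * V[2:] + p_down * V[:-2]) + tau * running_reward
            candidates.append(backup)
        new_interior = np.max(candidates, axis=0)  # maximize over controls
        delta = np.max(np.abs(new_interior - V[1:-1]))
        V[1:-1] = new_interior
        if delta < tol:
            break
    return x, V
```

Because the discount applied at each step, gamma**tau, is strictly below 1, the backup is a contraction and the iteration converges on the grid. Convergence of the grid solution to the continuous value function as h shrinks is the kind of result the paper establishes for its own scheme; this sketch does not demonstrate it.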
