Online Fitted Reinforcement Learning
Workshop Paper, ICML '95 Value Function Approximation in Reinforcement Learning Workshop, July 1995
Abstract
My paper in the main portion of the conference deals with fitted value iteration or Q-learning for offline problems, i.e., those where we have a model of the environment so that we can examine arbitrary transitions in arbitrary order. The same techniques also allow us to do Q-learning for an online problem, i.e., one where we have no model but must instead perform experiments inside the MDP to gather data. I will describe how.
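To make the online setting concrete, here is a minimal Python sketch of online Q-learning with a fitted Q-function: because no model is available, the agent gathers transitions by acting in the MDP rather than by examining arbitrary transitions in arbitrary order. This is an illustration only, not the paper's algorithm; the env interface (reset/step), the simple per-entry update, and all parameter names are hypothetical stand-ins for the particular function approximators the paper actually studies.

import random
from collections import defaultdict

def online_fitted_q_learning(env, n_actions, episodes=500,
                             alpha=0.1, gamma=0.95, epsilon=0.1):
    """Online Q-learning sketch with a fitted Q-function.

    env is assumed (hypothetically) to expose reset() -> state and
    step(action) -> (next_state, reward, done); these names are
    illustrative, not from the paper.
    """
    q = defaultdict(float)  # Q-function: (state, action) -> value

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy exploration: with no model, the only way
            # to gather data is to perform experiments inside the MDP.
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = max(range(n_actions),
                             key=lambda a: q[(state, a)])

            next_state, reward, done = env.step(action)

            # One-step backup target; with a genuine function
            # approximator this step would be a fit to the targets
            # rather than a per-entry tabular update.
            target = reward
            if not done:
                target += gamma * max(q[(next_state, a)]
                                      for a in range(n_actions))
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q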
BibTeX
@workshop{Gordon-1995-16193,
  author    = {Geoffrey Gordon},
  title     = {Online Fitted Reinforcement Learning},
  booktitle = {Proceedings of ICML '95 Value Function Approximation in Reinforcement Learning Workshop},
  year      = {1995},
  month     = {July},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.