Multi-armed bandit algorithms for spare time planning of a mobile service robot

Max Korein and Manuela Veloso
Conference Paper, Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS '18), pp. 2195–2197, July 2018

Abstract

We assume that service robots will have spare time between scheduled user requests, which they could use to perform additional, unrequested services in order to learn a model of users' preferences and receive reward. However, a mobile service robot is constrained both by the need to travel through the environment to reach a user before performing a service for them and by the need to carry out its scheduled user requests. We present modified versions of Thompson Sampling and UCB1, existing algorithms for multi-armed bandit problems, that plan ahead while accounting for the time and location constraints of a mobile service robot. In simulation, we compare them against the standard versions of Thompson Sampling and UCB1 and find that our modified planning algorithms outperform the originals in both the reward received and the effectiveness of the learned preference model.
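For context, the baselines named above are the standard Thompson Sampling and UCB1 bandit algorithms. As a rough, illustrative sketch of what those baselines compute (the modified planning variants described in the paper, which account for travel time and scheduled requests, are not detailed in this abstract), the arm-selection rules for a Bernoulli-reward bandit might look like the following Python; all names and the Bernoulli-reward assumption are illustrative rather than taken from the paper.

import math
import random

def ucb1_select(counts, rewards):
    # UCB1: choose the arm maximizing its empirical mean reward plus an
    # exploration bonus that shrinks as the arm is pulled more often.
    total_pulls = sum(counts)
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # pull every arm once before applying the UCB formula
    def score(arm):
        mean = rewards[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        return mean + bonus
    return max(range(len(counts)), key=score)

def thompson_select(successes, failures):
    # Thompson Sampling (Bernoulli rewards): sample a plausible success rate
    # for each arm from its Beta posterior and pick the arm with the best sample.
    def sample(arm):
        return random.betavariate(successes[arm] + 1, failures[arm] + 1)
    return max(range(len(successes)), key=sample)

In the setting described above, a "pull" would correspond to the robot performing an unrequested service for a particular user during its spare time, with the observed response updating that arm's statistics; the paper's modified versions additionally plan ahead over which services are reachable given travel time and the scheduled requests.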

BibTeX

@conference{Korein-2018-122718,
author = {Max Korein and Manuela Veloso},
title = {Multi-armed bandit algorithms for spare time planning of a mobile service robot},
booktitle = {Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS '18)},
year = {2018},
month = {July},
pages = {2195--2197},
}