Inferring Non-Stationary Human Preferences for Human-Agent Teams

Conference Paper, Proceedings of the 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN '20), pp. 1178–1185, August 2020

Abstract

A central challenge for robot decision making in human-robot teams is predicting the intent of a human team member from observations of the human's behavior. Inverse Reinforcement Learning (IRL) is one approach to predicting human intent; however, such approaches typically assume that the human's intent is stationary, and few identify when the human's intent changes during observation. Modeling human decision making as a Markov decision process, we address these two limitations by maintaining a belief over the reward parameters of the model (which represent the human's preferences over tasks or goals) and updating the parameters using IRL estimates computed from short windows of observations. We posit that a human's preferences can change over time, through gradual drift of preference and/or discrete, step-wise changes of intent. Our approach maintains an estimate of the human's preferences under both conditions and identifies changes of intent from the divergence between successive belief updates. We demonstrate in a simulated environment that our approach effectively tracks dynamic reward parameters and identifies changes of intent, and that a robot team member can leverage these estimates to improve team performance.
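To make the belief-tracking idea concrete, the Python sketch below (not the authors' implementation) maintains a Gaussian belief over reward parameters, folds in windowed IRL estimates via a Kalman-style update, and flags a possible change of intent when the KL divergence between successive beliefs spikes. The class name, noise parameters, and detection threshold are assumptions chosen for illustration.

# Illustrative sketch only (not the paper's released code): track a Gaussian
# belief over reward parameters theta using Kalman-style updates from windowed
# IRL estimates, and flag a change of intent when the KL divergence between
# successive beliefs spikes. Parameter values are assumptions for illustration.

import numpy as np


class PreferenceTracker:
    """Maintains a Gaussian belief N(mu, Sigma) over reward parameters."""

    def __init__(self, dim, obs_noise=0.5, drift=0.01, kl_threshold=0.2):
        self.mu = np.zeros(dim)                    # belief mean over theta
        self.sigma = np.eye(dim)                   # belief covariance
        self.obs_noise = obs_noise * np.eye(dim)   # noise on windowed IRL estimates
        self.drift = drift * np.eye(dim)           # process noise: gradual preference drift
        self.kl_threshold = kl_threshold           # divergence that flags a discrete change

    def update(self, theta_hat):
        """Fold in one windowed IRL estimate; return KL(prior || posterior)."""
        # Predict: inflate covariance so the belief can follow gradual drift.
        prior_mu, prior_sigma = self.mu, self.sigma + self.drift
        # Correct: treat theta_hat as a noisy observation of the true theta.
        gain = prior_sigma @ np.linalg.inv(prior_sigma + self.obs_noise)
        self.mu = prior_mu + gain @ (theta_hat - prior_mu)
        self.sigma = (np.eye(len(self.mu)) - gain) @ prior_sigma
        return self._kl(prior_mu, prior_sigma, self.mu, self.sigma)

    @staticmethod
    def _kl(mu0, s0, mu1, s1):
        """KL divergence between Gaussians N(mu0, s0) and N(mu1, s1)."""
        s1_inv = np.linalg.inv(s1)
        diff = mu1 - mu0
        return 0.5 * (np.trace(s1_inv @ s0) + diff @ s1_inv @ diff
                      - len(mu0) + np.log(np.linalg.det(s1) / np.linalg.det(s0)))


# Usage: estimates first favor goal A, then switch to goal B; the KL spike at
# the switch marks a candidate change of intent. The first few windows are
# skipped while the belief converges from its broad initial prior.
rng = np.random.default_rng(0)
tracker = PreferenceTracker(dim=3)
estimates = [np.array([1.0, 0.0, 0.0])] * 8 + [np.array([0.0, 1.0, 0.0])] * 8
for t, theta_hat in enumerate(estimates):
    kl = tracker.update(theta_hat + 0.05 * rng.standard_normal(3))
    flag = "  <- possible change of intent" if t >= 3 and kl > tracker.kl_threshold else ""
    print(f"window {t:2d}: KL = {kl:.3f}{flag}")

Additive process noise is one plausible way to model gradual preference drift, while the KL-based test captures the paper's notion of detecting intent changes from the divergence between subsequent belief updates; the paper itself may use a different belief representation or divergence measure.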

BibTeX

@conference{Hughes-2020-126379,
author = {Dana Hughes and Akshat Agarwal and Yue Guo and Katia Sycara},
title = {Inferring Non-Stationary Human Preferences for Human-Agent Teams},
booktitle = {Proceedings of 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN '20)},
year = {2020},
month = {August},
pages = {1178--1185},
}