Balancing Safety and Exploitability in Opponent Modeling

Wang, Z., Boularias, A., Muelling, K., and Peters, J.

Conference Paper, Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI '11), pp. 1515-1520, July 2011

Abstract

Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent’s strategy is modeled with a set of possible strategies that contains the actual strategy with high probability. The algorithm is safe, as the expected payoff is above the minimax payoff with high probability, and it can exploit the opponents’ preferences when sufficient observations have been obtained. We apply the algorithm to normal-form games and stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand, or middle preparation pose before the human serves. The learned strategies can exploit the opponent’s preferences, leading to a higher rate of successful returns.
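
The idea in the abstract above can be illustrated with a minimal sketch for repeated rock-paper-scissors. It is not the authors' implementation. It assumes the set of possible opponent strategies is an L1 ball around the empirical action frequencies, with a radius taken from a standard L1 concentration bound for empirical distributions, and it finds the player's strategy by a coarse grid search rather than by an exact solver. The names `l1_radius`, `worst_case_opponent`, `robust_value`, and `safe_response`, and the parameters `delta` and `steps`, are hypothetical.

```python
import math
from itertools import product

# Row player's payoff in rock-paper-scissors: PAYOFF[i][j] is the payoff for
# playing i against j, with 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def l1_radius(n, k=3, delta=0.1):
    """Assumed confidence radius: with probability >= 1 - delta, the true
    distribution over k actions lies within this L1 distance of the
    empirical one after n observations."""
    if n == 0:
        return 2.0  # no data: the set is the whole simplex
    return min(2.0, math.sqrt(2.0 * math.log((2 ** k - 2) / delta) / n))


def worst_case_opponent(p_hat, values, eps):
    """Return the distribution q with ||q - p_hat||_1 <= eps that minimizes
    sum(q[j] * values[j]): move up to eps / 2 probability mass onto the
    action that hurts us most, taking it from the actions that help us most."""
    q = list(p_hat)
    order = sorted(range(len(q)), key=lambda j: values[j])
    lowest = order[0]
    moved = min(eps / 2.0, 1.0 - q[lowest])
    q[lowest] += moved
    for j in reversed(order[1:]):  # highest-value actions first
        take = min(moved, q[j])
        q[j] -= take
        moved -= take
        if moved <= 1e-12:
            break
    return q


def robust_value(x, p_hat, eps):
    """Worst-case expected payoff of mixed strategy x over the confidence set."""
    values = [sum(x[i] * PAYOFF[i][j] for i in range(3)) for j in range(3)]
    q = worst_case_opponent(p_hat, values, eps)
    return sum(q[j] * values[j] for j in range(3))


def safe_response(counts, delta=0.1, steps=30):
    """Grid search for the mixed strategy with the best worst-case payoff.
    steps=30 keeps the uniform (minimax) strategy on the grid."""
    n = sum(counts)
    p_hat = [c / n for c in counts] if n else [1.0 / 3.0] * 3
    eps = l1_radius(n, delta=delta)
    best_x, best_v = None, -math.inf
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue
        x = [a / steps, b / steps, (steps - a - b) / steps]
        v = robust_value(x, p_hat, eps)
        if v > best_v:
            best_x, best_v = x, v
    return best_x, best_v


# With no observations the sketch falls back to the uniform minimax strategy;
# after many observations of a rock-heavy opponent it shifts toward paper.
print(safe_response([0, 0, 0]))
print(safe_response([80, 10, 10]))
```

Because the uniform strategy is always a candidate and guarantees an expected payoff of 0 against any opponent, the worst-case value returned by this sketch never drops below the minimax payoff, while a shrinking radius lets the response become more exploitative as observations accumulate.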

BibTeX

@conference{Wang-2011-107891,
author = {Wang, Z. and Boularias, A. and Muelling, K. and Peters, J.},
title = {Balancing Safety and Exploitability in Opponent Modeling},
booktitle = {Proceedings of 25th AAAI Conference on Artificial Intelligence (AAAI '11)},
year = {2011},
month = {July},
pages = {1515--1520},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.