See, Hear, Explore: Curiosity via Audio-Visual Association

Victoria Dean, Shubham Tulsiani, and Abhinav Gupta

Conference Paper, Proceedings of Neural Information Processing Systems (NeurIPS), December 2020

Abstract

Exploration is one of the core challenges in reinforcement learning. A common formulation of curiosity-driven exploration uses the difference between the real future and the future predicted by a learned model. However, predicting the future is an inherently difficult task which can be ill-posed in the face of stochasticity. In this paper, we introduce an alternative form of curiosity that rewards novel associations between different senses. Our approach exploits multiple modalities to provide a stronger signal for more efficient exploration. Our method is inspired by the fact that, for humans, both sight and sound play a critical role in exploration. We present results on several Atari environments and Habitat (a photorealistic navigation simulator), showing the benefits of using an audio-visual association model for intrinsically guiding learning agents in the absence of external rewards. For videos and code, see https://vdean.github.io/audio-curiosity.html
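
The abstract contrasts two intrinsic reward signals without giving formulas. As a rough illustration only, the sketch below computes both from toy numbers: a future-prediction reward (the error between predicted and observed next-step features) and an association reward (how surprised a discriminator is that an image and a sound that really occurred together belong together). The function names, the squared-error form, and the negative-log-probability form are assumptions made for this example, not details taken from the paper or its released code.

```python
import math


def prediction_error_reward(predicted_next, actual_next):
    """Future-prediction curiosity (assumed form): squared error between
    the features a forward model predicted and the features observed."""
    return sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))


def association_reward(p_aligned, eps=1e-8):
    """Association curiosity (assumed form): p_aligned is a hypothetical
    discriminator's probability that a truly co-occurring (image, audio)
    pair belongs together. A low probability means the pairing looks
    novel to the model, so the reward -log(p) is high."""
    p = min(max(p_aligned, eps), 1.0 - eps)  # clamp to keep log finite
    return -math.log(p)


if __name__ == "__main__":
    # A familiar sight-sound pairing: the discriminator is confident.
    print("familiar pair:", association_reward(0.9))
    # A pairing the discriminator has rarely seen: larger reward.
    print("novel pair:   ", association_reward(0.1))
    # Forward-model error on toy two-dimensional features.
    print("prediction:   ", prediction_error_reward([1.0, 2.0], [1.0, 4.0]))
```

In this toy setup the association reward depends only on whether the current sight and sound are judged to go together, so it does not require forecasting a possibly stochastic next state, which is the difficulty the abstract attributes to future-prediction curiosity.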

BibTeX

@conference{Dean-2020-127084,
author = {Victoria Dean and Shubham Tulsiani and Abhinav Gupta},
title = {See, Hear, Explore: Curiosity via Audio-Visual Association},
booktitle = {Proceedings of Neural Information Processing Systems (NeurIPS)},
year = {2020},
month = {December},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.