Mitigating Causal Confusion in Driving Agents via Gaze Supervision
Abstract
Imitation Learning (IL) algorithms such as behavior cloning (BC) do not explicitly encode the underlying causal structure of the task being learned. This often leads to misattribution of the relative importance of scene elements to the resulting action, a phenomenon termed Causal Confusion or Causal Misattribution. Causal confusion is exacerbated in highly complex scenarios such as urban driving, where the agent has access to a large amount of information per time-step (visual data, sensor data, odometry, etc.). Our key idea is that while driving, human drivers naturally exhibit an easily obtained, continuous signal that is highly correlated with causal elements of the state space: eye gaze. We collect human driver demonstrations in a CARLA-based VR driving simulator, DReyeVR, allowing us to capture eye gaze in the same simulation environment as other training data commonly used in prior work. Further, we propose a contrastive-learning method that uses gaze-based supervision to mitigate causal confusion in urban driving IL agents, exploiting the relative importance of gazed-at and not-gazed-at scene elements for driving decision making. We present preliminary quantitative results that suggest the promise of gaze-based supervision for improving the driving performance of IL agents.
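The abstract does not spell out the contrastive objective, so the following is a minimal, hypothetical PyTorch sketch of how gaze-contrastive supervision could be wired into a BC driving policy. All names here (GazePolicy, gaze_contrastive_loss, the margin value) are illustrative assumptions, not the paper's actual implementation: a standard behavior-cloning term fits the expert action from the full frame, while the contrastive term pulls the prediction made from gazed-at regions toward the expert action and pushes the prediction made from not-gazed-at regions away from it.

# Hypothetical sketch of a gaze-contrastive auxiliary loss for a BC driving
# policy. Names and the loss form are illustrative assumptions, not the
# paper's actual method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GazePolicy(nn.Module):
    """Toy CNN policy: image -> spatial features -> action (steer, throttle)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.LazyLinear(2))

    def forward(self, img, gaze_mask=None):
        feats = self.encoder(img)                       # (B, 64, H', W')
        if gaze_mask is not None:
            # Downsample the gaze heatmap to the feature resolution and
            # modulate features so only the (non-)gazed regions remain.
            mask = F.interpolate(gaze_mask, size=feats.shape[-2:], mode="bilinear")
            feats = feats * mask
        return self.head(feats)

def gaze_contrastive_loss(policy, img, gaze_mask, expert_action, margin=1.0):
    """BC loss plus a contrastive term over gazed-at vs. not-gazed-at regions."""
    pred_full = policy(img)
    pred_gazed = policy(img, gaze_mask)                 # causal elements only
    pred_ignored = policy(img, 1.0 - gaze_mask)         # distractors only

    bc = F.mse_loss(pred_full, expert_action)
    pos = F.mse_loss(pred_gazed, expert_action)         # pull gazed prediction in
    neg = F.relu(margin - F.mse_loss(pred_ignored, expert_action))  # push away
    return bc + pos + neg

# Usage with random stand-in data; in practice img would be CARLA camera
# frames and gaze_mask the per-pixel gaze saliency recorded by DReyeVR.
policy = GazePolicy()
img = torch.rand(4, 3, 96, 96)
gaze = torch.rand(4, 1, 96, 96)     # gaze saliency in [0, 1]
action = torch.rand(4, 2)           # expert (steer, throttle)
loss = gaze_contrastive_loss(policy, img, gaze, action)
loss.backward()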
BibTeX
@inproceedings{Biswas-2022-134349,
author = {Abhijat Biswas and Badal Arun Pardhi and Caleb Chuck and Jarrett Holtz and Scott Niekum and Henny Admoni and Alessandro Allievi},
title = {Mitigating Causal Confusion in Driving Agents via Gaze Supervision},
booktitle = {Proceedings of Aligning Robot Representations with Humans, 6th Conference on Robot Learning},
year = {2022},
month = {December},
}