Abstract:
Interactive learning systems such as self-driving cars, recommender systems, and large language model chatbots are becoming ubiquitous in everyday life. From a machine learning perspective, the key technical challenge underlying such systems is that, rather than making predictions on i.i.d. data, an interactive learner influences the distribution of inputs it sees through the choices it has made in the past, dramatically increasing the statistical and computational complexity of the problem. In this thesis, we tackle two challenges in interactive learning, focusing mostly on imitation learning and adjacent paradigms. The first is efficiency. We derive a unifying, game-theoretic framework for imitation learning and provide several efficient reductions to more tractable problems like supervised or online learning. We also consider sample efficiency, both in terms of expert demonstrations and learner-environment interactions, and derive minimax-optimal and polynomial-time algorithms, respectively. We then turn our attention to the second challenge: unobserved confounders. When our expert data comes from noisy humans who may observe side information that our system does not, traditional imitation learning methods might pick up on spurious correlations rather than true causal relationships and therefore generalize poorly at test time. We derive algorithms that are robust to two classes of confounders, and our techniques make specific use of the sequential, interactive nature of the problem to extract causal relationships. In the proposed work, we consider how to learn safety constraints from multi-task demonstrations, how to complement a policy with a different observation space, and how to recommend content to users in the presence of induced preference shifts.
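To make the idea of reducing imitation learning to a more tractable problem concrete, below is a minimal, illustrative sketch of a DAgger-style reduction to iterative supervised learning (Ross et al., 2011); it is not one of the specific algorithms developed in this thesis. The `env` object (assumed to expose `reset()` and a `step()` returning the next state and a done flag) and the `expert_action` query function are hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def dagger(env, expert_action, n_iters=10, horizon=100):
    """DAgger-style reduction: imitation learning via repeated supervised learning.

    `env` and `expert_action` are assumed interfaces, not part of the thesis.
    """
    states, actions = [], []
    policy = None
    for _ in range(n_iters):
        s = env.reset()
        for _ in range(horizon):
            # Roll out the current learner (expert on the first pass) so that
            # training states come from the learner's own induced distribution.
            if policy is None:
                a = expert_action(s)
            else:
                a = policy.predict(np.asarray(s).reshape(1, -1))[0]
            # Query the expert for the correct action in the visited state
            # and aggregate it into the dataset.
            states.append(s)
            actions.append(expert_action(s))
            s, done = env.step(a)
            if done:
                break
        # Supervised-learning oracle: refit the policy on all aggregated data.
        policy = LogisticRegression(max_iter=1000).fit(
            np.asarray(states), np.asarray(actions)
        )
    return policy
```

Because the learner gathers expert labels on the states its own choices induce, this kind of interactive reduction avoids the compounding-error problem of naive behavior cloning on i.i.d. demonstration data.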
Thesis Committee Members:
J. Andrew Bagnell, Co-chair
Zhiwei Steven Wu, Co-chair
Geoffrey J. Gordon
Arthur Gretton, University College London
PhD Thesis Proposal: Efficient Interactive Learning with Unobserved Confounders
Wednesday, May 10