Zoom Meeting Passcode: 841755
Abstract:
The earliest reinforcement learning models were designed to learn a single task, specified up-front. However, an agent operating freely in the real world will not, in general, be granted this luxury: the demands placed on it shift as environments and goals change. We refer to this ever-shifting scenario as the continual, or lifelong, reinforcement learning setting. In this thesis proposal, we address three key challenges posed by environments and tasks that shift over time: 1) how can we frame the continual learning setting (metrics, benchmarks, baselines) to best enable progress? 2) can we create modular architectures that improve performance? 3) can we demonstrate improved performance on real-world robotics tasks?
We first present our framework, CORA (COntinual Reinforcement Learning Agents), which defines a common foundation of metrics and benchmarks to aid the community in developing new algorithms and comparing them. Second, we present a modular method, SANE (Self-Activating Neural Ensembles), designed to let agents use only the behavior most relevant to a task, leaving irrelevant behaviors unchanged.
In the work proposed for the remainder of the thesis, we aim to achieve two further goals. The first is to extend our benchmarks into the real world by creating a real-robot benchmark and evaluating existing methods on it. The second is to generalize our modular framework to handle compositions of different objects and tasks in a manner that allows a shared library to be built up, and to again demonstrate this more general architecture on a real robot.
Thesis Committee Members:
Abhinav Gupta, Chair
Chris Atkeson
Shubham Tulsiani
Tim Rocktäschel, University College London