Plan to Learn: Active Robot Learning by Planning

PhD Thesis, Tech. Report CMU-RI-TR-24-40, August 2024

Abstract

Robots hold the promise of becoming an integral part of human life by helping us in our homes, on farms, and in factories. However, current robots lack the motor skills necessary to perform everyday manipulation tasks, operate outside structured settings, and interact with humans. This thesis advocates the principles of active, continual, and collaborative learning to allow a robot to autonomously learn the skills necessary to master its domain. We propose a novel Plan to Learn (P2L) framework in which the robot solves a meta-planning problem to decide which skills it should learn so that it can achieve its long-term objective while minimizing the cost of data collection. We formalize and study this idea through both a practical and a theoretical lens in two challenging scenarios.

First, we explore how robots can plan to learn as part of a collaborative human-robot team. We develop an optimal mixed integer programming-based planner, Act, Delegate, or Learn (ADL), to allocate tasks and decide which skills the robot should learn to reduce its teammate’s workload. We also provide log(n)-approximation algorithms for ADL by showing that it is an instance of the well-known uncapacitated facility location problem. Next, we explore multi-step tasks, such as opening a door, which require several skills to be sequenced. Our first algorithm, MetaReasoning for Skill Learning (MetaReSkill), estimates a probabilistic model of skill improvement to identify and prioritize skills that are both easy to learn and most relevant to the overall task. Finally, we present a hierarchical reinforcement learning formulation to solve the P2L problem for recovery learning. RecoveryChaining learns both where and how to recover by leveraging a hybrid action space consisting of primitive robot actions and nominal options that transfer control to a model-based controller. We demonstrate the effectiveness of our P2L framework on a variety of practically motivated and challenging manipulation tasks, both in simulation and in the real world.
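To make the act/delegate/learn trade-off concrete, below is a minimal sketch of how it could be posed as a mixed integer program, in the spirit of the uncapacitated facility location view mentioned above. The tasks, skills, cost numbers, and the use of the PuLP solver are illustrative assumptions for exposition; this is not the thesis’s actual formulation or code.

```python
# Illustrative sketch (not the thesis's implementation): Act/Delegate/Learn
# posed as a small mixed integer program with PuLP.
# All tasks, skills, and costs below are hypothetical placeholders.
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum, value

tasks = ["wipe_table", "wipe_shelf", "open_drawer"]
skills = ["wiping", "pulling"]

# Which robot skill (if learned) can execute which task.
covers = {"wiping": ["wipe_table", "wipe_shelf"], "pulling": ["open_drawer"]}

human_cost = {"wipe_table": 4.0, "wipe_shelf": 3.0, "open_drawer": 6.0}  # delegate to teammate
robot_cost = {"wipe_table": 1.0, "wipe_shelf": 1.0, "open_drawer": 2.0}  # robot execution cost
learn_cost = {"wiping": 3.0, "pulling": 8.0}                             # one-time learning cost

prob = LpProblem("ADL", LpMinimize)

delegate = LpVariable.dicts("delegate", tasks, cat=LpBinary)   # human does task t
assign = LpVariable.dicts(
    "assign", [(s, t) for s in skills for t in covers[s]], cat=LpBinary
)                                                              # robot does t using skill s
learn = LpVariable.dicts("learn", skills, cat=LpBinary)        # robot learns skill s

# Objective: teammate workload + robot execution cost + one-time learning cost.
prob += (
    lpSum(human_cost[t] * delegate[t] for t in tasks)
    + lpSum(robot_cost[t] * assign[(s, t)] for s in skills for t in covers[s])
    + lpSum(learn_cost[s] * learn[s] for s in skills)
)

# Every task is done exactly once: by the human or by a covering robot skill.
for t in tasks:
    prob += delegate[t] + lpSum(assign[(s, t)] for s in skills if t in covers[s]) == 1

# A skill can only be used if it has been learned (the "facility" must be opened).
for s in skills:
    for t in covers[s]:
        prob += assign[(s, t)] <= learn[s]

prob.solve()
for t in tasks:
    print(t, "-> human" if value(delegate[t]) > 0.5 else "-> robot")
```

In this reading, learning a skill plays the role of opening a facility: a one-time cost that then lets the robot serve every task the skill covers, while delegating a task to the human teammate corresponds to an always-open facility with no opening cost.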

This thesis is only a first step towards the ambitious goal of building autonomously learning robots that plan to learn. We sincerely hope that the developed framework and its instantiations on these manipulation tasks will pave the way for further research.

BibTeX

@phdthesis{Vats-2024-142829,
author = {Shivam Vats},
title = {Plan to Learn: Active Robot Learning by Planning},
year = {2024},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-24-40},
keywords = {Robot learning, Skill learning, Active learning, Planning, Manipulation},
}