
PhD Thesis Proposal

Russell Mendonca
PhD Student
Robotics Institute, Carnegie Mellon University
Monday, April 1
3:00 pm to 4:30 pm
GHC 8102
Exploration for Continually Improving Robots

Abstract:
General-purpose robots should be able to perform arbitrary manipulation tasks and get better at performing new ones as they obtain more experience. The current paradigm in robot learning relies on imitation or simulation. Scaling these approaches to learn from more data across tasks is bottlenecked by the human labor required either to collect demonstrations or to carefully design simulation assets and scenes. Can we instead enable robots to learn how to collect their own data for continual improvement? This thesis seeks to tackle this question of exploration, which directs how agents should act, leading to the discovery of useful behavior.

The first question is how we can define exploration objectives in the absence of demonstrations or rewards. To explore new goals, our key insight is that it is easier to identify action sequences that lead to some unknown goal state than to generate the unknown goal directly. This is enabled by training a world model that can be used to measure the uncertainty of action sequences. For further efficiency in real-world deployment, we decouple environment-centric and agent-centric exploration: the former incentivizes actions that change the visual features of objects, while the latter is driven by the robot's internal world model.

The next question we consider is how to structure the exploration search space for efficient learning. Our approach is to learn data-driven priors from the abundant multi-task data available in the form of human videos. To utilize these for control, we learn visual affordances, which characterize how objects can be interacted with by hands or end-effectors, providing a very efficient search space for exploration. Further, this shared affordance action space can be used to train a joint human-robot world model: the model is first pre-trained on diverse video of human hands performing various tasks, and then fine-tuned with very few robot exploration trajectories for various tasks. This brings us closer to generalist robots that can leverage the commonality between different tasks.

For the proposed work, we consider how to build mobile autonomous real-world robot systems that can keep exploring and improving. The extended feasible task space and resetting ability of mobile manipulators allow for autonomy, where robots can continually improve with minimal human involvement.
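To make the key insight above concrete: one common way to measure the uncertainty of action sequences with a world model is ensemble disagreement, where candidate action rollouts are scored by how much the members of a model ensemble disagree about the resulting state. The sketch below is purely illustrative and not code from the thesis; the toy linear dynamics and all names (EnsembleWorldModel, plan_exploratory_actions) are assumptions for exposition.

```python
# Illustrative sketch: choose the action sequence an ensemble of learned
# dynamics models disagrees on most, as an exploration objective that
# requires no demonstrations or rewards. Toy linear models stand in for
# learned neural world models.

import numpy as np

class EnsembleWorldModel:
    """Ensemble of simple linear dynamics models (hypothetical stand-in)."""

    def __init__(self, state_dim, action_dim, n_members=5, seed=0):
        rng = np.random.default_rng(seed)
        # Each member gets its own randomly perturbed transition parameters.
        self.A = [np.eye(state_dim) + 0.05 * rng.standard_normal((state_dim, state_dim))
                  for _ in range(n_members)]
        self.B = [0.1 * rng.standard_normal((state_dim, action_dim))
                  for _ in range(n_members)]

    def rollout(self, state, actions):
        """Predicted final state per member: array of shape (n_members, state_dim)."""
        finals = []
        for A, B in zip(self.A, self.B):
            s = state.copy()
            for a in actions:
                s = A @ s + B @ a
            finals.append(s)
        return np.stack(finals)

def disagreement(predictions):
    """Exploration score: variance across ensemble members, summed over state dims."""
    return float(predictions.var(axis=0).sum())

def plan_exploratory_actions(model, state, horizon=10, action_dim=4,
                             n_candidates=256, seed=0):
    """Sample candidate action sequences; return the most uncertain one."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    scores = [disagreement(model.rollout(state, seq)) for seq in candidates]
    return candidates[int(np.argmax(scores))]

if __name__ == "__main__":
    model = EnsembleWorldModel(state_dim=8, action_dim=4)
    best = plan_exploratory_actions(model, state=np.zeros(8))
    print("First action of most informative sequence:", best[0])
```

Because the score is computed over whole action sequences rather than imagined goal states, the planner only ever has to rank rollouts it can already simulate, which is the sense in which identifying uncertain action sequences is easier than generating unknown goals directly.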

Thesis Committee Members:
Deepak Pathak, Chair
Abhinav Gupta
Ruslan Salakhutdinov
Sergey Levine, UC Berkeley
Dorsa Sadigh, Stanford
