4:00 pm to 5:00 pm
Rashid Auditorium – 4401 Gates and Hillman Centers
Leveraging Language and Video Demonstrations for Learning Robot Manipulation Skills and Enabling Closed-Loop Task Planning
Humans have gradually developed language, mastered complex motor skills, and created and utilized sophisticated tools. The act of conceptualization is fundamental to these abilities because it allows humans to mentally represent, summarize and abstract diverse knowledge and skills. By means of abstraction, concepts that we learn from a limited number of examples can be extended to a potentially infinite set of new and unanticipated situations. Abstract concepts can also be more easily taught to others by demonstration.
I will present work that gives robots the ability to acquire a variety of manipulation concepts that act as mental representations of verbs in a natural language instruction. We propose learning from human demonstrations of manipulation actions recorded in large-scale video datasets annotated with natural language instructions. In extensive simulation experiments, we show that the policy learned in this way can perform a large percentage of the 78 different manipulation tasks on which it was trained. We show that this multi-task policy generalizes over variations of the environment, and we also show examples of successful generalization to novel but similar instructions.
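To make the setup concrete, the following is a minimal sketch of what a language-conditioned, multi-task policy trained by behavioral cloning could look like. It is illustrative only: the PyTorch framing, the stand-in encoders, and all names and dimensions are assumptions, not the architecture presented in the talk.

```python
import torch
import torch.nn as nn

class LanguageConditionedPolicy(nn.Module):
    """Toy policy conditioned on an instruction embedding and an observation."""
    def __init__(self, lang_dim=64, obs_dim=128, act_dim=7):
        super().__init__()
        # Stand-ins for a pretrained language encoder and a visual backbone.
        self.lang_enc = nn.Linear(lang_dim, 128)
        self.obs_enc = nn.Linear(obs_dim, 128)
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(256, act_dim))

    def forward(self, instruction, observation):
        # Fuse the instruction embedding with the observation feature and
        # regress a low-level action (e.g. end-effector motion + gripper).
        z = torch.cat([self.lang_enc(instruction), self.obs_enc(observation)], dim=-1)
        return self.head(z)

policy = LanguageConditionedPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# One behavioral-cloning step on a batch of (instruction, observation, action)
# triples extracted from annotated demonstration videos (random tensors here).
instr = torch.randn(32, 64)
obs = torch.randn(32, 128)
expert_action = torch.randn(32, 7)
loss = nn.functional.mse_loss(policy(instr, obs), expert_action)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because a single set of weights is conditioned on the instruction, the same network can cover many tasks; the instruction embedding is what selects the behavior, which is why such a policy can plausibly transfer to novel but similar instructions.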
I will also present work that enables a robot to sequence these newly acquired manipulation skills for long-horizon task planning. Specifically, I will focus on work that uses the same human video demonstrations annotated with natural language to ground symbolic pre- and postconditions of manipulation skills in visual data. I will show how this enables closed-loop task planning involving a large variety of skills, objects and their symbolic states.
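The closed-loop planning idea can likewise be sketched in a few lines of toy code. All skill names, predicates, and the simple STRIPS-style search below are hypothetical placeholders; the point is only that perception grounds the symbolic state, each skill carries pre- and postconditions, and the plan is revised whenever the observed state disagrees with the prediction.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    name: str
    pre: frozenset                     # predicates that must hold before execution
    add: frozenset                     # predicates expected to become true
    delete: frozenset = frozenset()    # predicates expected to become false

SKILLS = [
    Skill("open_drawer", frozenset({"drawer_closed"}), frozenset({"drawer_open"}),
          frozenset({"drawer_closed"})),
    Skill("pick_cup", frozenset({"cup_on_table"}), frozenset({"cup_in_hand"}),
          frozenset({"cup_on_table"})),
    Skill("place_cup_in_drawer", frozenset({"cup_in_hand", "drawer_open"}),
          frozenset({"cup_in_drawer"}), frozenset({"cup_in_hand"})),
]

def plan(state, goal, max_len=5):
    """Breadth-first search over skill sequences in the symbolic state space."""
    frontier = [(frozenset(state), [])]
    for _ in range(max_len + 1):
        nxt = []
        for s, seq in frontier:
            if goal <= s:
                return seq
            for sk in SKILLS:
                if sk.pre <= s:
                    nxt.append(((s - sk.delete) | sk.add, seq + [sk]))
        frontier = nxt
    return None

def execute_closed_loop(goal, perceive, execute):
    """Re-plan whenever perception disagrees with the predicted symbolic state."""
    state = perceive()                  # predicates detected from camera images
    while not goal <= state:
        seq = plan(state, goal)
        if seq is None:
            raise RuntimeError("no plan reaches the goal from the current state")
        for sk in seq:
            execute(sk)
            state = perceive()          # re-ground predicates after each skill
            if not sk.add <= state:     # postcondition not observed -> re-plan
                break
```

The loop re-perceives after every skill, so an unexpected outcome (a dropped cup, a drawer that failed to open) simply triggers a new plan from the state that was actually observed.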
I will close the talk by discussing lessons learned and the interesting open questions that remain.
—
Jeannette Bohg is an Assistant Professor of Computer Science at Stanford University. She was a group leader at the Autonomous Motion Department (AMD) of the MPI for Intelligent Systems until September 2017. Before joining AMD in January 2012, Jeannette Bohg was a PhD student at the Division of Robotics, Perception and Learning (RPL) at KTH in Stockholm. In her thesis, she proposed novel methods towards multi-modal scene understanding for robotic grasping. She also studied at Chalmers in Gothenburg and at the Technical University in Dresden, where she received her Master's in Art and Technology and her Diploma in Computer Science, respectively.
Her research focuses on perception and learning for autonomous robotic manipulation and grasping. She is specifically interested in developing methods that are goal-directed, real-time and multi-modal such that they can provide meaningful feedback for execution and learning. Jeannette Bohg has received several Early Career and Best Paper awards, most notably the 2019 IEEE Robotics and Automation Society Early Career Award and the 2020 Robotics: Science and Systems Early Career Award.
—
About the Lecture: The Yata Memorial Lecture in Robotics is part of the School of Computer Science Distinguished Lecture Series. Teruko Yata was a postdoctoral fellow in the Robotics Institute from 2000 until her untimely death in 2002. After graduating from the University of Tsukuba, where she worked under the guidance of Prof. Yuta, she came to the United States. At Carnegie Mellon, she served as a postdoctoral fellow in the Robotics Institute for three years under Chuck Thorpe. Teruko’s accomplishments in the field of ultrasonic sensing were highly regarded and won her the Best Student Paper Award at the International Conference on Robotics and Automation in 1999. It was frequently noted, and we always remember, that “the quality of her work was exceeded only by her kindness and thoughtfulness as a friend.” Join us in paying tribute to our extraordinary colleague and friend through this unique and exciting lecture.