Performing Self-Scheduled Services in the Spare Time of a Mobile Autonomous Service Robot - Robotics Institute Carnegie Mellon University

PhD Thesis Proposal

Max Korein Carnegie Mellon University
Wednesday, December 9
2:00 pm to 12:00 am
Performing Self-Scheduled Services in the Spare Time of a Mobile Autonomous Service Robot

Event Location: NSH 3305

Abstract: Mobile autonomous service robots perform services requested by users at specific times. The goal of this thesis is to explore how a service robot can make use of the spare time between user requests. We propose that the robot perform self-scheduled services for which it receives reward from the users. Our proposed work consists of three parts: the robot learning a model of the reward function during execution; the robot planning the self-scheduled tasks while constrained by user requests; and the robot executing the plan.

Much of the existing work on scheduling services for mobile autonomous service robots has focused on optimal scheduling of user requests or information-gathering tasks to be performed as quickly as possible, rather than on making the most of the spare time between requests. The problem of maximizing the reward gathered in a given amount of time with given start and end locations resembles a planning problem called the orienteering problem. However, the variants of the orienteering problem that have been studied do not include the need for the robot to learn the reward functions, and do not cover the full range of situations a service robot may need to address.

We have created the concept of neighborhood-based planning, in which an agent creates plans over different subsets of the environment and chooses the best plan, and implemented the concept in the NBPlan algorithm. Using a simulation that we developed, we tested a greedy variant of NBPlan in environments with varying topological structures and reward distributions, and found that it outperformed a naive greedy algorithm, particularly in large environments or ones in which some locations have much higher reward than other nearby nodes. The reward functions in the simulation are determined by the simulated preferences and schedules of a user at each location. We have shown that the robot can learn the reward functions from sparser data by taking advantage of the factorable nature of this reward function, as well as by using additional observations, such as inferring users’ schedules from observations of when their doors are open or closed.
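The idea of planning over subsets and keeping the best plan can be sketched in a few lines of Python. This is not the NBPlan algorithm itself, whose details are not given here; it is a hypothetical illustration in which the neighborhoods are chosen by hand, travel time is straight-line distance, and the per-neighborhood planner is a simple reward-per-travel-time greedy rule. All names, coordinates, and rewards are assumptions made for the example.

```python
import math

def greedy_plan(start, end, candidates, rewards, pos, budget):
    """Greedy planner restricted to `candidates`: repeatedly visit the stop
    with the best reward per unit travel time that still leaves enough
    time to reach `end`."""
    path, here, remaining, total = [start], start, budget, 0.0
    unvisited = set(candidates)
    while True:
        feasible = [
            s for s in unvisited
            if math.dist(pos[here], pos[s]) + math.dist(pos[s], pos[end]) <= remaining
        ]
        if not feasible:
            break
        nxt = max(
            feasible,
            key=lambda s: rewards[s] / max(math.dist(pos[here], pos[s]), 1e-9),
        )
        remaining -= math.dist(pos[here], pos[nxt])
        total += rewards[nxt]
        unvisited.remove(nxt)
        path.append(nxt)
        here = nxt
    path.append(end)
    return path, total

def neighborhood_plan(start, end, neighborhoods, rewards, pos, budget):
    """Plan inside each neighborhood separately; keep the highest-reward plan."""
    plans = [greedy_plan(start, end, hood, rewards, pos, budget) for hood in neighborhoods]
    return max(plans, key=lambda plan: plan[1])

# Hypothetical corridor: two modest-reward offices to the west, one
# high-reward office to the east; the robot starts and ends at the origin.
pos = {"S": (0, 0), "E": (0, 0), "a": (-1, 0), "b": (-2, 0), "c": (3, 0)}
rewards = {"a": 6, "b": 5, "c": 15}

naive = greedy_plan("S", "E", {"a", "b", "c"}, rewards, pos, budget=6)
by_hood = neighborhood_plan("S", "E", [{"a", "b"}, {"c"}], rewards, pos, budget=6)
```

In this toy instance the naive greedy planner is drawn to the nearby offices "a" and "b" and then no longer has time to reach "c", collecting a reward of 11. Planning separately over the west and east neighborhoods and keeping the better plan yields the route through "c" with a reward of 15, which mirrors the case of a location with much higher reward than its neighbors.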

We propose to continue this work by integrating the planning, execution, and learning components of this problem into one cohesive system. The learning component of the system will propose subsets of the environment that it believes may contain the optimal plan. The planning algorithm will find a plan that maximizes the reward received while also gathering further observations that improve the robot’s model of the environment. We will perform a comprehensive evaluation of the complete system in a variety of simulated environments, measuring the total reward the robot achieves over repeated iterations as it gathers more data and learns about the environment. Finally, we will demonstrate the algorithm running on a real CoBot robot in an academic office building.

The expected contribution of our research is a complete system for planning, executing, and learning self-generated goals, constrained by user requests, for a mobile service robot. This includes an implementation and demonstration of the concept of neighborhood-based planning, and a system for learning an accurate model of the robot’s environment from the observations it makes during execution.

Committee: Manuela Veloso, Chair
Reid Simmons
Illah Nourbakhsh
Peter Stone, The University of Texas at Austin