Carnegie Mellon University
3:00 pm to 4:00 pm
NSH 3305
Abstract:
Finding dynamically feasible and safe global plans for multi-agent teams in real-world applications is enormously difficult because the decision branching factor, when considering all possible interactions across agents and the environment, renders the planning problem intractable. Humans, however, have great success in the multi-agent planning domain by using behaviors: practiced, coordinated responses for groups of agents that solve objectives specified online by a designated decision maker. In sports, for example, players run plays directed by a coach or team captain; in military applications, soldiers execute tactics as directed by a commander.
The execution of objectives in these scenarios hinges on a fast online motion planning strategy for coordinating the motions of multiple agents. The concept of behaviors accomplishes this by coordinating agents at the group level, i.e., planning for a tractable number of agents given system time and computational constraints, and by incorporating experience over accumulated examples to improve decision making.
This proposal focuses on the problem of motion planning for multi-agent systems using the concept of behaviors. In this work we formulate behaviors for a multi-robot system and present an online motion planning methodology that enables a human user or high-level planner to direct behaviors for groups of robots online. We additionally incorporate a learning methodology that enables a multi-robot system to leverage experience to augment behaviors for improved performance. Work completed thus far presents simulation and hardware experiments demonstrating a full system for online multi-robot control through behavior specification, illustrated in the context of a human user directing a multi-robot theatrical application. We also present simulation results demonstrating our learning approach for augmenting behavior specification. Proposed work includes methodology to transfer learned motions from one context to new scenarios, as well as to perform behaviors without direct user input. We additionally intend to show that the described approach can improve collision avoidance with dynamic obstacles and between multiple robot groups. Our final proposed contribution is a full hardware implementation of the multi-robot system, with experiments demonstrating learned behavior performance in response to online human commands in environments with static and dynamic obstacles.
Thesis Committee Members:
Nathan Michael, Chair
Howie Choset
Maxim Likhachev
Mac Schwager, Stanford University