11:00 am to 12:00 pm
Event Location: NSH 1305
Abstract: In many domains, teams of hundreds of agents must cooperatively plan to perform tasks in a complex, uncertain environment. This requires that each agent take into account teammates' states, observations, and actions when making decisions about its own actions. A naive approach to finding a good policy searches this large joint space, but this quickly becomes intractable, since the space grows more than exponentially with each additional agent. However, even when acting cooperatively, agents are often unaffected by most of what other agents are doing. An awareness of this structure can greatly alleviate the need to explore the full joint policy space. Specifically, some problems exhibit static sparsity, in which the effects of individual agents can be modeled nearly independently but have some "interactions" in which they share joint transition, reward, or observation functions with one or more teammates. Problems can also exhibit dynamic sparsity, in which, even with a large number of possible interactions between agents, the number of interactions that actually occur in any particular solution instance is small.
This thesis focuses on exploiting these two types of problem structure, along with distributed computation, to greatly improve planning efficiency. Static sparsity allows agent models to be compactly represented as a set of independent models and a set of partially-joint models that override these independent models. Dynamic sparsity allows planners to disregard irrelevant combinations of interactions by dynamically handling only those that arise during the planning process. Moreover, in the case of intelligent agents, computational power itself is often distributed across the team, with each agent providing a unit of computation. Thus, distributed planning approaches have access to computational resources that grow linearly with team size, making it easier to scale to very large teams.
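To make the idea of a compact, sparsity-aware team model concrete, the sketch below shows one possible way such a representation could look. It is an illustration only, not code from the thesis; every class and function name (AgentModel, Interaction, TeamModel) is hypothetical. The point is that storage and lookup cost grow with the number of partially-joint interactions rather than with the full joint space.

from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple

@dataclass
class AgentModel:
    """Independent model for a single agent (rewards only, for brevity)."""
    rewards: Dict[Tuple[str, str], float]  # (state, action) -> reward

@dataclass
class Interaction:
    """Partially-joint override shared by a small set of coupled agents."""
    agents: FrozenSet[int]
    joint_rewards: Dict[Tuple[Tuple[str, str], ...], float]  # joint (state, action) tuples -> reward

@dataclass
class TeamModel:
    """Team model = independent per-agent models + sparse partially-joint overrides."""
    agent_models: Dict[int, AgentModel]
    interactions: Dict[FrozenSet[int], Interaction] = field(default_factory=dict)

    def reward(self, joint: Dict[int, Tuple[str, str]]) -> float:
        """Sum independent per-agent rewards, then apply any matching joint overrides."""
        total = sum(self.agent_models[i].rewards.get(sa, 0.0) for i, sa in joint.items())
        for agents, interaction in self.interactions.items():
            if agents.issubset(joint):
                key = tuple(joint[i] for i in sorted(agents))
                total += interaction.joint_rewards.get(key, 0.0)
        return total

# Example: two agents modeled independently, plus one reward interaction at a door.
team = TeamModel(
    agent_models={0: AgentModel({("door", "open"): 1.0}),
                  1: AgentModel({("door", "wait"): 0.0})},
    interactions={frozenset({0, 1}): Interaction(
        agents=frozenset({0, 1}),
        joint_rewards={(("door", "open"), ("door", "wait")): 2.0})},
)
print(team.reward({0: ("door", "open"), 1: ("door", "wait")}))  # 1.0 + 0.0 + 2.0 = 3.0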
Taking advantage of these properties, we propose RDPCL and DIMS. RDPCL is a Dec-POMDP problem subclass that directly encodes sparsity, while DIMS is a planning framework in which agents plan iteratively and concurrently over independent local models that are then shaped by the expected observations, movements, and rewards of their teammates. We present several implementations of the DIMS framework: D-TREMOR, DPP, and MILP-TREMOR. D-TREMOR is DIMS applied to the DPCL subclass of Dec-POMDPs, DPP is DIMS applied to multiagent path planning, and MILP-TREMOR is DIMS encoded as an optimization problem. Experiments are conducted in a simplified urban search and rescue domain and a convoy planning domain. These experiments explore the strengths and weaknesses of the approach, demonstrating effective 100-agent Dec-POMDP planning in under an hour for problems that contain the aforementioned structure, while highlighting performance limitations under specific conditions.
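The following sketch roughly illustrates the iterate-and-shape structure of a DIMS-style planner as described above. It is an illustration only, not the thesis's algorithms; the three callables (solve_local, summarize, reshape) are hypothetical stand-ins for the actual single-agent solver, behavior summarization, and model-shaping components.

def dims_plan(local_models, solve_local, summarize, reshape, max_iters=10):
    """Schematic DIMS-style loop: plan locally, exchange summaries, reshape, repeat."""
    policies = {}
    for _ in range(max_iters):
        # 1. Each agent plans independently (conceptually in parallel) on its local model.
        for agent, model in local_models.items():
            policies[agent] = solve_local(model)
        # 2. Agents exchange summaries of expected behavior under their current policies
        #    (e.g. expected observations, movements, and rewards).
        summaries = {agent: summarize(model, policies[agent])
                     for agent, model in local_models.items()}
        # 3. Each agent shapes its local model with teammates' summaries, so only the
        #    interactions that actually arise are modeled; stop at a fixed point.
        changed = False
        for agent, model in local_models.items():
            others = {b: s for b, s in summaries.items() if b != agent}
            changed |= reshape(model, others)
        if not changed:
            break
    return policies

Because each agent only ever solves its own local model, the per-iteration work distributes naturally across the team, matching the point above that computational resources grow linearly with team size.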
Committee: Katia Sycara, Co-chair
Paul Scerri, Co-chair
J. Andrew Bagnell
Edmund H. Durfee, University of Michigan
Pradeep Varakantham, Singapore Management University