Heuristic Search Based Planning by Minimizing Anticipated Search Efforts - Robotics Institute Carnegie Mellon University
Loading Events

PhD Thesis Proposal

May

25
Tue
Ishani Chatterjee Robotics Institute,
Carnegie Mellon University
Tuesday, May 25
10:00 am to 11:00 am
Heuristic Search Based Planning by Minimizing Anticipated Search Efforts

Abstract:
Robot planning problems in dynamic environments, such as navigation among pedestrians, driving at high-speed on densely populated roads, and manipulation for collaborative tasks alongside humans, necessitate efficient planning. Bounded-suboptimal heuristic search algorithms are a popular alternative to optimal heuristic search algorithms that compromise solution quality for computation speed. Specifically, these searches aim to find a solution that can be computed faster than the optimal solution, while guaranteeing that its cost is bounded within a specified factor of the optimal cost (bounded-suboptimality). Currently, most bounded-suboptimal heuristic search algorithms speedup planning by guiding the search using cost-to-goal estimates. Cost-to-goal estimates are not necessarily correlated with the search effort required to reach the goal. The key idea of this thesis is to explicitly reason about anticipated search efforts required to reach the goal instead of cost-to-goal estimates, and achieve higher speedup by guiding the search along solutions that minimize these anticipated search efforts while ensuring bounded-suboptimality of the computed solutions. Also, bounded-suboptimal heuristic search algorithms have been largely investigated in the context of deterministic planning. In this thesis, we use the key idea of this thesis to speed up planning while ensuring bounded suboptimality in the context of deterministic as well as probabilistic planning.

To this end, our first contribution is to speed up deterministic robot planning problems formulated as bounded-suboptimal heuristic search. Weighted A* (wA*) search is a popular bounded-suboptimal heuristic search algorithm. We investigate the problem of computing heuristics that explicitly aim to reduce the search efforts of wA*. For heuristic computation, it is common to solve a simpler planning problem in a relaxed space formed by relaxing some constraints in the original search space. We introduce conservative heuristics—novel heuristics that anticipate search efforts required to reach the goal. We first introduce the notion of a conservative path in the relaxed space, whose existence guarantees the existence of a feasible path in the original space. Conservative heuristics are computed such that if a conservative path exists in the relaxed space, then the search can follow the heuristic gradient to find a feasible path in the original space, while expanding only those states that appear on this path. We propose an algorithm to compute conservative heuristics.

We theoretically derive the suboptimality bound and analyze other properties of using conservative heuristics in wA*. Further, we evaluate conservative heuristics using simulated experiments in humanoid footstep planning and planning for a UAV, and a real-world experiment in footstep planning for a NAO robot. Results show significant benefits of conservative heuristics in terms of computation speed, compared to baseline heuristics that estimate cost-to-goal.

Our second contribution is to speed up a particular class of planning under uncertainty problems. Many real-world planning problems involve robots having to plan in partially known environments. This frequently requires planning under uncertainty over missing information about the environment. Unfortunately, the computational expense of such planning often precludes its scalability to real-world problems. The Probabilistic Planning with Clear Preferences (PPCP) framework computes optimal policies in a scalable way for a subset of such planning problems wherein there exist clear preferences over the actual values of missing information. It does so by running a series of fast, deterministic, A*-like searches to construct and refine a policy, guaranteeing optimality under certain conditions. Aligning with the key idea of this thesis, we anticipate II that while running a series of A*-like searches to compute a policy, search efforts are correlated with the amount of missing information each path relies upon to reach the goal. We observe that by exploiting this correlation and minimizing the amount of missing information each path relies upon, marginally suboptimal policies can be computed significantly faster. To that end, we introduce Fast-PPCP, a novel planning algorithm that computes a provably bounded-suboptimal policy using significantly lesser number of searches than that required to find an optimal policy, for the same subset of problems that can be solved by PPCP.

We present the theoretical analysis of Fast-PPCP. We also present experimental analysis in the domain of robot navigation in partially unknown environments, showing significant benefits compared to PPCP and other popular baseline methods. Moreover, to evaluate Fast-PPCP in discounted-reward problems such as RockSample, we also formulate a transformation that converts discounted-reward problems into discounted-cost problems to which Fast-PPCP can be applied.

In the remainder of this thesis, we propose to adapt Fast-PPCP in the context of robot navigation among pedestrians and off-road navigation in partially known terrains. As an optional goal, we plan to demonstrate Fast-PPCP on a SegBot in an off-road terrain. For robot navigation among pedestrians, it is often beneficial to learn specific skills to interact with the world, such as learning how to move in order to encourage pedestrians to clear a passage that they are blocking. It is potentially challenging though to reason how to utilize a learnt skill as an action in the Fast-PPCP framework, and plan efficiently while accounting for the probability of failure in execution of this skill. Furthermore, Fast-PPCP makes a potentially unrealistic assumption of perfect sensing—an assumption that there is no noise in sensing and any sensing operation returns the true state of the variable being sensed. We will make algorithmic modifications in Fast-PPCP to support belief space planning problems that relax this assumption. We will evaluate the modified algorithm in the domain of off-road vehicle navigation in partially known terrains.

More Information

Thesis Committee Members:
Maxim Likhachev, Co-chair
Manuela Veloso, Co-chair
Stephen Smith
Shlomo Zilberstein, University of Massachusetts Amherst