Strategy assessment for solving rich physical problems
Abstract
Every day, humans face challenges that require ingenious solutions and making do with limited resources. Examples abound, from opening a stuck jar to refloating a sunken ship to improvising a cooling system or an air or water filtration system. Coming up with a good strategy often calls for an understanding of rich physics spanning multiple domains. While humans excel at solving such problems, they remain a challenge for robots. Arguably, one reason is the lack of a rich physics engine that can be quickly and flexibly instantiated; another is the combinatorial complexity of identifying which physical concepts are relevant to a particular situation. This work is an exploratory investigation towards addressing these limitations.
In particular, we propose an "intuitive physics reasoner" that assesses strategies proposed to solve a given problem. We focus on problems where physics takes precedence over geometric considerations. We believe it is useful to be able to determine quickly whether a strategy is worth considering before allocating further resources to planning with it.
As a major design consideration, we required that the input to our system (i.e., a strategy) be as simple and convenient as possible. We therefore take strategies expressed in natural language as input. This not only allows humans to easily provide candidate strategies, but also provides a convenient bridge to recent developments in generative AI, notably large language models (LLMs).
We build a physics knowledge library, with an explicit encoding of knowledge, such that it facilitates targeted knowledge retrieval and the building of knowledge graphs with the physical concepts that are relevant to a given situation. Such graphs form the substrate for subsequent reasoning steps, such as forward simulations and parameter optimization.
In parallel, we leverage the versatility of LLMs to translate a strategy described in natural language into a computationally usable form: a graph similar to the ones described above.
The reasoner can then verify that a strategy is consistent with its prior knowledge of physics, and that inputs, states, and outputs are within set limits. It is also capable of finding optimal parameters according to a given criterion.
We cast the verification of the validity of the physics (at an abstract level) as a graph matching or correspondence problem. A numerical reasoning step then leverages the graph structure to propagate information (forward when simulating, or backward when computing Jacobians) and to verify constraint satisfaction.
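As a rough illustration of these two steps, the sketch below reduces the abstract check to testing that every edge of a strategy graph has a counterpart in a tiny knowledge library, then propagates numbers forward through the graph and checks limits. The library contents, node names, formulas, and limits are all hypothetical and are not taken from the thesis, whose matching procedure is presumably more general than exact edge lookup.

```python
from graphlib import TopologicalSorter

# Hypothetical knowledge library: allowed (cause, effect) relations.
KNOWN_RELATIONS = {
    ("heat_input", "temperature"),
    ("temperature", "lid_expansion_um"),
}

def unsupported_edges(strategy_edges):
    """Return the strategy edges that have no counterpart in the library."""
    return [edge for edge in strategy_edges if edge not in KNOWN_RELATIONS]

def forward_simulate(edges, functions, inputs):
    """Propagate values through the graph, predecessors first."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(src, set())
        preds.setdefault(dst, set()).add(src)
    values = dict(inputs)
    for node in TopologicalSorter(preds).static_order():
        if node not in values:
            # Each node function takes its predecessors as keyword arguments.
            args = {p: values[p] for p in preds[node]}
            values[node] = functions[node](**args)
    return values

def violated_limits(values, limits):
    """Return the quantities that fall outside their (low, high) limits."""
    return {
        name: value
        for name, value in values.items()
        if name in limits and not (limits[name][0] <= value <= limits[name][1])
    }

# Hypothetical strategy: heat a jar lid so that it expands and loosens.
strategy = [("heat_input", "temperature"), ("temperature", "lid_expansion_um")]
functions = {
    "temperature": lambda heat_input: 20 + 0.5 * heat_input,
    "lid_expansion_um": lambda temperature: 2.0 * (temperature - 20),
}
values = forward_simulate(strategy, functions, {"heat_input": 100})
problems = violated_limits(values, {"temperature": (0, 60)})
```

Here the strategy passes the abstract check, since `unsupported_edges(strategy)` is empty, but fails numerically because the simulated temperature exceeds its assumed upper limit.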
This ultimately allows the reasoner to assess the feasibility of a strategy, both on an abstract physics front and numerically, with the option of finding optimal quantities when not specified by the natural language strategy.
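To make the idea of finding a quantity left unspecified by the strategy concrete, here is a minimal sketch that picks the cheapest feasible value by exhaustive search over a grid of candidates. Grid search is used only for simplicity; the mention of Jacobians suggests the thesis relies on a gradient-based method, which this sketch does not reproduce. The model, limits, and cost function are hypothetical.

```python
def simulate(heat_input):
    """Hypothetical forward model for heating a jar lid."""
    temperature = 20 + 0.5 * heat_input
    expansion_um = 2.0 * (temperature - 20)
    return {"temperature": temperature, "expansion_um": expansion_um}

def is_feasible(state):
    """Assumed limits: stay at or below 60 degrees, expand by at least 60 um."""
    return state["temperature"] <= 60 and state["expansion_um"] >= 60

def best_parameter(candidates, cost):
    """Return the feasible candidate with the lowest cost, or None."""
    feasible = [c for c in candidates if is_feasible(simulate(c))]
    return min(feasible, key=cost) if feasible else None

# Minimise the heat input itself over a coarse grid of candidate values.
best = best_parameter(range(0, 201, 10), cost=lambda h: h)
```

A result of `None` signals that no candidate satisfies the limits, which in this toy setting amounts to judging the strategy numerically infeasible.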
We demonstrate the system's capability on a number of problems involving a diverse set of physical and chemical concepts, with favorable outcomes in terms of assessment accuracy.
BibTeX
@mastersthesis{Boussema-2023-137885,
author = {Chiheb Boussema},
title = {Strategy assessment for solving rich physical problems},
year = {2023},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-23-68},
}