Both Elements Are Critical for Applications Such As Self-Driving Cars
PITTSBURGH—Researchers at Carnegie Mellon University have developed a new metric for evaluating how well self-driving cars respond to changing road conditions and traffic, making it possible for the first time to compare perception systems for both accuracy and reaction time.
Mengtian Li, a Ph.D. student in CMU’s Robotics Institute, said academic researchers tend to develop sophisticated algorithms that can accurately identify hazards, but may demand a lot of computation time. Industry engineers, by contrast, tend to prefer simple, less accurate algorithms that are fast and require less computation, so the vehicle can respond to hazards more quickly.
This tradeoff is a problem not only for self-driving cars, but also for any system that requires real-time perception of a dynamic world, such as autonomous drones and augmented reality systems. Yet until now, there’s been no systematic measure that balances accuracy and latency — the delay between when an event occurs and when the perception system recognizes that event. This lack of an appropriate metric as made it difficult to compare competing systems.
The new metric, called streaming perception accuracy, was developed by Li, together with Deva Ramanan, associate professor in the Robotics Institute and principal scientist at Argo AI, and Yu-Xiong Wang, assistant professor at the University of Illinois at Urbana-Champaign. They presented it last month at the virtual European Conference on Computer Vision, where it received a best paper honorable mention award.
Streaming perception accuracy is measured by comparing the output of the perception system at each moment with the ground truth state-of-the-world.
“By the time you’ve finished processing inputs from sensors, the world has already changed,” Li explained, noting that the car has traveled some distance while the processing occurs.
“The ability to measure streaming perception offers a new perspective on existing perception systems,” Ramanan said. Systems that perform well according to classic measures of performance may perform quite poorly on streaming perception. Optimizing such systems using the newly introduced metric can make them far more reactive.
One insight from the team’s research is that the solution isn’t necessarily for the perception system to run faster, but to occasionally take a well-timed pause. Skipping the processing of some frames prevents the system from falling farther and farther behind real-time events, Ramanan added.
Another insight is to add forecasting methods to the perception processing. Just as a batter in baseball swings at where they think the ball is going to be — not where it is — a vehicle can anticipate some movements by other vehicles and pedestrians. The team’s streaming perception measurements showed that the extra computation necessary for making these forecasts doesn’t significantly harm accuracy or latency.
The CMU Argo AI Center for Autonomous Vehicle Research, directed by Ramanan, supported this research, as did the Defense Advanced Research Projects Agency.