June 21, 2024    Mallory Lindahl

A research group at Carnegie Mellon University’s Robotics Institute will soon host the CMU Vision-Language-Autonomy Challenge, bringing researchers together at the intersection of computer vision, natural language understanding, and navigation autonomy.

The challenge aims to advance computer vision and AI research in real-world systems. The team has created an award-winning navigation autonomy system over the last decade, with every piece developed from scratch, including state estimation, collision avoidance, and path planning.

“We developed the system not just for ourselves,” said Ji Zhang, faculty member at the Robotics Institute. “The computer vision and AI communities are doing very advanced work. We have seen a strong need for reliable, easy-to-use autonomy systems to support their work. They work at a high level. We support them from the bottom.”

The provided robot platform has a 3D scanning lidar (light detection and ranging) sensor, a 360-degree camera, and an onboard base autonomy system. Participants will be asked to develop an AI module that can process the lidar and camera data, respond to a set of language inputs by understanding the scene information, and interface with the base autonomy system to guide navigation. The challenge is both practical and meaningful, allowing participants to address real-world issues through their work.

The CMU team also provides several resources to help participants prepare for the challenge, including two open-source simulation systems based on Unity scenes and AI Habitat + Matterport3D photorealistic scenes, datasets from the real robot, and a novel large-scale object-referential language dataset containing 6.2K scenes and 7.5M+ statements.

“In the end, we are helping the communities realize Embodied AI on real robots, moving their studies from datasets and simulations to real-world deployments,” said Wenshan Wang, faculty member at the Robotics Institute.

The challenge will begin as simulation-only in 2024 and will move on to real robot deployment starting in 2025. A workshop will be held this October at IROS 2024 to showcase results and introduce the real robot challenge for the following year. The workshop will also include talks from leading researchers in computer vision, natural language understanding, and robotics.

For more information, visit the workshop and challenge website.

For more information: Aaron Aupperlee | 412-268-9068 | aaupperlee@cmu.edu