Moving Lights and Cameras for Better 3D Perception of Indoor Scenes
Abstract
Decades of research in computer vision have highlighted the importance of active sensing, where an agent controls the parameters of its sensors to improve perception. Research on active perception in robotic manipulation has demonstrated many novel and robust sensing strategies that combine RGB and RGBD cameras with a variety of tactile, proximity, and spectroscopic sensors, resulting in ever-improving representations and understanding of the world around the agent. In this work we explore sensor configurations, sensor positioning, and robot workspace illumination to improve 3D perception of objects in tabletop-scale scenes. We divide the problem into three parts: moving camera-based sensors within a robot's workspace, moving illumination sources around the workspace, and controlling both illumination and camera movement. We show that a robot-mounted ensemble of camera-based sensors (namely RGB, RGBD, and tactile) can help visually servo a manipulator and accurately localize contacts on objects using vision and touch. We show that known directional illumination can be very effective for accurately measuring objects in the workspace, and we demonstrate this with a highly accurate photometric stereo stage scaled to a robot's workspace. Finally, we show that multi-view, multi-illumination images captured with a custom multi-flash camera system can be effective for reconstructing tabletop-scale scenes, synthesizing novel views of them, and relighting them. In addition to the capture system, we also demonstrate efficient algorithms and representations for 3D perception of small scenes.
BibTeX
@phdthesis{Chaudhury-2024-143995,
author = {Arkadeep Narayan Chaudhury},
title = {Moving Lights and Cameras for Better 3D Perception of Indoor Scenes},
year = {2024},
month = {October},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-24-71},
keywords = {Perception systems, tactile sensing, photometric stereo, neural 3D representations},
}