Abstract:
The presence of occlusions poses significant challenges in object understanding. For example, objects in the scene may be partially occluded by other static or dynamic objects, truncated by the camera's field of view, or self-occluded, i.e., the camera-facing side of the object is hidden by its opposing side. We present a holistic approach to handle such occlusions for amodal 3D shape reconstruction. The approach starts by learning occlusion categories with human supervision. These learned categories are then exploited in a novel framework that uses a mixed representation (keypoints, segmentations, and a shape basis) for objects to automatically generate a large, physically realistic dataset of occlusions using freely available time-lapse imagery from traffic cameras. This dataset provides strong 2D and 3D self-supervision to a network that jointly learns amodal 2D keypoints and segmentations, which are then optimized to reconstruct 3D shapes under constraints provided by the occlusion categories. Our system demonstrates significant improvements in amodal 3D reconstruction of heavily occluded objects captured at any time of day from traffic, hand-held, and in-vehicle cameras, thus enhancing the potential of smart cities to utilize outdoor cameras for effective urban planning.
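To make the final optimization stage concrete, below is a minimal sketch (not the authors' code) of one standard way to fit shape-basis coefficients so that the projected 3D keypoints match predicted amodal 2D keypoints, assuming a weak-perspective camera; all names (mean_shape, basis, visible, etc.) are hypothetical placeholders, and the simple down-weighting of occluded points stands in for the richer occlusion-category constraints described above.

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

K, D = 12, 5                           # hypothetical: 12 keypoints, 5 basis shapes
rng = np.random.default_rng(0)
mean_shape = rng.normal(size=(K, 3))   # mean 3D keypoint layout (placeholder)
basis = rng.normal(size=(D, K, 3))     # learned deformation basis (placeholder)
kp2d = rng.normal(size=(K, 2))         # network-predicted amodal 2D keypoints
visible = np.ones(K, dtype=bool)       # visibility mask from the occlusion category
visible[8:] = False                    # e.g., self-occluded keypoints

def project(shape3d, scale, rotvec, trans):
    """Weak-perspective projection: rotate, drop depth, scale, translate."""
    R = Rotation.from_rotvec(rotvec).as_matrix()
    return scale * (shape3d @ R.T)[:, :2] + trans

def residuals(params):
    coeffs, scale = params[:D], params[D]
    rotvec, trans = params[D + 1:D + 4], params[D + 4:D + 6]
    shape3d = mean_shape + np.tensordot(coeffs, basis, axes=1)  # (K, 3)
    err = project(shape3d, scale, rotvec, trans) - kp2d
    # Down-weight occluded keypoints: a crude stand-in for the paper's
    # occlusion-category constraints on the optimization.
    w = np.where(visible, 1.0, 0.3)[:, None]
    return (w * err).ravel()

x0 = np.concatenate([np.zeros(D), [1.0], np.zeros(3), np.zeros(2)])
fit = least_squares(residuals, x0)
best_shape = mean_shape + np.tensordot(fit.x[:D], basis, axes=1)
print("reprojection cost after optimization:", fit.cost)

In practice the basis would come from the learned mixed representation rather than random placeholders, but the structure of the fit (low-dimensional coefficients, a camera model, and occlusion-aware weighting) is the same.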
Committee:
Prof. Srinivasa G. Narasimhan (advisor)
Prof. Shubham Tulsiani
Yufei Ye
Time: 12:30 PM to 2:00 PM (ET)
Location: GHC 8102
Passcode: 843349