Student Talks
Carnegie Mellon University
MSR Thesis Talk: Haochen Wang
Title: Audiovisual ontology and robust representations via cross-modal fusion
Abstract: The shrill of an ambulance siren and flashing lights, the hum of an accelerating car: important events often come to us simultaneously through sight and sound. We first consider the problem of identifying these events from raw, unlabeled audiovisual data of agents interacting with [...]
Social Navigation with Pedestrian Groups
Abstract: Autonomous navigation in human crowds (i.e., social navigation) presents several challenges: the robot often needs to rely on its noisy sensors to identify and localize pedestrians in the crowd; it needs to plan efficient paths to reach its goals; and it needs to do so in a safe and socially appropriate manner. In [...]
MSR Thesis Talk: Viraj Parimi
Title: T-HTN: Timeline Based HTN Planning for Multi-Agent Robots
Abstract: Planning in mission-critical systems like deep-space habitats with onboard robotic systems must be robust to unforeseen circumstances. Such systems are expected to complete a set of goals with different deadlines each day for routine maintenance while also accounting for emergencies. With the presence of [...]
Robust and Scalable Perception For Autonomy
Abstract: Autonomous mobile robots have the potential to drastically improve the quality of our daily life. For example, self-driving vehicles could make transportation safer and more affordable. To navigate complex environments, such robots need a perception system that translates raw sensory data to high-level understanding. This thesis focuses on two fundamental challenges in learning such [...]
MSR Thesis Talk: Yiming Zuo
Title: Towards Self-supervised Object Discovery and Tracking
Abstract: Object discovery and multiple object tracking (MOT) are two highly interrelated tasks that are known to be fundamental problems in computer vision, and are crucial for video understanding. Most existing methods rely on supervised training with human annotations, which is laborious and expensive. In this thesis, [...]
MSR Thesis Talk: Qiao Gu
Title: Towards Object-generic 6D Pose Estimation
Abstract: Pose estimation is a basic module in many robot manipulation pipelines. Estimating the pose of objects in the environment can be useful for grasping, motion planning, or manipulation. However, current state-of-the-art methods for pose estimation either rely on large annotated training sets or simulated data. Further, the long [...]
MSR Thesis Talk: Divam Gupta
Title: End-to-End Deep Stereo Layout Estimation
Abstract: Accurate layout estimation is crucial for planning and navigation in robotics applications, such as self-driving. In this paper, we introduce the Stereo Bird's Eye View Network (SBEVNet), a novel supervised end-to-end framework for estimation of bird's eye view layout from a pair of stereo images. Although our network [...]
Michael Tasota – MSR Thesis Talk
Title: Design of a Multimodal System for Social Emotional Learning in Early Childhood Classrooms
Abstract: As the prevalence of mobile and touch-based devices continues to expand in society, so too does its impact on young children. With educational technologies also on the rise, young children benefit most from those technologies that are designed to [...]
MSR Thesis Talk: Aaron Huang
Title: End-to-End Methods for Autonomous Driving in Simulation
Abstract: Fully autonomous driving is considered one of the grand challenges of modern technology, and a variety of approaches have emerged for creating and evaluating autonomous driving agents. The self-driving industry typically adopts a modular software architecture and uses large fleets of autonomous vehicles for data [...]
MSR Thesis Talk
Title: Retrieval-based Novel Activity Detection in Untrimmed Videos
Abstract: Accurately detecting activities in untrimmed videos is a challenging task, as systems need to handle variance in object scales, multiple viewpoints, and multiple types of activities. Furthermore, in a real-world scenario, activity detectors are often required to detect novel kinds of activities when the need [...]