VASC Seminar
Point Cloud Registration with or without Learning
Abstract: I will be presenting two of our recent works on 3D point cloud registration. A scene flow method for non-rigid registration: I will discuss our current method to recover scene flow from point clouds. Scene flow is the three-dimensional (3D) motion field of a scene, and it provides information about the spatial arrangement [...]
Propelling Robot Manipulation of Unknown Objects using Learned Object Centric Models
Abstract: There is a growing interest in using data-driven methods to scale up manipulation capabilities of robots for handling a large variety of objects. Many of these methods are oblivious to the notion of objects and they learn monolithic policies from the whole scene in image space. As a result, they don’t generalize well to [...]
When and Why Does Contrastive Learning Work?
Abstract: Contrastive learning organizes data by pulling together related items and pushing apart everything else. These methods have become very popular but it's still not entirely clear when and why they work. I will share two ideas from our recent work. First, I will argue that contrastive learning is really about learning to forget. Different [...]
Anticipating the Future: forecasting the dynamics in multiple levels of abstraction
Abstract: A key navigational capability for autonomous agents is to predict the future locations, actions, and behaviors of other agents in the environment. This is particularly crucial for safety in the realm of autonomous vehicles and robots. However, many current approaches to navigation and control assume perfect perception and knowledge of the environment, even though [...]
Learning to Perceive Videos for Embodiment
Abstract: Video understanding has achieved tremendous success in computer vision tasks, such as action recognition, visual tracking, and visual representation learning. Recently, this success has gradually carried over to helping robots and embodied agents interact with their environments. In this talk, I am going to introduce our recent efforts on extracting self-supervisory signals and [...]
Open Challenges in Sign Language Translation & Production
Abstract: Machine translation and computer vision have greatly benefited from advances in deep learning. Large and diverse amounts of textual and visual data have been used to train neural networks, whether in a supervised or self-supervised manner. Nevertheless, the convergence of the two fields in sign language translation and production still poses [...]
3D Recognition with self-supervised learning and generic architectures
Abstract: Supervised learning relies on manual labeling which scales poorly with the number of tasks and data. Manual labeling is especially cumbersome for 3D recognition tasks such as detection and segmentation and thus most 3D datasets are surprisingly small compared to image or video datasets. 3D recognition methods are also fragmented based on the type [...]
Rapid Adaptation for Robot Learning
Abstract: How can we train a robot to generalize to diverse environments? This question underscores the holy grail of robot learning research because it is difficult to supervise an agent for all possible situations it can encounter in the future. We posit that the only way to guarantee such generalization is to continually learn and [...]
Humans, hands, and horses: 3D reconstruction of articulated object categories using strong, weak, and self-supervision
Abstract: Reconstructing 3D objects from a single 2D image is a task that humans perform effortlessly, yet computer vision so far has only robustly solved 3D face reconstruction. In this talk we will see how we can extend the scope of monocular 3D reconstruction to more challenging, articulated categories such as human bodies, hands and [...]
Looking behind the Seen in Order to Anticipate
Abstract: Despite significant recent progress in computer vision and machine learning, personalized autonomous agents often still don’t participate robustly and safely across tasks in our environment. We think this is largely because they lack an ability to anticipate, which in turn is due to a missing understanding about what is happening behind the seen, i.e., [...]
The Clinician’s AI Partner: Augmenting Clinician Capabilities Across the Spectrum of Healthcare
Abstract: Clinicians often work under highly demanding conditions to deliver complex care to patients. As our aging population grows and care becomes increasingly complex, physicians and nurses are now also experiencing feelings of burnout at unprecedented levels. In this talk, I will discuss possibilities for computer vision to function as a partner to clinicians, and to augment their capabilities, across [...]
Reliable and Accessible Visual Recognition
Abstract: As visual recognition models are developed across diverse applications, we need the ability to reliably deploy our systems in a variety of environments. At the same time, visual models tend to be trained and evaluated on a static set of curated and annotated data which only represents a subset of the world. In this [...]
Fake It Till You Make It: Face analysis in the wild using synthetic data alone
Abstract: In this seminar I will demonstrate how synthetic data alone can be used to perform face-related computer vision in the wild. The community has long enjoyed the benefits of synthesizing training data with graphics, but the domain gap between real and synthetic data has remained a problem, especially for human faces. Researchers have tried [...]
Leveraging StyleGAN for Image Editing and Manipulation
Abstract: StyleGAN has recently been established as the state-of-the-art unconditional generator, synthesizing images of phenomenal realism and fidelity, particularly for human faces. With its rich semantic space, many works have attempted to understand and control StyleGAN’s latent representations with the goal of performing image manipulations. To perform manipulations on real images, however, one must learn to [...]
Next-Gen Video Communication
Abstract: Video communication connects our world. It is necessary in conducting business, educational and personal activities across different geographical locations. However, the quality of an average user’s video communication is dramatically worse than that of professionally created videos in news broadcasts, talk shows, and on YouTube. This is because professionally created videos are often captured with [...]