VASC Seminar
Attentive Human Action Recognition
Abstract: Enabling computers to recognize human actions in video has the potential to revolutionize many areas that benefit society such as clinical diagnosis, human-computer interaction, and social robotics. Human action recognition, however, is tremendously challenging for computers due to the subtlety of human actions and the complexity of video data. Critical to the success of [...]
Temporal Modeling and Data Synthesis for Visual Understanding
Abstract: In this talk, I will present two recent pieces of work on leveraging temporal information and synthetic data to enhance video and image understanding. In the first part, I will introduce a progressive learning framework, Spatio-TEmporal Progressive (STEP), for action detection in videos. STEP is able to more effectively make use of longer temporal information, [...]
VR facial animation via multiview image translation
Abstract: A key promise of Virtual Reality (VR) is the possibility of remote social interaction that is more immersive than any prior telecommunication media. However, existing social VR experiences are mediated by inauthentic digital representations of the user (i.e., stylized avatars). These stylized representations have limited the adoption of social VR applications in precisely those [...]
Neural Volumes: Learning Dynamic Renderable Volumes from Images
Abstract: Modeling and rendering of dynamic scenes is challenging, as natural scenes often contain complex phenomena such as thin structures, evolving topology, translucency, scattering, occlusion, and biological motion. Mesh-based reconstruction and tracking often fail in these cases, and other approaches (e.g., light field video) typically rely on constrained viewing conditions, which limit interactivity. We [...]
Towards Lightweight Real-time Hand Reconstruction in Challenging
Abstract: Humans naturally use their hands to interact and communicate with their surroundings. Reconstructing these complex and dexterous hand interactions enables sign-language recognition and translation, better assistive robots, and more immersive human-computer interaction (e.g., for AR and VR). To make hand reconstruction usable for the aforementioned applications and by a wide set of users, the [...]
Hybrid Methods for the Integration of Heterogeneous Multimodal Biomedical Data
Abstract: The prevalence of smartphones and wearable devices for health monitoring and widespread use of electronic health records have led to a surge in heterogeneous multimodal healthcare data, collected at an unprecedented scale. My research focuses on developing machine learning techniques that learn salient representations of multimodal, heterogeneous data for biomedical predictive models. The first [...]
Self-Driving Cars & AI: Transforming our Cities and our Lives
Abstract: Recent algorithmic and hardware improvements have resulted in several success stories in the field of Artificial Intelligence (AI) that impact our daily lives. However, despite its ubiquity, AI is only just starting to make advances in what may arguably have the largest societal impact thus far, the nascent field of autonomous driving. At Uber ATG, [...]
Go, fastMRI, and Minecraft: Exploring the limits of AI
Abstract: The application of AI across various domains demonstrates both the promise of existing techniques and their limitations. In this talk, I explore three recent projects and how they shed light on the progress of AI and the challenges to come. These projects include ELF OpenGo (a reimplementation of AlphaZero), fastMRI for reducing the time [...]
Towards Weakly-Supervised Visual Understanding
Abstract: Learning with weak supervision and self-supervision has recently emerged as a compelling tool for leveraging vast amounts of unlabeled or partially labeled data. In this talk, I will present some of the latest advances in weakly-supervised visual scene understanding from NVIDIA. Specifically, I will summarize and discuss some challenges and potential solutions in weakly-supervised learning, and introduce our [...]
Imaging without focusing: A computational approach to miniaturizing cameras
Abstract: Miniaturization of cameras is key to enabling new applications in areas such as connected devices, wearables, implantable medical devices, in vivo microscopy, and micro-robotics. Recently, lenses have been identified as the main bottleneck in the miniaturization of cameras. Standard smaller lens-system camera modules have a thickness of about 10 mm or more, and reducing the size [...]