Some New Designs of Convolutional and Recurrent Networks
Abstract: Convolutional networks (CNNs) and recurrent networks have driven the great engineering success of deep learning in recent years. However, as academics, we still wonder whether they are indeed the ultimate models of choice. In particular, CNNs seem unable to characterize predictive uncertainty, and they depend heavily on small filters applied to small, rectangular neighborhoods. On [...]
Improving Multi-fingered Robot Manipulation by Unifying Learning and Planning
Abstract: Multi-fingered hands offer autonomous robots increased dexterity, versatility, and stability over simple two-fingered grippers. Naturally, this increased ability comes with increased complexity in planning and executing manipulation actions. As such, I propose combining model-based planning with learned components to improve over purely data-driven or purely model-based approaches to manipulation. This talk examines multi-fingered autonomous [...]
Language and Interaction in Minecraft
Abstract: I will discuss a research program aimed at building a Minecraft assistant, in order to facilitate the study of agents that can complete tasks specified by dialogue, and eventually, to learn from dialogue interactions. I will describe the tools and platform we have built allowing players to interact with the agents and to record those interactions, and [...]
Carnegie Mellon University
Scaling Up Deep Learning with Model and Algorithm Awareness
Abstract: In recent years, the pace of innovation in the field of deep learning has accelerated. To cope with the sheer computational complexity of training large ML models on large datasets, researchers in the systems and ML communities have created software systems that parallelize training algorithms over multiple CPUs or GPUs (multi-device parallelism), or even [...]
Design, Modeling and Control of a Robot Bat: From Bio-inspiration to Engineering Solutions
Abstract: In this talk, I will describe our recent work building a biologically-inspired bat robot. Bats have a complex skeletal morphology, with both ball-and-socket and revolute joints that interconnect the bones and muscles to create a musculoskeletal system with over 40 degrees of freedom, some of which are passive. Replicating this biological system in a [...]
Attentive Human Action Recognition
Abstract: Enabling computers to recognize human actions in video has the potential to revolutionize many areas that benefit society, such as clinical diagnosis, human-computer interaction, and social robotics. Human action recognition, however, is tremendously challenging for computers due to the subtlety of human actions and the complexity of video data. Critical to the success of [...]
Carnegie Mellon University
Underwater Localization and Mapping with Imaging Sonar
Abstract: Acoustic imaging sonars have been used for a variety of tasks intended to increase the autonomous capabilities of underwater vehicles. Among the most critical tasks of any autonomous vehicle are localization and mapping, which are the focus of this work. The difficulties presented by the imaging sonar sensor have led many previous attempts at [...]
Deep Learning for Robotics
Abstract: Programming robots remains notoriously difficult. Equipping robots with the ability to learn would bypass the need for what otherwise often ends up being time-consuming, task-specific programming. This talk will describe recent progress in deep reinforcement learning (robots learning through their own trial and error), in apprenticeship learning (robots learning from observing people), and [...]
Temporal Modeling and Data Synthesis for Visual Understanding
Abstract: In this talk, I will present two recent pieces of work on leveraging temporal information and synthetic data to enhance video and image understanding. In the first part, I will introduce a progressive learning framework, Spatio-TEmporal Progressive (STEP), for action detection in videos. STEP is able to more effectively make use of longer temporal information, [...]
Multiple Drone Vision and Cinematography
Abstract: The aim of drone cinematography is to develop innovative intelligent single- and multiple-drone platforms for media production to cover outdoor events (e.g., sports) that are typically distributed over large expanses, ranging, for example, from a stadium to an entire city. The drone or drone team, to be managed by the production director and his/her [...]
Modeling, Design, and Analysis for Intelligent Vehicles: Intersection Management, Security-Aware Design, and Automotive Design Automation
Abstract: Advanced Driver Assistance Systems (ADAS), autonomous functions, and connected applications bring a revolution to automotive systems and software. In this talk, several research topics in the domain of automotive systems and software will be introduced: (1) graph-based modeling, scheduling, and verification for intersection management, (2) security-aware design and analysis considering timing, game theory, and [...]
Carnegie Mellon University
Open-world Object Detection and Tracking
Abstract: Computer vision today excels at recognition in narrow slices of the real world. Our systems seem to accurately detect cats, cars, or chairs, but largely ignore the vast diversity of objects in the world that are absent from our training datasets. Perception in the open world, however, requires detecting and tracking any object, regardless [...]
Carnegie Mellon University
Personalized and weakly supervised learning for Parkinson’s disease symptom detection
Abstract: Parkinson's Disease (PD) is a neurodegenerative disorder that affects approximately one million Americans. Medications exist to manage the symptoms, but doctors must periodically adjust dosage level and frequency as a patient's disease progresses. These adjustments are typically based on observations made during short clinic visits, which provide an incomplete picture of a patient's daily [...]
VR facial animation via multiview image translation
Abstract: A key promise of Virtual Reality (VR) is the possibility of remote social interaction that is more immersive than any prior telecommunication media. However, existing social VR experiences are mediated by inauthentic digital representations of the user (i.e., stylized avatars). These stylized representations have limited the adoption of social VR applications in precisely those [...]
Neural Volumes: Learning Dynamic Renderable Volumes from Images
Abstract: Modeling and rendering of dynamic scenes is challenging, as natural scenes often contain complex phenomena such as thin structures, evolving topology, translucency, scattering, occlusion, and biological motion. Mesh-based reconstruction and tracking often fail in these cases, and other approaches (e.g., light field video) typically rely on constrained viewing conditions, which limit interactivity. We [...]