Humans, hands, and horses: 3D reconstruction of articulated object categories using strong, weak, and self-supervision
Abstract: Reconstructing 3D objects from a single 2D image is a task that humans perform effortlessly, yet computer vision so far has only robustly solved 3D face reconstruction. In this talk we will see how we can extend the scope of monocular 3D reconstruction to more challenging, articulated categories such as human bodies, hands and [...]
Enabling Grounded Language Communication for Human-Robot Teaming
Abstract: The ability of robots to effectively understand natural language instructions and convey information about their observations and interactions with the physical world is highly dependent on the sophistication and fidelity of the robot’s representations of language, environment, and actions. As we progress towards more intelligent systems that perform a wider range of tasks in a [...]
Looking behind the Seen in Order to Anticipate
Abstract: Despite significant recent progress in computer vision and machine learning, personalized autonomous agents often still don’t participate robustly and safely across tasks in our environment. We think this is largely because they lack an ability to anticipate, which in turn is due to a missing understanding about what is happening behind the seen, i.e., [...]
Robots that Learn through Language
Abstract: Advances in perception have been integral to transitioning robots from machines restricted to factory automation to autonomous agents that operate robustly in unstructured environments. As our surrogates, robots enable people to explore the deepest depths of the ocean and distant regions of space, making discoveries that would otherwise be impossible. The age of robots [...]
Towards Reconstructing Any Object in 3D
Abstract: The world we live in is incredibly diverse, comprising over 10k natural and man-made object categories. While the computer vision community has made impressive progress in classifying images from such diverse categories, the state-of-the-art 3D prediction systems are still limited to merely tens of object classes. A key reason for this stark difference [...]
Carnegie Mellon University
Beyond rigid objects: Data-driven Methods for Manipulation of Deformable Objects
Abstract: Manipulation of deformable objects challenges common assumptions made for rigid objects. Deformable objects have high intrinsic state representation and complex dynamics with high degrees of freedom, making state estimation and planning difficult. The completed work can be divided into two parts. In the first part, we explore reinforcement learning (RL) as a [...]
Carnegie Mellon University
Simulation, Perception, and Generation of Human Behavior
Abstract: Understanding and modeling human behavior is fundamental to almost any computer vision or robotics application that involves humans. In this thesis, we take a holistic approach to human behavior modeling and tackle its three essential aspects: simulation, perception, and generation. Throughout this thesis, we show how the three aspects are deeply connected and [...]
The Clinician’s AI Partner: Augmenting Clinician Capabilities Across the Spectrum of Healthcare
Abstract: Clinicians often work under highly demanding conditions to deliver complex care to patients. As our aging population grows and care becomes increasingly complex, physicians and nurses are now also experiencing feelings of burnout at unprecedented levels. In this talk, I will discuss possibilities for computer vision to function as a partner to clinicians, and to augment their capabilities, across [...]
The Unusual Effectiveness of Abstractions for Assistive AI
Abstract: Can we balance efficiency and reliability while designing assistive AI systems? What would such AI systems need to provide? In this talk I will present some of our recent work addressing these questions. In particular, I will show that a few fundamental principles of abstraction are surprisingly effective in designing efficient and reliable AI [...]
Reliable and Accessible Visual Recognition
Abstract: As visual recognition models are developed across diverse applications, we need the ability to reliably deploy our systems in a variety of environments. At the same time, visual models tend to be trained and evaluated on a static set of curated and annotated data which only represents a subset of the world. In this [...]