Seminar
Building Intelligent and Visceral Machines: From Sensing to Application
Abstract: Humans have evolved to have highly adaptive behaviors that help us survive and thrive. As AI prompts a move from computing interfaces that are explicit and procedural to those that are implicit and intelligent, we are presented with extraordinary opportunities. In this talk, I will argue that understanding affective and behavioral signals presents many opportunities [...]
GANcraft – an unsupervised 3D neural method for world-to-world translation
Abstract: Advances in 2D image-to-image translation methods, such as SPADE/GauGAN, have enabled users to paint photorealistic images by drawing simple sketches similar to those created in Microsoft Paint. Despite these innovations, creating a realistic 3D scene remains a painstaking task, out of the reach of most people. It requires years of expertise, professional software, a library [...]
Learning Optical Flow: Model, Data, and Applications
Abstract: Optical flow provides important information about the dynamic world and is of fundamental importance to many tasks. In this talk, I will present my work on different aspects of learning optical flow. I will start with the background and talk about PWC-Net, a compact and effective model built using classical principles for optical flow. Next, [...]
Distributed Dissipativity: Applying Foundational Stability Theory to Modern Networked Control
Abstract: Despite its diverse areas of application, the desire to optimize performance and guarantee acceptable behaviour in the face of inevitable uncertainty is pervasive throughout control theory. This creates a fundamental challenge since the necessity of robustly stable control schemes often favors conservative designs, while the desire to optimize performance typically demands the opposite. While [...]
Haptic Perspective-taking from Vision and Force
Abstract: Physically collaborative robots present an opportunity to positively impact society across many domains. However, robots currently lack the ability to infer how their actions physically affect people. This is especially true for robotic caregiving tasks that involve manipulating deformable cloth around the human body, such as dressing and bathing assistance. In this talk, I [...]
Do Vision-Language Pretrained Models Learn Spatiotemporal Primitive Concepts?
Abstract: Vision-language models pretrained on web-scale data have revolutionized deep learning in the last few years. They have demonstrated strong transfer learning performance on a wide range of tasks, even under the "zero-shot" setup, where text "prompts" serve as a natural interface for humans to specify a task, as opposed to collecting labeled data. These models are [...]
Perception-Action Synergy in Uncertain Environments
Abstract: Many robotic applications require a robot to operate in an environment with unknowns or uncertainty, at least initially, before it gathers enough information about the environment. In such a case, a robot must rely on sensing and perception to feel its way around. Moreover, it has to couple sensing/perception and motion synergistically in real [...]
Max-Affine Spline Insights into Deep Learning
Abstract: We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs) that provide a powerful portal through which we view and analyze their inner workings. For instance, [...]
Teruko Yata Memorial Lecture
Leveraging Language and Video Demonstrations for Learning Robot Manipulation Skills and Enabling Closed-Loop Task Planning
Abstract: Humans have gradually developed language, mastered complex motor skills, and created and utilized sophisticated tools. The act of conceptualization is fundamental to these abilities because it allows humans to mentally represent, summarize and abstract diverse knowledge and skills. By means of [...]
Designing Robotic Systems with Collective Embodied Intelligence
Abstract: Natural swarms exhibit sophisticated colony-level behaviors with remarkable scalability and error tolerance. Their evolutionary success stems from more than just intelligent individuals; it hinges on their morphology, their physical interactions, and the way they shape and leverage their environment. Mound-building termites, for instance, are believed to use their own body as a template for [...]
Understanding 3D Scenes and Interacting Hands
Abstract: The long-term goal of my research is to help computers understand the physical world from images, including both 3D properties and how humans or robots could interact with things. This talk will summarize two recent directions aimed at enabling this goal. I will begin with learning to reconstruct full 3D scenes, including [...]
Snakes & Spiders, Robots & Geometry
Abstract: Locomotion and perception are a common thread between robotics and biology. Understanding these phenomena at a mechanical level involves nonlinear dynamics and the coordination of many degrees of freedom. In this talk, I will discuss geometric approaches to organizing this information in two problem domains: Undulatory locomotion of snakes and swimmers, and vibration propagation [...]
Multimodal Modeling: Learning Beyond Visual Knowledge
Abstract: The computer vision community has embraced the success of learning specialist models by training with a fixed set of predetermined object categories, such as ImageNet or COCO. However, learning only from visual knowledge might hinder the flexibility and generality of visual models, which requires additional labeled data to specify any other visual concept and [...]
Robotic Cave Exploration for Search, Science, and Survey
Abstract: Robotic cave exploration has the potential to create significant societal impact through facilitating search and rescue, in the fight against antibiotic resistance (science), and via mapping (survey). But many state-of-the-art approaches for active perception and autonomy in subterranean environments rely on disparate perceptual pipelines (e.g., pose estimation, occupancy modeling, hazard detection) that process the same underlying sensor data in different [...]
Audio-Visual Learning for Social Telepresence
Abstract: Relationships between people are strongly influenced by distance. Even with today’s technology, remote communication is limited to a two-dimensional audio-visual experience and lacks the availability of a shared, three-dimensional space in which people can interact with each other over the distance. Our mission at Reality Labs Research (RLR) in Pittsburgh is to develop such [...]
Representations in Robot Manipulation: Learning to Manipulate Ropes, Fabrics, Bags, and Liquids
Abstract: The robotics community has seen significant progress in applying machine learning for robot manipulation. However, much manipulation research focuses on rigid objects instead of highly deformable objects such as ropes, fabrics, bags, and liquids, which pose challenges due to their complex configuration spaces, dynamics, and self-occlusions. To achieve greater progress in robot manipulation of [...]
Safe and Stable Learning for Agile Robots without Reinforcement Learning
Abstract: My research group (https://aerospacerobotics.caltech.edu/) is working to systematically leverage AI and Machine Learning techniques towards achieving safe and stable autonomy of safety-critical robotic systems, such as robot swarms and autonomous flying cars. Another example is LEONARDO, the world's first bipedal robot that can walk, fly, slackline, and skateboard. Stability and safety are often research problems [...]
Towards editable indoor lighting estimation
Abstract: Combining virtual and real visual elements into a single, realistic image requires the accurate estimation of the lighting conditions of the real scene. In recent years, several approaches of increasing complexity, ranging from simple encoder-decoder architectures to more sophisticated volumetric neural rendering, have been proposed. While the quality of automatic estimates has increased, they have the unfortunate downside [...]
Computational imaging with multiply scattered photons
Abstract: Computational imaging has advanced to a point where the next significant milestone is to image in the presence of multiply-scattered light. Though traditionally treated as noise, multiply-scattered light carries information that can enable previously impossible imaging capabilities, such as imaging around corners and deep inside tissue. The combinatorial complexity of multiply-scattered light transport makes [...]
Towards $1 robots
Abstract: Robots are pretty great -- they can make some hard tasks easy, some dangerous tasks safe, or some unthinkable tasks possible. And they're just plain fun to boot. But how many robots have you interacted with recently? And where do you think that puts you compared to the rest of the world's people? In [...]
Mental models for 3D modeling and generation
Abstract: Humans have extraordinary capabilities of comprehending and reasoning about our 3D visual world. One particular reason is that when looking at an object or a scene, not only can we see the visible surface, but we can also hallucinate the invisible parts - the amodal structure, appearance, affordance, etc. We have accumulated thousands of [...]
What (else) can you do with a robotics degree?
Abstract: In 2004, half-way through my robotics Ph.D., I had a panic-inducing thought: What if I don’t want to build robots for the rest of my life? What can I do with this degree?! Nearly twenty years later, I have some answers: tackle climate change in Latin America, educate Congress about autonomous vehicles, improve how [...]
Complete Codec Telepresence
Abstract: Imagine two people, each of them within their own home, being able to communicate and interact virtually with each other as if they are both present in the same shared physical space. Enabling such an experience, i.e., building a telepresence system that is indistinguishable from reality, is one of the goals of Reality Labs [...]