Seminar
Next-Generation Robot Perception: Hierarchical Representations, Certifiable Algorithms, and Self-Supervised Learning
Spatial perception, the robot’s ability to sense and understand the surrounding environment, is a key enabler for robot navigation, manipulation, and human-robot interaction. Recent advances in perception algorithms and systems have enabled robots to create large-scale geometric maps of unknown environments and detect objects of interest. Despite these advances, a large gap still separates robot [...]
Autonomous mobility in Mars exploration: recent achievements and future prospects
Abstract: This talk will summarize key recent advances in autonomous surface and aerial mobility for Mars exploration, then discuss potential future missions and technology needs for Mars and other planetary bodies. Among recent advances, the Perseverance rover that is now operating on Mars includes new autonomous navigation capability that dramatically increases its traverse speed over [...]
Structures and Environments for Generalist Agents
Abstract: We are entering an era of highly general AI, enabled by supervised models of the Internet. However, it remains an open question how intelligence emerged in the first place, before there was an Internet to imitate. Understanding the emergence of skillful behavior, without expert data to imitate, has been a longstanding goal of reinforcement [...]
From Videos to 4D Worlds and Beyond
Abstract: The world underlying images and videos is 3-dimensional and dynamic, i.e. 4D, with people interacting with each other, objects, and the underlying scene. Even in videos of a static scene, there is always the camera moving about in the 4D world. Accurately recovering this information is essential for building systems that can reason [...]
Mars Robots and Robotics at NASA JPL
Abstract: In this seminar I’ll discuss Mars robots, the unprecedented results we’re seeing with the latest Mars mission, and how we got here. Perseverance’s manipulation and sampling systems have collected samples from unique locations at twice the rate of any prior mission. 88% of all driving has been autonomous. This has enabled the mission to [...]
Generative and Animatable Radiance Fields
Abstract: Generating and transforming content requires both creativity and skill. Creativity defines what is being created and why, while skill answers the question of how. While creativity is believed to be abundant, skill can often be a barrier to creativity. In our team, we aim to substantially reduce this barrier. Recent Generative AI methods have simplified the problem for 2D [...]
Generative modeling: from 3D scenes to fields and manifold
Abstract: In this keynote talk, we delve into some of our progress on generative models that are able to capture the distribution of intricate and realistic 3D scenes and fields. We explore a formulation of generative modeling that optimizes latent representations for disentangling radiance fields and camera poses, enabling both unconditional and conditional generation of 3D [...]
Estimating Robustness using Proxies
Abstract: This talk covers some of our recent explorations on estimating the robustness of black-box machine learning models across data subpopulations. In other words, we ask whether a trained model is uniformly accurate across different types of inputs, or whether there are significant performance disparities affecting the different subpopulations. Measuring such a characteristic is fairly straightforward if [...]
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Abstract: In this talk, I will focus on presenting my recent work which will be presented at CVPR in less than two months. Text-guided image generation has progressed rapidly in recent years, inspiring major breakthroughs in text-guided shape generation. Recently, it has been shown that using score distillation, one can successfully text-guide a NeRF model to [...]
Navigating to Objects in the Real World
Abstract: Semantic navigation is necessary to deploy mobile robots in uncontrolled environments like our homes, schools, and hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation, which builds a geometric map using depth sensors and plans to reach point goals. Broadly, end-to-end [...]
Going Beyond Continual Learning: Towards Organic Lifelong Learning
Abstract: Supervised learning, the harbinger of machine learning over the last decade, has had tremendous impact across application domains in recent years. However, the notion of a static trained machine learning model is becoming increasingly limiting, as these models are deployed in changing and evolving environments. Among a few related settings, continual learning has gained significant [...]
Predictive Scene Representations for Embodied Visual Search
Abstract: My research advances embodied AI by developing large-scale datasets and state-of-the-art algorithms. In my talk, I will specifically focus on the embodied visual search problem, which aims to enable intelligent search for robots and augmented reality (AR) assistants. Embodied visual search manifests as the visual navigation problem in robotics, where a mobile agent must efficiently navigate [...]
Special RI Seminar
Title: Testing, Analysis, and Specification for Robust and Reliable Robot Software
Abstract: Building robust and reliable robotic software is an inherently challenging feat that requires substantial expertise across a variety of disciplines. Despite that, writing robot software has never been easier thanks to software frameworks such as ROS: At its best, ROS allows newcomers to assemble simple, [...]
Generating Beautiful Pixels
Abstract: In this talk, I will present three experiments that use low-level image statistics to generate high-resolution detailed outputs. In the first experiment, I will use 2D pixels to efficiently mine hard examples for better learning. Simply biasing ray sampling towards hard ray examples enables learning of neural fields with more accurate high-frequency detail in less [...]
Towards Reliable Computer Vision Systems
Abstract: The real world has infinite visual variation – across viewpoints, time, space, and curation. As deep visual models become ubiquitous in high-stakes applications, their ability to generalize across such variation becomes increasingly important. In this talk, I will present opportunities to improve such generalization at different stages of the ML lifecycle: first, I will [...]
Transforming Hollywood Visual Effects with Graphics and Vision
Abstract: Paul will describe his path to developing visual effects technology used in hundreds of movies, including The Matrix, Spider-Man 2, Benjamin Button, Avatar, Maleficent, Furious 7, and Blade Runner: 2049. These techniques include image-based modeling and rendering, high dynamic range imaging, image-based lighting, and high-resolution facial scanning for photoreal digital actors. Paul will also [...]
Vision without labels
Abstract: Deep learning has revolutionized all aspects of computer vision, but its successes have come from supervised learning at scale: large models trained on ever larger labeled datasets. However this reliance on labels makes these systems fragile when it comes to new scenarios or new tasks where labels are unavailable. This is in stark contrast to [...]
Learning Meets Gravity: Robots that Learn to Embrace Dynamics from Data
Abstract: Despite the incredible capabilities (speed and repeatability) of our hardware today, many robot manipulators are deliberately programmed to avoid dynamics – moving slowly enough to adhere to quasi-static assumptions of the world. In contrast, people frequently (and subconsciously) make use of dynamic phenomena to manipulate everyday objects – from unfurling blankets, to [...]
Large Multimodal (Vision-Language) Models for Image Generation and Understanding
Abstract: Large Language Models and Large Vision Models, also known as Foundation Models, have led to unprecedented advances in language understanding, visual understanding, and AI. In particular, many computer vision problems including image classification, object detection, and image generation have benefited from the capabilities of such models trained on internet-scale text and visual data. In [...]
Learning and Control for Safety, Efficiency, and Resiliency of Embodied AI
Abstract: The rapid evolution of ubiquitous sensing, communication, and computation technologies has revolutionized cyber-physical systems (CPS) across various domains like robotics, smart grids, aerospace, and smart cities. Integrating learning into dynamic systems control presents significant Embodied AI opportunities. However, current decision-making frameworks lack comprehensive understanding of the tridirectional relationship among communication, learning and control, [...]
Imaginative Vision Language Models: Towards human-level imaginative AI skills transforming species discovery, content creation, self-driving cars, and emotional health
Abstract: Most existing AI learning methods can be categorized into supervised, semi-supervised, and unsupervised methods. These approaches rely on defining empirical risks or losses on the provided labeled and/or unlabeled data. Beyond extracting learning signals from labeled/unlabeled training data, we will reflect in this talk on a class of methods that can learn beyond the vocabulary [...]
World Knowledge in the Time of Large Models
Abstract: This talk will discuss the massive shift that has come about in the vision and ML community as a result of large pre-trained language and language-and-vision models such as Flamingo, GPT-4, and others. We begin by looking at the work on knowledge-based systems in CV and robotics before the large model [...]
Data-Efficient Learning for Robotics and Reinforcement Learning
Abstract: Data efficiency, i.e., learning from small datasets, is of practical importance in many real-world applications and decision-making systems. Data efficiency can be achieved in multiple ways, such as probabilistic modeling, where models and predictions are equipped with meaningful uncertainty estimates, transfer learning, or the incorporation of valuable prior knowledge. In this talk, I will [...]
Digital Human Modeling with Light
Abstract: Leveraging light in various ways, we can observe and model physical phenomena or states which may not be possible to observe otherwise. In this talk, I will introduce our recent exploration on digital human modeling with different types of light. First, I will present our recent work on the modeling of relightable human heads, [...]
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
Abstract: We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians which are optimized to reconstruct input images via differentiable rendering. To model [...]
Robots at the Johnson Space Center and Future Plans
Abstract: The seminar will review a series of robotic systems built at the Johnson Space Center over the last 20 years. These will include wearable robots (exoskeletons, powered gloves and jetpacks), manipulation systems (ISS cranes down to human scale) and lunar mobility systems (human surface mobility and robotic rovers). As all robotics presentations should, this [...]
Biometrics in a Deep Learning World
Abstract: Biometrics is the science of recognizing individuals based on their physical and behavioral attributes such as fingerprints, face, iris, voice and gait. The past decade has witnessed tremendous progress in this field, including the deployment of biometric solutions in diverse applications such as border security, national ID cards, amusement parks, access control, and smartphones. [...]
Neural World Models
Abstract: Computer vision researchers have pushed the limits of performance in perception tasks involving natural images to near saturation. With self-supervised inference driven by recent advancements in generative modeling, it can be debated that the era of large image models is coming to a close, ushering in an era focused on video. However, it's worth [...]
Becoming Teammates: Designing Assistive, Collaborative Machines
Abstract: The growing power in computing and AI promises a near-term future of human-machine teamwork. In this talk, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and [...]
Reconstructing 3D Humans from Visual Data
Abstract: Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose [...]
Towards Energy-Efficient Techniques and Applications for Universal AI Implementation
Abstract: The rapid advancement of large-scale language and vision models has significantly propelled the AI domain. We now see AI enriching everyday life in numerous ways – from community and shared virtual reality experiences to autonomous vehicles, healthcare innovations, and accessibility technologies, among others. Central to these developments is the real-time implementation of high-quality deep [...]
Structure-from-Motion Meets Self-supervised Learning
Abstract: How can we teach machines to perceive the 3D world from unlabeled videos? We will present a new solution that incorporates Structure-from-Motion (SfM) into self-supervised model learning. Given RGB inputs, deep models learn to regress depth and correspondence. With the two inputs, we introduce a camera localization algorithm that searches for certified globally optimal poses. However, the [...]
Toward Human-Centered XR: Bridging Cognition and Computation
Abstract: Virtual and Augmented Reality enable unprecedented possibilities for displaying virtual content, sensing physical surroundings, and tracking human behaviors with high fidelity. However, we still haven't created "superhumans" who can outperform what we can do in physical reality, nor a "perfect" XR system that delivers infinite battery life or realistic sensation. In this talk, I will discuss some of our [...]
Carnegie Mellon Graphics Colloquium: C. Karen Liu : Building Large Models for Human Motion
Large generative models for human motion, analogous to ChatGPT for text, will enable human motion synthesis and prediction for a wide range of applications such as character animation, humanoid robots, AR/VR motion tracking, and healthcare. This model would generate diverse, realistic human motions and behaviors, including kinematics and dynamics, [...]
Teaching a Robot to Perform Surgery: From 3D Image Understanding to Deformable Manipulation
Abstract: Robot manipulation of rigid household objects and environments has made massive strides in the past few years due to achievements in the computer vision and reinforcement learning communities. One area that has taken off at a slower pace is the manipulation of deformable objects. For example, surgical robots are used today via teleoperation from a [...]
Zeros for Data Science
Abstract: The world around us is neither totally regular nor completely random. Our and robots’ reliance on spatiotemporal patterns in daily life cannot be over-stressed, given the fact that most of us can function (perceive, recognize, navigate) effectively in chaotic and previously unseen physical, social and digital worlds. Data science has been promoted and practiced [...]
Emotion perception: progress, challenges, and use cases
Abstract: One of the challenges Human-Centric AI systems face is understanding human behavior and emotions considering the context in which they take place. For example, current computer vision approaches for recognizing human emotions usually focus on facial movements and often ignore the context in which the facial movements take place. In this presentation, I will [...]