Robot Learning, Wearable Sensing, and Teleoperation in Pursuit of Robotic Caregivers
Abstract: Designing safe and reliable robotic assistance for caregiving is a grand challenge in robotics. A sixth of the United States population is over the age of 65, and in 2014 more than a quarter of the population had a disability. Robotic caregivers could positively benefit society; yet, physical robotic assistance presents several challenges and [...]
Personalized Context-aware Affective Nonverbal Robot Feedback
Abstract: We first consider the problem of estimating context, specifically key features of the human state. We predict engagement-related events in an educational activity before the end of that activity, which could allow the robot to provide feedback early enough to improve the human's experience. We then explore generating nonverbal affective robot behavior by correlating [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Redefining the Perception-Action Interface: Visual Action Representations for Contact-Centric Manipulation
Abstract: In robotics, understanding the link between perception and action is pivotal. Typically, perception systems process sensory data into state representations like segmentations and bounding boxes, which a planner uses to plan actions. However, this state estimation approach can fail in environments with partial observability, and in cases with challenging object properties like transparency and deformability. [...]
RI Picnic
The RI Picnic will be held at the Vietnam Veteran's Pavilion @ Schenley Park on Overlook Drive, Tuesday, August 29, 1-7pm. SOCIALIZE, EAT, DRINK & BE MERRY! Receive this year's RI giveaway item; witness the exciting final rounds of the annual RI croquet tournament; enjoy lawn games right at our own pavilion area. Plan to spend some time at the [...]
Continual Robot Learning: Benchmarks and Modular Methods
Abstract: Humans adapt continuously to the world around us, allowing us to acquire new skills and explore diverse environments seamlessly. Current AI methods, however, cannot attain this versatility. Instead, they are typically trained with vast datasets, and learn all tasks simultaneously. However, the trained models have limited ability to adapt to changing contexts, and are [...]
Architecture and Algorithms for Space-Based Global Wildlife Tracking
Abstract: Accurate satellite-based positioning revolutionized several industries over the past two decades, from agriculture to transportation. However, conventional GNSS receivers consume significant amounts of energy and are too large for many applications, including wildlife tracking, which is critical for conservation efforts and improving our understanding of the global climate. To address this capability gap, we [...]
Multi-Human 3D Reconstruction from Monocular Videos
Abstract: We study the problem of multi-human 3D reconstruction from videos captured in the wild. Human movements are dynamic, and accurately reconstructing them in various settings is crucial for developing immersive social telepresence, assistive humanoid robots, and augmented reality systems. However, creating such a system requires addressing fundamental issues with previous works regarding the data [...]
Language-Conditioned Object Detection and Manipulation
Abstract: Traditional object detection methods are often confined to predefined object vocabularies, limiting their versatility in real-world scenarios where robots need to understand and execute diverse household tasks. Additionally, the 2D and 3D perception communities have typically pursued separate approaches tailored to their respective domains. In this thesis, we present a language-conditioned object detector with [...]
How I Learned to Love Blobs: The Power of Gaussian Representations in Differentiable Rendering and Optimization
Abstract: In this thesis, we explore the use of Gaussian Representations in multiple application areas of computer vision and robotics. In particular, we design a ray-based differentiable renderer for 3D Gaussians that can be used to solve multiple classic computer vision problems in a unified manner. For example, we can reconstruct 3D shapes from color, [...]
Watch, Practice, Improve: Towards In-the-wild Manipulation
Abstract: The longstanding dream of many roboticists is to see robots perform diverse tasks in diverse environments. To build such a robot that can operate anywhere, many methods train on robotic interaction data. While these approaches have led to significant advances, they rely on heavily engineered setups or high amounts of supervision, neither of which [...]
Generating Beautiful Pixels
Abstract: In this talk, I will present three experiments that use low-level image statistics to generate high-resolution detailed outputs. In the first experiment, I will use 2D pixels to efficiently mine hard examples for better learning. Simply biasing ray sampling towards hard ray examples enables learning of neural fields with more accurate high-frequency detail in less [...]
Towards Reliable Computer Vision Systems
Abstract: The real world has infinite visual variation – across viewpoints, time, space, and curation. As deep visual models become ubiquitous in high-stakes applications, their ability to generalize across such variation becomes increasingly important. In this talk, I will present opportunities to improve such generalization at different stages of the ML lifecycle: first, I will [...]
Towards Photorealistic Dynamic Capture and Animation of Human Hair and Head
Abstract: Realistic human avatars play a key role in immersive virtual telepresence. To reach a high level of realism, a human avatar needs to faithfully reflect human appearance. A human avatar should also be drivable and express natural motions. Existing works have made significant progress in building drivable realistic face avatars, but they rarely include [...]
Modeling Dynamic Clothing for Data-Driven Photorealistic Avatars
Abstract: In this thesis, we aim to build photorealistic animatable avatars of humans wearing complex clothing in a data-driven manner. Such avatars will be a critical technology to enable future applications such as immersive telepresence in Virtual Reality (VR) and Augmented Reality (AR). Existing full-body avatars that jointly model geometry and view-dependent texture using Variational [...]
Manipulation Among Movable Objects for Pick-and-Place Tasks in Cluttered 3D Workspaces
Abstract: In cluttered real-world workspaces, simple pick-and-place tasks for robot manipulators can be quite challenging to solve. Often there is no collision-free trajectory that allows the robot to grasp and extract a desired object from the scene. This requires motion planning algorithms to reason about rearranging some of the “movable” clutter in the scene so [...]
Transforming Hollywood Visual Effects with Graphics and Vision
Abstract: Paul will describe his path to developing visual effects technology used in hundreds of movies, including The Matrix, Spider-Man 2, Benjamin Button, Avatar, Maleficent, Furious 7, and Blade Runner: 2049. These techniques include image-based modeling and rendering, high dynamic range imaging, image-based lighting, and high-resolution facial scanning for photoreal digital actors. Paul will also [...]
Vision without labels
Abstract: Deep learning has revolutionized all aspects of computer vision, but its successes have come from supervised learning at scale: large models trained on ever larger labeled datasets. However, this reliance on labels makes these systems fragile when it comes to new scenarios or new tasks where labels are unavailable. This is in stark contrast to [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Learning Meets Gravity: Robots that Learn to Embrace Dynamics from Data
Abstract: Despite the incredible capabilities (speed and repeatability) of our hardware today, many robot manipulators are deliberately programmed to avoid dynamics – moving slow enough so they can adhere to quasi-static assumptions of the world. In contrast, people frequently (and subconsciously) make use of dynamic phenomena to manipulate everyday objects – from unfurling blankets, to [...]
Large Multimodal (Vision-Language) Models for Image Generation and Understanding
Abstract: Large Language Models and Large Vision Models, also known as Foundation Models, have led to unprecedented advances in language understanding, visual understanding, and AI. In particular, many computer vision problems including image classification, object detection, and image generation have benefited from the capabilities of such models trained on internet-scale text and visual data. In [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Learning and Control for Safety, Efficiency, and Resiliency of Embodied AI
Abstract: The rapid evolution of ubiquitous sensing, communication, and computation technologies has revolutionized cyber-physical systems (CPS) across various domains like robotics, smart grids, aerospace, and smart cities. Integrating learning into dynamic systems control presents significant Embodied AI opportunities. However, current decision-making frameworks lack comprehensive understanding of the tridirectional relationship among communication, learning and control, [...]
Generalizable Dexterity with Reinforcement Learning
Abstract: Dexterity, the ability to perform complex interactions with the physical world, is at the core of robotics. However, existing research in robot manipulation has been focused on tasks that involve limited dexterity, such as pick-and-place. The motor skills of the robots are often quasi-static, have a predefined or limited sequence of contact events, and [...]
Imaginative Vision Language Models: Towards human-level imaginative AI skills transforming species discovery, content creation, self-driving cars, and emotional health
Abstract: Most existing AI learning methods can be categorized into supervised, semi-supervised, and unsupervised methods. These approaches rely on defining empirical risks or losses on the provided labeled and/or unlabeled data. Beyond extracting learning signals from labeled/unlabeled training data, we will reflect in this talk on a class of methods that can learn beyond the vocabulary [...]
World Knowledge in the Time of Large Models
Abstract: This talk will discuss the massive shift that has come about in the vision and ML community as a result of large pre-trained language and vision-language models such as Flamingo and GPT-4. We begin by looking at the work on knowledge-based systems in CV and robotics before the large model [...]
Data-Efficient Learning for Robotics and Reinforcement Learning
Abstract: Data efficiency, i.e., learning from small datasets, is of practical importance in many real-world applications and decision-making systems. Data efficiency can be achieved in multiple ways, such as probabilistic modeling, where models and predictions are equipped with meaningful uncertainty estimates, transfer learning, or the incorporation of valuable prior knowledge. In this talk, I will [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Digital Human Modeling with Light
Abstract: Leveraging light in various ways, we can observe and model physical phenomena or states which may not be possible to observe otherwise. In this talk, I will introduce our recent exploration on digital human modeling with different types of light. First, I will present our recent work on the modeling of relightable human heads, [...]
Preference Based Optimization of Multi-Objective Robot Performance
Abstract: Robotic systems often require that tradeoffs be made--for example, between performance and robustness, power and longevity, or efficiency and safety. While roboticists can design cost functions with hand-picked weights for different metrics, it is not always a straightforward task, particularly when some aspects of performance are not easily quantified. This can occur especially when [...]
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
Abstract: We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians which are optimized to reconstruct input images via differentiable rendering. To model [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Ensuring safety for uncertain high-dimensional robotic systems
Abstract: Two major obstacles for safe control and planning are (1) scaling to high-dimensional systems and (2) handling uncertain systems. This is problematic because such systems are ubiquitous in practice: e.g. drones with unknown drag, manipulators carrying unknown packages. In this proposal, we aim to address both challenges. At the control level, we have synthesized [...]
Trustworthy Learning using Uncertain Interpretation of Data
Abstract: Non-parametric models are popular in real-world applications of machine learning. However, many modern ML methods that ensure that models are pragmatic, safe, robust, fair, and otherwise trustworthy in increasingly critical applications, assume parametric, differentiable models. We show that, by interpreting data as locally uncertain, we can achieve many of these without being limited to [...]
Allocation, Planning, and Control in Off-road Automated Convoy Operations
Abstract: The lack of structure in off-road terrains makes off-road operations of automated platforms difficult. The difficulty arises from uncertainty in the optimality and safety of the actions (e.g., planning and control) taken by the automated platform. When multiple automated platforms are required to act in a coordinated manner (e.g., a convoy) in complex cluttered [...]
Robot Learning for Assistive Dressing
Abstract: Robot-assisted dressing could benefit the lives of many people such as older adults and individuals with disabilities. In this talk, I will present two pieces of work that use robot learning for this assistive task. In the first half of the talk, I will present our work on developing a robot-assisted dressing system that [...]
RI Faculty Meeting: Multi-Robot Field Autonomy: A 5 Year Perspective
LIVE DEMO! Come see, hear and witness progress made in developing a heterogeneous (wheeled, legged, etc.) team of field deployable mobile robots. Details will be shared on the history of development of multi-robot autonomy at CMU throughout the previous DARPA Subterranean Challenge, DARPA RACER program, and current ARL projects. There will be an ongoing live and interactive [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Robots at the Johnson Space Center and Future Plans
Abstract: The seminar will review a series of robotic systems built at the Johnson Space Center over the last 20 years. These will include wearable robots (exoskeletons, powered gloves and jetpacks), manipulation systems (ISS cranes down to human scale) and lunar mobility systems (human surface mobility and robotic rovers). As all robotics presentations should, this [...]
Biometrics in a Deep Learning World
Abstract: Biometrics is the science of recognizing individuals based on their physical and behavioral attributes such as fingerprints, face, iris, voice and gait. The past decade has witnessed tremendous progress in this field, including the deployment of biometric solutions in diverse applications such as border security, national ID cards, amusement parks, access control, and smartphones. [...]
Towards Robotic Tree Manipulation: Leveraging Graph Representations
Abstract: There is growing interest in automating agricultural tasks that require intricate and precise interaction with specialty crops, such as trees and vines. However, developing robotic solutions for crop manipulation remains a difficult challenge due to complexities involved in modeling their deformable behavior. In this study, we present a framework for learning the deformation behavior [...]
Tracking Any“Thing” in Videos
Abstract: Being able to track anything is one of the fundamental steps to parse and understand a video. In this talk, I will present two pieces of work that tackle this problem at different spatial granularities. In the first half of the talk, I will discuss tracking any video pixel or particle through time in [...]
Exploring Diverse Interaction Types for Human in the Loop Robot Learning
Abstract: Teaching sessions between humans and robots will need to be maximally informative for optimal robot learning and to ease the human’s teaching burden. However, the bulk of prior work considers one or two modalities through which a human can convey information to a robot—namely, kinesthetic demonstrations and preference queries. Moreover, people will teach robots [...]
Learning Generalizable Robot Skills for Dynamic and Interactive Tasks
Abstract: Enabling robots to perform complex dynamic tasks such as picking up an object in one sweeping motion or pushing off a wall to quickly turn a corner is a challenging problem. The dynamic interactions implicit in these tasks are critical for successful task execution. Furthermore, given the interactive nature of such tasks, safety, in [...]
Customizing Large-scale Text-to-Image Models
Abstract: Advancements in large-scale generative models represent a watershed moment. These models can generate a wide variety of objects and scenes with different styles and compositions. However, these models are trained on a fixed snapshot of available data and often contain copyrighted or private images. This assumption makes them lacking in two aspects – (a) [...]
Building Robot Hands and Teaching Dexterity
Abstract: Our shared dream is to have humanoid robots with hands complete tasks similar to those that humans do. While there are a few robot hands available today, the popular opinion is that they are difficult to use, expensive, and hard to obtain, which precludes their ubiquitous usage. We argue that this is not an inherent problem [...]
Neural World Models
Abstract: Computer vision researchers have pushed the limits of performance in perception tasks involving natural images to near saturation. With self-supervised inference driven by recent advancements in generative modeling, it can be debated that the era of large image models is coming to a close, ushering in an era focused on video. However, it's worth [...]
How to Design Robotic Hands That Wield Tools
Abstract: Tool manipulation is an essential human skill. It extends our manipulation capability beyond the capability of the biological hand, and is a defining feature of many important jobs centered on physical interaction with the real world. Yet, wielding a tool is drastically different from generally grasping an object. The prime examples are pens and [...]
Becoming Teammates: Designing Assistive, Collaborative Machines
Abstract: The growing power in computing and AI promises a near-term future of human-machine teamwork. In this talk, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and [...]
Robotics Institute Winter Party
Please join us for some fun, food, beverages and conversation! All RI faculty, staff, students and visitors are invited to the Robotics Institute Winter Party! We apologize but due to space limitations in the Atrium we regretfully cannot include family or other non-RI guests.
Learning Local Heuristics in Heuristic Search
Abstract: Motion planning is a fundamental problem in robotics; how can we move robots efficiently and safely? Motion planning can be solved using several paradigms with their own strengths and weaknesses. This talk dives into Heuristic Graph Search and its application to motion planning by converting it to a problem of finding a start-goal path [...]
Low-Cost Multimodal Sensing and Dexterity for Deformable Object Manipulation
Abstract: To integrate robots seamlessly into daily life, they must be able to handle a variety of tasks in diverse environments, like assisting in hospitals or cooking in kitchens. Many of the items in these environments are deformable such as bedding in hospitals or vegetables in kitchens, and a certain level of dexterity is necessary [...]
Joint 2D and 3D Semi-Supervised Object Detection
Abstract: While numerous 3D detection works leverage the complementary relationship between RGB images and point clouds, developments in the broader framework of semi-supervised object recognition remain uninfluenced by multi-modal fusion. Current methods develop independent pipelines for 2D and 3D semi-supervised learning despite the availability of paired image and point cloud frames. Observing that the distinct [...]
New Methods for Satellite Control
Abstract: Since 2003, the number of satellites launched into orbit has grown from 100 per year to over 2000 per year. Over that same timeframe, incredible advances have been made in control systems for terrestrial robotics and autonomy. Despite the increased quantity of satellites in orbit and the advances made in terrestrial control systems, satellite [...]
[MSR Thesis Talk] Development and Testing of a Software Stack for an Autonomous Racing Vehicle
Abstract: Autonomous racing aims to replicate the human racecar driver with software and sensors. As in traditional motorsports, Autonomous Racing Vehicles (ARVs) are pushed to their dynamic limits in multi-agent scenarios at high (>= 100mph) speeds. This Operational Design Domain (ODD) presents unique challenges across the autonomy stack. The Indy Autonomous Challenge (IAC) is an [...]
[MSR Thesis Talk] Kitchen Robot Case Studies: Learning Manipulation Tasks from Human Video Demonstrations
Abstract: The vision of integrating a robot into the kitchen, capable of acting as a chef, remains a sought-after goal in robotics. Current robotic systems, mostly programmed for specific tasks, fall short in versatility and adaptability to a diverse culinary environment. While significant progress has been made in robotic learning, with advancements in behavior cloning, [...]
Towards Agile Robotics: Creating Push-Off Skills for Dynamic Interactions
Abstract: Dynamic interactions play a fundamental role in human capabilities, enabling us to achieve a wide range of tasks such as moving heavy objects, manipulating our surroundings, and changing directions rapidly and safely. In contrast, most conventional robotic systems lack this level of agility and cannot perform dynamic interactions, limiting their potential in practical applications. [...]
Learning Safe Human-Robot Interactions for a Seamlessly Shared Airspace
Abstract: The growing need for fully autonomous aerial operations in shared spaces necessitates the development of reliable agents capable of navigating safely and seamlessly alongside uncertain human agents. In response, we advocate endowing autonomous agents with the ability to predict human actions, comprehend and ground abstract rules in the action space, and embrace the uncertainty [...]
Generative Evolutionary Search with Diffusion Models for Trajectory Optimization
Abstract: Diffusion models excel at modeling complex and multimodal trajectory distributions for decision-making and control. Reward-gradient guided denoising has been recently proposed to generate trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model. Reward-gradient guided denoising requires a differentiable reward function fitted to both [...]
Tartancalib: Iterative Wide-Angle Lens Calibration
Abstract: Mobile vision systems greatly benefit from the large field-of-view enabled by wide-angle lenses. Accurate and robust intrinsic calibration is a critical prerequisite for leveraging this property. Calibrating wide-angle lenses with current state-of-the-art techniques yields poor results due to extreme distortion at the edge. In this work, we present TartanCalib, an accurate and robust method [...]
Sample-Efficient Reinforcement Learning with applications in Nuclear Fusion
Abstract: In many practical applications of reinforcement learning (RL), it is expensive to observe state transitions from the environment. In the problem of plasma control for nuclear fusion, the motivating example of this thesis, determining the next state for a given state-action pair requires querying an expensive transition function which can lead to many hours [...]
[MSR Thesis Talk] Neural Implicit Representations for Medical Ultrasound Volumes and 3D Anatomy-specific Reconstructions
Abstract: Most Robotic Ultrasound Systems (RUSs) equipped with ultrasound-interpreting algorithms rely on building 3D reconstructions of the entire scanned region or specific anatomies. These 3D reconstructions are typically created via methods that compound or stack 2D tomographic ultrasound images using known poses of the ultrasound transducer with the latter requiring 2D or 3D segmentation. While fast, this class [...]
Social Navigation with Pedestrian Groups
Abstract: Autonomous navigation in human crowds (i.e., social navigation) presents several challenges: The robot often needs to rely on its noisy sensors to identify and localize pedestrians in human crowds; the robot needs to plan efficient paths to reach its goals; the robot needs to do so in a safe and socially appropriate manner. Recent [...]
Zero-Shot Video Question Answering with Procedural Programs
Abstract: We propose to answer zero-shot questions about videos by generating short procedural programs that derive a final answer from solving a sequence of visual subtasks. We present Procedural Video Querying (ProViQ), which uses a large language model to generate such programs from an input question and an API of visual modules in the prompt, [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
[MSR Thesis Talk] Enhancing RHex Robot Performance with Innovative Bioplastic Legs Responsive to Humidity
Abstract: Designing and developing robots that can effectively navigate real-world environments poses a significant challenge. To overcome this, many robotic systems draw inspiration from the adaptive behaviors of animals, which have evolved to thrive in diverse surroundings. Amphibious animals, for instance, seamlessly transition between walking and swimming, optimizing their locomotion efficiency based on environmental cues. [...]
Informative Path Planning Toward Autonomous Real-World Applications
Abstract: Gathering information from the physical world plays a crucial role in many applications—whether it be scientific research, environmental monitoring, search and rescue, defense, or disaster response. The utilization of robots for information gathering allows for the leveraging of intelligent algorithms to efficiently collect data, providing critical insights and facilitating informed decision-making. These autonomous robots [...]
Alignment for Vision-Language Foundation Model
Abstract: Recent advancements in vision-language foundation models, exemplified by GPT4-Vision and DALL-E 3, have significantly transformed both research and practical applications, ranging from professional assistance to content creation. However, aligning them precisely with specific user goals presents a notable challenge. This thesis introduces innovative strategies for improving this alignment. I will first introduce our novel [...]
Efficient Sensor Coverage in Complex Environments
Abstract: This thesis develops sensor coverage algorithms for mobile robots that are scalable to large and complex environments. The core challenge is computing the shortest paths that can direct one or more robots to sweep onboard sensors over all accessible surfaces within an environment. This problem resembles the watchman route problem that is known to [...]
Reconstructing 3D Humans from Visual Data
Abstract: Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose [...]
Improving Kalman Filter-based Multi-Object Tracking in Occlusion and Non-linear Motion
Abstract: Modern methods solve multi-object tracking from two perspectives: motion modeling and appearance matching. As a classic paradigm, motion-based tracking by Kalman filters suffers from complicated motion patterns and the problem becomes more difficult when we only have noisy bounding boxes. To improve Kalman filter-based multi-object tracking in scenarios with complex motion, occlusion, and crossover, [...]
Design Iteration of Dexterous Compliant Robotic Manipulators
Abstract: The goal of personal robotics is to have robots in homes performing everyday tasks efficiently to improve our quality of life. Towards this end, manipulators are needed which are low cost, safe around humans, and approach human-level dexterity. However, existing off-the-shelf manipulators are expensive both in cost and manufacturing time, difficult to repair, and [...]
Continual Learning of Compositional Skills for Robust Robot Manipulation
Abstract: Real-world robots need to continuously learn new manipulation tasks in a lifelong learning manner. These new tasks often share many sub-structures (e.g., sub-tasks, controllers, preconditions) with previously learned tasks. To utilize these shared sub-structures, we explore a compositional and object-centric approach to learn manipulation tasks. The first part of this thesis focuses on [...]
Towards Energy-Efficient Techniques and Applications for Universal AI Implementation
Abstract: The rapid advancement of large-scale language and vision models has significantly propelled the AI domain. We now see AI enriching everyday life in numerous ways – from community and shared virtual reality experiences to autonomous vehicles, healthcare innovations, and accessibility technologies, among others. Central to these developments is the real-time implementation of high-quality deep [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Watch, Practice, Improve: Towards In-the-wild Manipulation
Abstract: The longstanding dream of many roboticists is to see robots perform diverse tasks in diverse environments. To build such a robot that can operate anywhere, many methods train on robotic interaction data. While these approaches have led to significant advances, they rely on heavily engineered setups or high amounts of supervision, neither of which [...]
Structure-from-Motion Meets Self-supervised Learning
Abstract: How can we teach machines to perceive the 3D world from unlabeled videos? We will present a new solution that incorporates Structure-from-Motion (SfM) into self-supervised model learning. Given RGB inputs, deep models learn to regress depth and correspondence. With the two inputs, we introduce a camera localization algorithm that searches for certified globally optimal poses. However, the [...]
Combining Physics-Based Light Transport and Neural Fields for Robust Inverse Rendering
Abstract: Inverse rendering — the process of recovering shape, material, and/or lighting of an object or environment from a set of images — is essential for applications in robotics and elsewhere, from AR/VR to perception on self-driving vehicles. While it is possible to perform inverse rendering from color images alone, it is often far easier [...]
Improving the Transparency of Agent Decision Making to Humans Using Demonstrations
Abstract: For intelligent agents (e.g. robots) to be seamlessly integrated into human society, humans must be able to understand their decision making. For example, the decision making of autonomous cars must be clear to the engineers certifying their safety, passengers riding them, and nearby drivers negotiating the road simultaneously. As an agent's decision making depends [...]
Robotic Climbing for Extreme Terrain Exploration
Abstract: Climbing robots can operate in steep and unstructured environments that are inaccessible to other ground robots, with applications ranging from the inspection of artificial structures on Earth to the exploration of natural terrain features throughout the solar system. Climbing robots for planetary exploration face many challenges to deployment, including mass restrictions, irregular surface features, [...]
Layout Design for Large-Scale Multi-Robot Coordination
Abstract: Today, thousands of robots are navigating autonomously in warehouses, transporting goods from one location to another. While numerous planning algorithms are developed to coordinate robots more efficiently and robustly, warehouse layouts remain largely unchanged – they still adhere to the traditional pattern designed for human workers rather than robots. In this talk, I will [...]
Perception amidst interaction: spatial AI with vision and touch for robot manipulation
Abstract: Robots currently lack the cognition to replicate even a fraction of the tasks humans do, a trend summarized by Moravec's Paradox. Humans effortlessly combine their senses for everyday interactions—we can rummage through our pockets in search of our keys, and deftly insert them to unlock our front door. Before robots can demonstrate such dexterity, [...]
Toward Human-Centered XR: Bridging Cognition and Computation
Abstract: Virtual and Augmented Reality enables unprecedented possibilities for displaying virtual content, sensing physical surroundings, and tracking human behaviors with high fidelity. However, we still haven't created "superhumans" who can outperform what we are in physical reality, nor a "perfect" XR system that delivers infinite battery life or realistic sensation. In this talk, I will discuss some of our [...]
Carnegie Mellon Graphics Colloquium: C. Karen Liu : Building Large Models for Human Motion
Large generative models for human motion, analogous to ChatGPT for text, will enable human motion synthesis and prediction for a wide range of applications such as character animation, humanoid robots, AR/VR motion tracking, and healthcare. This model would generate diverse, realistic human motions and behaviors, including kinematics and dynamics, [...]
Teaching a Robot to Perform Surgery: From 3D Image Understanding to Deformable Manipulation
Abstract: Robot manipulation of rigid household objects and environments has made massive strides in the past few years due to achievements in the computer vision and reinforcement learning communities. One area that has progressed more slowly is the manipulation of deformable objects. For example, surgical robots are used today via teleoperation from a [...]
Zeros for Data Science
Abstract: The world around us is neither totally regular nor completely random. Our and robots’ reliance on spatiotemporal patterns in daily life cannot be over-stressed, given the fact that most of us can function (perceive, recognize, navigate) effectively in chaotic and previously unseen physical, social and digital worlds. Data science has been promoted and practiced [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Emotion perception: progress, challenges, and use cases
Abstract: One of the challenges Human-Centric AI systems face is understanding human behavior and emotions considering the context in which they take place. For example, current computer vision approaches for recognizing human emotions usually focus on facial movements and often ignore the context in which the facial movements take place. In this presentation, I will [...]
[MSR Thesis Talk] SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM
Abstract: Dense simultaneous localization and mapping (SLAM) is crucial for numerous robotic and augmented reality applications. However, current methods are often hampered by the non-volumetric or implicit way they represent a scene. This talk introduces SplaTAM, an approach that leverages explicit volumetric representations, i.e., 3D Gaussians, to enable high-fidelity reconstruction from a single unposed RGB-D [...]
Language: You’ve probably heard of it, read it, written it, gestured it, mimed it… Why can’t robots?
Abstract: Language is how meaning is conveyed between humans, and now the basis of foundation models. By implication, it's the most important modality for all of AGI and will replace the entire robotics control stack as the most important thing for all of us to work on.
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Foundation Models for Robotic Manipulation: Opportunities and Challenges
Abstract: Foundation models, such as GPT-4 Vision, have marked significant achievements in the fields of natural language and vision, demonstrating exceptional abilities to adapt to new tasks and scenarios. However, physical interaction—such as cooking, cleaning, or caregiving—remains a frontier where foundation models and robotic systems have yet to achieve the desired level of adaptability and [...]
Learning with Less
Abstract: The performance of an AI is nearly always associated with the amount of data you have at your disposal. Self-supervised machine learning can help – mitigating tedious human supervision – but the need for massive training datasets in modern AI seems unquenchable. Sometimes it is not the amount of data, but the mismatch of [...]
Human Perception of Robot Failure and Explanation During a Pick-and-Place Task
Abstract: In recent years, researchers have extensively used non-verbal gestures, such as head and arm movements, to express the robot's intentions and capabilities to humans. Inspired by past research, we investigated how different explanation modalities can aid human understanding and perception of how robots communicate failures and provide explanations during block pick-and-place tasks. Through an in-person [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Why We Should Build Robot Apprentices And Why We Shouldn’t Do It Alone
Abstract: For robots to be able to truly integrate into human-populated, dynamic, and unpredictable environments, they will need strong adaptive capabilities. In this talk, I argue that these adaptive capabilities should leverage interaction with end users, who know how (they want) a robot to act in that environment. I will present an overview of [...]
Learning Distributional Models for Relative Placement
Abstract: Relative placement tasks are an important category of tasks in which one object needs to be placed in a desired pose relative to another object. Previous work has shown success in learning relative placement tasks from just a small number of demonstrations, when using relational reasoning networks with geometric inductive biases. However, such methods fail [...]
Robust Body Exposure (RoBE): A Graph-based Dynamics Modeling Approach to Manipulating Blankets over People
Abstract: Robotic caregivers could potentially improve the quality of life of many who require physical assistance. However, in order to assist individuals who are lying in bed, robots must be capable of dealing with a significant obstacle: the blanket or sheet that will almost always cover the person's body. We propose a method for targeted [...]
Exploration for Continually Improving Robots
Abstract: General purpose robots should be able to perform arbitrary manipulation tasks, and get better at performing new ones as they obtain more experience. The current paradigm in robot learning involves imitation or simulation. Scaling these approaches to learn from more data for various tasks is bottle-necked by human labor required either in collecting demonstrations [...]
Sparse-view 3D in the Wild
Abstract: Reconstructing 3D scenes and objects from images alone has been a long-standing goal in computer vision. We have seen tremendous progress in recent years, capable of producing near photo-realistic renderings from any viewpoint. However, existing approaches generally rely on a large number of input images (typically 50-100) to compute camera poses and ensure view [...]
Deep 3D Geometric Reasoning for Robot Manipulation
Abstract: To solve general manipulation tasks in real-world environments, robots must be able to perceive and condition their manipulation policies on the 3D world. These agents will need to understand various common-sense spatial/geometric concepts about manipulation tasks: that local geometry can suggest potential manipulation strategies, that policies should be invariant across choice of reference frame, [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Toward an ImageNet Moment for Synthetic Data
Abstract: Data, especially large-scale labeled data, has been a critical driver of progress in computer vision. However, many important tasks remain starved of high-quality data. Synthetic data from computer graphics is a promising solution to this challenge, but still remains in limited use. This talk will present our work on Infinigen, a procedural synthetic data [...]
Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
Abstract: We show that imitating shortest-path planners in simulation produces Stretch RE-1 robotic agents that, given language instructions, can proficiently navigate, explore, and manipulate objects in both simulation and in the real world using only RGB sensors (no depth maps or GPS coordinates). This surprising result is enabled by our end-to-end, transformer-based, SPOC architecture, powerful [...]
Probabilistic 3D Multi-Object Cooperative Tracking for Autonomous Driving via Differentiable Multi-Sensor Kalman Filter
This talk has been postponed […]
Towards diverse zero-shot manipulation via actualizing visual plans
Abstract: In this thesis, we seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation — interacting with unseen objects in novel scenes without test-time adaptation. Robots that can be reliably deployed out-of-the-box in new scenarios have the potential for helping humans in everyday tasks. Not requiring any test-time training through demonstrations or [...]
Deep Learning for Sensors: Development to Deployment
Abstract: Robots rely heavily on sensing to reason about physical interactions, and recent advancements in rapid prototyping, MEMS sensing, and machine learning have led to a plethora of sensing alternatives. However, few of these sensors have gained widespread use among roboticists. This thesis proposes a framework for incorporating sensors into a robot learning paradigm, from [...]
Offline Learning for Stochastic Multi-Agent Planning in Autonomous Driving
Abstract: Fully autonomous vehicles have the potential to greatly reduce vehicular accidents and revolutionize how people travel and how we transport goods. Many of the major challenges for autonomous driving systems emerge from the numerous traffic situations that require complex interactions with other agents. For the foreseeable future, autonomous vehicles will have to share the [...]
Teruko Yata Memorial Lecture
Human-Centric Robots and How Learning Enables Generality
Abstract: Humans have dreamt of robot helpers forever. What's new is that this dream is becoming real. New developments in AI, building on foundations of hardware and passive dynamics, enable vastly improved generality. Robots can step out of highly structured environments and become more human-centric: operating in human [...]
2024 Robotics Institute National Robotics Week Celebration Tours and Demos
April 12, 1:00 - 4:00 pm: PUBLIC SPACE ROBOTS (open to the public)
TANK the roboceptionist, Newell-Simon Hall, 3rd floor entry area
Meet Marion (Tank) LeFleur, Newell-Simon’s Roboceptionist. He’ll be glad to see you! The goal of the project is to produce a robot helpmate that is useful, exhibits social competence, and remains compelling to [...]
Creating robust deep learning models involves effectively managing nuisance variables
Abstract: Over the past decade, we have witnessed significant advances in capabilities of deep neural network models in vision and machine learning. However, issues related to bias, discrimination, and fairness in general, have received a great deal of negative attention (e.g., mistakes in surveillance and animal-human confusion of vision models). But bias in AI models [...]
Transfer Learning via Temporal Contrastive Learning
Abstract: This thesis introduces a novel transfer learning framework for deep reinforcement learning. The approach automatically combines goal-conditioned policies with temporal contrastive learning to discover meaningful sub-goals. The approach involves pre-training a goal-conditioned agent, finetuning it on the target domain, and using contrastive learning to construct a planning graph that guides the agent via sub-goals. Experiments [...]
Towards Influence-Aware Safe Human-Robot Interaction
Abstract: In recent years, we have seen through recommender systems on social media how influential (and potentially harmful) algorithms can be in our lives, sometimes creating polarization and conspiracies that lead to unsafe behavior. Now that robots are also growing more common in the real world, we must be very careful to ensure that they [...]
Learning to Manipulate beyond Imitation
Abstract: Imitation learning has been a prevalent approach for teaching robots manipulation skills but still suffers from limited scalability and generalizability. In this talk, I'll argue for going beyond elementary behavioral imitation from human demonstrations. Instead, I'll present two key directions: 1) Creating Manipulation Controllers from Pre-Trained Representations, and 2) Representing Video Demonstrations with Parameterized Symbolic [...]
Advanced robotics for manufacturing: challenges and opportunities
Abstract: Presenting projects with the ARM Institute (including robot grinding, human-robot collaboration, and modularized manufacturing) and discussing some new opportunities for applying AI and robotics in the manufacturing domain.
Improving Robot Capabilities Through Reconfigurability
Abstract: Advancements in robot capabilities are often achieved through integrating more hardware components. These hardware additions often lead to systems with high power consumption, fragility, and difficulties in control and maintenance. However, is this approach the only path to enhancing robot functionality? In this talk, I introduce the PuzzleBots, a modular multi-robot system with passive [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Reduced-Gravity Flights and Field Testing for Lunar and Planetary Rovers
Abstract: As humanity returns to the Moon and develops outposts and related infrastructure, we need to understand how robots and work machines will behave in this harsh environment. It is challenging to find representative testing environments on Earth for lunar and planetary rovers. To investigate the effects of reduced gravity on interactions with granular terrains, [...]
Design Principles for Robotics Systems that Support Human-Human Collaborative Learning
Abstract: Robots possess unique affordances granted by combining software and hardware. Most existing research focuses on the impact of these affordances on human-robot collaboration, but the theory of how robots can facilitate human-human collaboration is underdeveloped. Such theory would be beneficial in education. An educational device can afford collaboration in both assembly and use. This [...]
Leveraging Parallelism to Accelerate Quadratic Program Solvers for MPC
Abstract: Many problems in robotics can be formulated as quadratic programs (QPs). In particular, model-predictive control problems often involve repeatedly solving QPs at very high rates (up to kilohertz). However, while other areas of robotics like machine learning have achieved high performance by taking advantage of parallelism on modern computing hardware, state-of-the-art algorithms for solving [...]
Shedding Light on 3D Cameras
Abstract: The advent (and commoditization) of low-cost 3D cameras is revolutionizing many application domains, including robotics, autonomous navigation, human computer interfaces, and recently even consumer devices such as cell-phones. Most modern 3D cameras (e.g., LiDAR) are active; they consist of a light source that emits coded light into the scene, i.e., its intensity is modulated over [...]
Robust Incremental Distributed Collaborative Simultaneous Localization and Mapping
Abstract: Multi-robot teams show exceptional promise across applications like Search-and-Rescue, disaster-response, agriculture, forestry, and scientific exploration due to their ability to go where humans cannot, parallelize activity, operate robustly to failures, and expand capabilities beyond that of an individual robot. Collaborative Simultaneous Localization and Mapping (C-SLAM) is a fundamental capability for these multi-robot teams as [...]
Towards Equitable Representation in Text-to-Image Generation
Abstract: Accurate representation in media is known to improve the well-being of the people who consume it. There is a growing concern about the increasing use of generative AI in media as the generative image models trained on large web-crawled datasets such as LAION are known to produce images with harmful stereotypes and misrepresentations of various groups, [...]