Teruko Yata Memorial Lecture
Title: Leveraging Language and Video Demonstrations for Learning Robot Manipulation Skills and Enabling Closed-Loop Task Planning Abstract: Humans have gradually developed language, mastered complex motor skills, and created and utilized sophisticated tools. The act of conceptualization is fundamental to these abilities because it allows humans to mentally represent, summarize and abstract diverse knowledge and skills. By means of [...]
Details to Follow . . .
Robotics Institute Staff Offices 12PM Early Dismissal
Dear RI Faculty and Staff, In observance of the coming holiday, institute staff offices will close Friday, April 15th at noon. They will reopen at 8:30 on Monday, April 18th. Happy Holiday! Thank you – Debbie Z. Deborah H. Zalewski, Senior Associate Business Manager | The Robotics Institute - Carnegie Mellon University | Newell-Simon [...]
RI Hiring Meeting
A faculty hiring meeting to discuss candidates for a faculty position.
Carnegie Mellon University
Unified Simulation, Perception, and Generation of Human Behavior
Abstract: Understanding and modeling human behavior is fundamental to almost any computer vision or robotics application that involves humans. In this thesis, we take a holistic approach to human behavior modeling and tackle its three essential aspects: simulation, perception, and generation. Throughout the thesis, we show how the three aspects are deeply connected and [...]
Kernel Density Decision Trees
Abstract: We propose kernel density decision trees (KDDTs), a novel fuzzy decision tree (FDT) formalism based on kernel density estimation that improves the robustness of decision trees and ensembles and offers additional utility. FDTs mitigate the sensitivity of decision trees to uncertainty by representing uncertainty through fuzzy partitions. However, compared to conventional, crisp decision trees, [...]
Energy-based Joint Pose Estimation for 3D Reconstruction
Abstract: In this talk, I will describe a data-driven method for inferring camera poses given a sparse collection of images of an arbitrary object. This task is a core component of classic geometric pipelines such as structure-from-motion (SFM), and also serves as a vital pre-processing requirement for contemporary neural approaches (e.g. NeRF) to object reconstruction. [...]
NeRF for Robotics
Abstract: In this talk I'll describe how recent advances in neural rendering and novel view synthesis - namely NeRF - can be leveraged by robotic agents to improve performance in manipulation tasks. Specifically, I'll argue that NeRF can enable robotic policies to: (1) generalize to new viewpoints; (2) perceive specular and reflective surfaces in a [...]
Carnegie Mellon University
Search Algorithms and Search Spaces for Neural Architecture Search
Abstract: Neural architecture search (NAS) was recently proposed to automate the process of designing network architectures. Instead of manually designing network architectures, NAS automatically finds the optimal architecture in a data-driven way. Despite its impressive progress, NAS is still far from being widely adopted as a common paradigm for architecture design in practice. This thesis [...]
Carnegie Mellon University
MSR Thesis Talk – Evan Harber
Title: Stiffness Mapping of Deformable Objects Through Supervised Embedding and Gaussian Process Regression Abstract: The stiffness map of a deformable object stores information about that object's surface compliance. Thus, through a stiffness map, we gain insight into the physical properties of that object. Depending on the object, an understanding of stiffness has applications ranging [...]
Designing Robotic Systems with Collective Embodied Intelligence
Abstract: Natural swarms exhibit sophisticated colony-level behaviors with remarkable scalability and error tolerance. Their evolutionary success stems from more than just intelligent individuals; it hinges on their morphology, their physical interactions, and the way they shape and leverage their environment. Mound-building termites, for instance, are believed to use their own body as a template for [...]
MSR Thesis Talk – Gaurav Parmar
Title: Spatially-Adaptive Multilayer GAN Inversion Abstract: Existing GAN inversion and editing methods are well suited only for target images that contain aligned objects with a clean background, such as portraits and animal faces, but often struggle with more difficult categories with complex scene layouts and object occlusions, such as cars, animals, and outdoor images. [...]
Robust Reinforcement Learning via Genetic Curriculum
Abstract: Achieving robust performance is crucial when applying deep reinforcement learning (RL) in safety-critical systems. Some of the state-of-the-art approaches try to address the problem with adversarial agents, but these agents often require expert supervision to fine-tune and prevent the adversary from becoming too challenging to the trainee agent. While [...]
Mouth Haptics in VR using a Headset Ultrasound Phased Array
Abstract: This talk is the same one I will be presenting at the ACM CHI Conference on Human Factors in Computing Systems on May 2nd. Paper abstract: Today’s consumer virtual reality (VR) systems offer limited haptic feedback via vibration motors in handheld controllers. Rendering haptics to other parts of the body is an open challenge, [...]
Towards Large-scale and Long-term Neural Map Representations
Abstract: We address the problem of large-scale and long-term neural map representations. Maps, as our prior understanding of the environment, provide valuable information for modern robotics applications such as autonomous driving and AR/VR. The size of maps largely affects the end task performance: usually a more detailed map can support better performance, but would cost [...]
Carnegie Mellon University
Self-Improving 3D Scene Representations
Abstract: Most computer vision models in deployment today are not continually learning. Instead, they are in a “test” mode, where they will behave the same way perpetually, until they are replaced by newer models. This is a problem, because it means the models may perform poorly as soon as their “test” environment diverges from their [...]
Carnegie Mellon University
MSR Thesis Talk – Manash Pratim Das
Title: Model-Accuracy Aware Anytime Planning with Simulation Verification for Navigating Complex Terrains Abstract: Off-road and unstructured environments often contain complex patches of various types of terrain, rough elevation changes, deformable objects, etc. An autonomous ground vehicle traversing such environments experiences physical interactions that are extremely hard to model at scale and thus very hard to [...]
Carnegie Mellon University
MSR Thesis Talk – Akshay Dharamavaram
Title: Stabilizing the Training Dynamics of Generative Models using Self-Supervision Abstract: Generative models have been shown to be adept at mimicking the behavior of an unknown distribution solely from bootstrapped data. However, deep learning models have been shown to overfit in either the minimization or maximization stage of the two-player min-max game, resulting [...]
Carnegie Mellon University
Direct-drive Hands: Making Robot Hands Transparent and Reactive to Contacts
Abstract: Industrial manipulators and end-effectors are a vital driver of the automation revolution. These robot hands, designed to reject disturbances with stiffness and strength, are inferior to their human counterparts. Human hands are dexterous and nimble effectors capable of a variety of interactions with the environment. Through this thesis we wish to answer a question: [...]
Carnegie Mellon University
MSR Thesis Talk – Vivek Roy
Title: Smartphone Localization for Indoor Pedestrian Navigation Abstract: Global positioning system (GPS) interfacing with applications such as Google Maps has proven very useful for navigation in outdoor open settings. However, in crowded metropolitan environments with high-rise buildings or in indoor settings, GPS quickly becomes unreliable. Using sensors found on commodity smartphones to perform accurate [...]
Understanding 3D Scenes and Interacting Hands
Abstract: The long-term goal of my research is to help computers understand the physical world from images, including both 3D properties and how humans or robots could interact with things. This talk will summarize two recent directions aimed at enabling this goal. I will begin with learning to reconstruct full 3D scenes, including [...]
Manipulating Objects with Challenging Visual and Geometric Properties
Abstract: Object manipulation is a well-studied domain in robotics, yet manipulation remains difficult for objects with visually and geometrically challenging properties. Visually challenging properties, such as transparency and specularity, break assumptions of Lambertian reflectance that existing methods rely on for grasp estimation. On the other hand, deformable objects such as cloth pose both visual and [...]
TIGRIS: An Informed Sampling-based Algorithm for Informative Path Planning
Abstract: In this talk I will present our sampling-based approach to informative path planning that allows us to tackle the challenges of large and high-dimensional search spaces. This is done by performing informed sampling in the high-dimensional continuous space and incorporating potential information gain along edges in the reward estimation. This method rapidly generates a [...]
Carnegie Mellon University
MSR Thesis Talk – Zhe Huang
Title: Distributed Reinforcement Learning for Autonomous Driving Abstract: Due to the complex and safety-critical nature of autonomous driving, recent works typically test their ideas on simulators designed for the very purpose of advancing self-driving research. Despite the convenience of modeling autonomous driving as a trajectory optimization problem, few of these methods resort to online reinforcement [...]
Carnegie Mellon University
MSR Thesis Talk – Xinjie Yao
Title: Ride Comfort-Aware Visual Navigation via Self-Supervised Learning Abstract: Under shared autonomy, wheelchair users expect vehicles to provide safe and comfortable rides while following users’ high-level navigation plans. To find such a path, vehicles negotiate with different terrains and assess their traversal difficulty. Most prior works model surroundings either through geometric representations or semantic classifications, [...]
MS Thesis Talk – Shun Iwase
Title: Fast 6D Object Pose Refinement via Deep Texture Rendering Abstract: We present RePOSE, a fast iterative refinement method for 6D object pose estimation. Prior methods perform refinement by feeding zoomed-in input and rendered RGB images into a CNN and directly regressing an update of a refined pose. Their runtime is slow due to the [...]
Carnegie Mellon University
Resource-Constrained Learning and Inference for Visual Perception
Abstract: We have witnessed rapid advancement across major computer vision benchmarks over the past years. However, the top solutions' hidden computation cost prevents them from being practically deployable. For example, training large models until convergence may be prohibitively expensive in practice, and autonomous driving or augmented reality may require a reaction time that rivals that [...]
Trajectory Optimization for Thermally-Actuated Soft Planar Robot Limbs
Abstract: Practical use of robotic manipulators made from soft materials requires generating and executing complex motions. We present the first approach for generating trajectories of a thermally-actuated soft robotic manipulator. Based on simplified approximations of the soft arm and its antagonistic shape-memory alloy actuator coils, we justify a dynamics model of a discretized rigid manipulator [...]
Carnegie Mellon University
Physical Interaction and Manipulation of the Environment using Aerial Robots
Abstract: The physical interaction of aerial robots with their environment has countless potential applications and is an emerging area with many open challenges. Fully-actuated multirotors have been introduced to tackle some of these challenges. They provide complete control over position and orientation and eliminate the need for attaching a multi-DoF manipulation arm to the robot. [...]
Time-of-Flight Radiance Fields for Dynamic Scene View Synthesis
Abstract: Neural networks can represent and accurately reconstruct radiance fields for static 3D scenes (e.g., NeRF). Several works extend these to dynamic scenes captured with monocular video, with promising performance. However, the monocular setting is known to be an under-constrained problem, and so methods rely on data-driven priors for reconstructing dynamic content. We replace these [...]
Snakes & Spiders, Robots & Geometry
Abstract: Locomotion and perception are a common thread between robotics and biology. Understanding these phenomena at a mechanical level involves nonlinear dynamics and the coordination of many degrees of freedom. In this talk, I will discuss geometric approaches to organizing this information in two problem domains: undulatory locomotion of snakes and swimmers, and vibration propagation [...]
Combining vision-based tactile, proximity, and global sensing for robotic manipulation
Abstract: I will begin by describing our work on visually servoing a manipulator and localizing objects using a robot-mounted suite of vision and vision-based tactile sensors, our results, algorithms used, and lessons learned. We show that by collocating tactile, and global (e.g. an RGB(D) camera) sensors, our setup can perform better than using each type [...]
Carnegie Mellon University
Visual Representation and Recognition without Human Supervision
Abstract: The advent of deep-learning-based artificial perception models has revolutionized the field of computer vision. These methods take advantage of the ever-growing computational capacity of machines and the abundance of human-annotated data to build supervised learners for a wide range of visual tasks. However, the reliance on human-annotated data is also a bottleneck for [...]
Design, Modeling and Control for a Tilt-rotor VTOL UAV in the Presence of Actuator Failure
Abstract: Giving aircraft both vertical take-off and landing capabilities and the ability to fly long distances opens the door to a wide range of new real-world applications while improving many existing ones. Tiltrotor vertical take-off and landing (VTOL) unmanned aerial vehicles (UAVs) are a better choice than fixed-wing and multirotor aircraft for [...]
Large Scale Dense 3D Reconstruction via Sparse Representations
Abstract: Scene reconstruction systems take in (3D) videos as input, and output 3D models with associated poses for the inputs. With the demand for 3D content generation, the technique has evolved rapidly in recent years. For professionals equipped with depth sensors, efficient dense reconstruction systems have become available to recover scene geometry. For ordinary [...]
Carnegie Mellon University
Learning Multi-Modal Navigation in Unstructured Environments
Abstract: A robot that operates efficiently in a team with humans in an unstructured outdoor environment must translate commands into actions from a modality intuitive to its operator. The robot must be able to perceive the world as humans do so that the actions taken by the robot reflect the nuances of natural language and [...]
Lessons Learned from Creating Low-Cost Dexterous Soft Robot Hands
Abstract: Soft robot hands have shown promising results when it comes to dexterous grasping and manipulation. Compared to their rigid counterparts, soft hands can be manufactured for a fraction of the cost and offer robustness to uncertainty due to their inherent compliance. Unfortunately, the design and fabrication of soft robot hands is still a time-consuming [...]
Modern Trajectory Forecasting Methods Lack Social Awareness
Abstract: We present a thorough evaluation and analysis of state-of-the-art (SOTA) human trajectory forecasting methods with respect to metrics for safe and socially-aware prediction, e.g., collision rate, in addition to traditional displacement metrics, e.g., average displacement error. First, we introduce a system for trajectory classification which is used to evaluate the strengths and weaknesses of [...]
Carnegie Mellon University
Vision-based Aircraft Detection and Tracking for Detect-and-Avoid
Abstract: Detect-and-Avoid (DAA) capabilities are critical for autonomous operations of small unmanned aircraft systems (sUAS). Traditionally, DAA systems for large aircraft have been ground- and radar-based. Due to the size, weight, and power (SWaP) constraints of sUAS, current DAA systems rely mainly on vision-based sensors and ADS-B (Automatic Dependent Surveillance-Broadcast) transponders. However, not all flying [...]
Teaching Agent Reward Functions via Demonstrations for Human Inverse Reinforcement Learning
Abstract: For intelligent agents (e.g. robots) to be seamlessly integrated into human society, humans must be able to understand their decision making. For example, the decision making of autonomous cars must be clear to the engineers certifying their safety, passengers riding them, and nearby drivers negotiating the road simultaneously. As an agent's decision making can [...]
RI Council Meeting
RI Council is a leadership group made up of the Director of RI, Academic Program Leads, Committee Chairs, and members at large as appointed by the Director. RI Council generally meets once a week to discuss department business.
Learning to perform dynamic and interactive tasks using structural and algorithmic priors
Abstract: Everyday human tasks such as picking up an object in one smooth motion, pushing a heavy door using the momentum of our bodies or pushing off a wall to quickly turn a corner involve complex dynamic interactions between the human and the environment, as well as switching dynamics when the robot makes and breaks [...]
The Robotics Institute Semi-formal
All Robotics Institute faculty, students, visitors and staff are invited to attend. One guest per person. RSVP required. Please check your emails for the e-vite and RSVP link. Please contact Debbie Tobin, dmz@cs.cmu.edu, with any questions.
Simple Shape Descriptors for Retinal Surface Estimation using a Laser-Aiming Beam
Abstract: Retinal surgery procedures like epiretinal membrane peeling and retinal vein cannulation require surgeons to manipulate very delicate structures in the eye with little room for error. Many robotic surgery systems have been developed to help surgeons and enforce safeguards during these demanding procedures. One essential piece of information that is required to create and [...]
Affective Robot Behavior Improves Learning in a Sorting Game
Abstract: Nonverbal communication in the field of education can allow teachers to emotionally support their students and improve educational experience and performance. Robot nonverbal movements have been shown to improve both subjective experiences and task performance, and this work investigates whether affective robot behavior can improve human learning. This is tested using an online sorting [...]
Policy Decomposition: Approximate Optimal Control with Suboptimality Estimates
Abstract: Optimal control is a formulation for designing controllers for dynamical systems by posing the design as an optimization problem, whereby the desired long-term behavior of the system is expressed using a cost function. The objective is to compute a policy, i.e. a mapping from the state of the system to its control inputs, that minimizes [...]