Learning generative representations for image distributions
Abstract: Autoencoder neural networks are an unsupervised technique for learning representations and have been used effectively in many data domains. While capable of generating data, autoencoders have been inferior to other models like Generative Adversarial Networks (GANs) in their ability to generate image data. We will describe a general autoencoder architecture that addresses this limitation, and [...]
Carnegie Mellon University
Self-Supervising Occlusions for Vision
Abstract: Virtually every scene has occlusions. Even a scene with a single object exhibits self-occlusions - a camera can only view one side of an object (left or right, front or back), or part of the object is outside the field of view. More complex occlusions occur when one or more objects block part(s) of [...]
Carnegie Mellon University
Development of an Agile and Dexterous Balancing Mobile Manipulator Robot
Abstract: This thesis focuses on designing and controlling a dynamically stable shape-accelerating dual-arm mobile manipulator, the Carnegie Mellon University (CMU) ballbot. The CMU ballbot is a human-sized dynamically stable mobile robot that balances on a single spherical wheel. We describe the development of a pair of seven-degree-of-freedom (DOF) humanoid arms. The new 7-DOF arm pair [...]
Carnegie Mellon University
Massively Parallelized Lazy Planning Algorithms
Abstract: Search-based planning algorithms enable autonomous agents like robots to come up with well-reasoned long horizon plans to achieve a given task objective. They do so by optimizing a task-specific cost function while respecting the constraints on either the agent (e.g. motion constraints) or the environment (e.g. obstacles). In robotics, such as in motion planning [...]
Building Intelligent and Visceral Machines: From Sensing to Application
Abstract: Humans have evolved to have highly adaptive behaviors that help us survive and thrive. As AI prompts a move from computing interfaces that are explicit and procedural to those that are implicit and intelligent, we are presented with extraordinary opportunities. In this talk, I will argue that understanding affective and behavioral signals presents many opportunities [...]
GANcraft – an unsupervised 3D neural method for world-to-world translation
Abstract: Advances in 2D image-to-image translation methods, such as SPADE/GauGAN, have enabled users to paint photorealistic images by drawing simple sketches similar to those created in Microsoft Paint. Despite these innovations, creating a realistic 3D scene remains a painstaking task, out of the reach of most people. It requires years of expertise, professional software, a library [...]
Run-Time Optimization in the Deep Learning Age
Abstract: In a recovery task one seeks to obtain an estimate of an unknown signal from a set of incomplete measurements. These problems arise in a number of computer vision applications, from image based tasks such as super-resolution and in-painting to 3D reconstruction tasks such as Non-Rigid Structure from Motion and scene flow estimation. Early [...]
Carnegie Mellon University
System Identification and Control of Multiagent Systems Through Interactions
Abstract: This thesis investigates the problem of identifying dynamics models of individual agents of a multiagent system (MAS) and exploiting these models to shape their behavior using robots extrinsic to the MAS. While task-based control of a MAS using onboard controllers of its agents is well studied, we investigate (a) how easy it is for [...]
Human-in-the-loop Model Creation
Abstract: Modern machine learning systems have made astonishing progress in automating labor-intensive tasks such as visual recognition and machine translation. While ML systems complete these tasks better and faster, humans are largely left behind. Indeed, most humans are entirely excluded from the creation process of machine learning models, except for tedious data annotation. In [...]
Carnegie Mellon University
Learning and Inference in Factor Graphs with Applications to Tactile Perception
Abstract: Factor graphs offer a flexible and powerful framework for solving large-scale, nonlinear inference problems as encountered in robot perception and control. Typically, these methods rely on handcrafted models that are efficient to optimize. However, robots often perceive the world through complex, high-dimensional sensor observations. For instance, consider a robot manipulating an object in hand [...]
Learning Optical Flow: Model, Data, and Applications
Abstract: Optical flow provides important information about the dynamic world and is of fundamental importance to many tasks. In this talk, I will present my work on different aspects of learning optical flow. I will start with the background and talk about PWC-Net, a compact and effective model built using classical principles for optical flow. Next, [...]
Hands-On Interactions
Abstract: Our sense of touch is present in almost all our interactions with the world, from providing us with the feedback necessary to perceive and manipulate objects without having to look at them, to allowing our limbs to move and walk without us having to think about how to take the next step. We use [...]
Distributed Dissipativity: Applying Foundational Stability Theory to Modern Networked Control
Abstract: Despite its diverse areas of application, the desire to optimize performance and guarantee acceptable behaviour in the face of inevitable uncertainty is pervasive throughout control theory. This creates a fundamental challenge since the necessity of robustly stable control schemes often favors conservative designs, while the desire to optimize performance typically demands the opposite. While [...]
Towards Complex Robot Motions with Reinforcement Learning
Abstract: Reinforcement learning has been shown to be a powerful tool for decision-making problems. In this talk, we present the opportunities and challenges of enabling increasingly complex robot behavior with reinforcement learning. First, we present a system that combines reinforcement learning and extrinsic dexterity to solve a novel task of “occluded grasping”. To reach an occluded [...]
Haptic Perspective-taking from Vision and Force
Abstract: Physically collaborative robots present an opportunity to positively impact society across many domains. However, robots currently lack the ability to infer how their actions physically affect people. This is especially true for robotic caregiving tasks that involve manipulating deformable cloth around the human body, such as dressing and bathing assistance. In this talk, I [...]
Do Vision-Language Pretrained Models Learn Spatiotemporal Primitive Concepts?
Abstract: Vision-language models pretrained on web-scale data have revolutionized deep learning in the last few years. They have demonstrated strong transfer learning performance on a wide range of tasks, even under the "zero-shot" setup, where text "prompts" serve as a natural interface for humans to specify a task, as opposed to collecting labeled data. These models are [...]
RI Council Meeting
The RI Council is a leadership group made up of the Director of RI, Academic Program Leads, Committee Chairs, and members at large as appointed by the Director. The Council generally meets once a week to discuss department business.
Perception-Action Synergy in Uncertain Environments
Abstract: Many robotic applications require a robot to operate in an environment with unknowns or uncertainty, at least initially, before it gathers enough information about the environment. In such a case, a robot must rely on sensing and perception to feel its way around. Moreover, it has to couple sensing/perception and motion synergistically in real [...]
Carnegie Mellon University
Driving Reconfigurable Unmanned Vehicle Design for Mobility Performance
Abstract: Unmanned ground vehicles are being deployed in increasingly diverse and complex environments. Advances in the field of robotics, including perception technology, computing power, and machine learning, have brought robots from the lab to the real world. Remote and autonomous vehicles are now used to explore volcanoes, caves, pipes, war zones, disaster sites, and even [...]
Max-Affine Spline Insights into Deep Learning
Abstract: We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs) that provide a powerful portal through which we view and analyze their inner workings. For instance, [...]
Search-based Path Planning for a High Dimensional Manipulator in Cluttered Environments Using Optimization-based Primitives
Abstract: In this work we tackle the path planning problem for a 21-dimensional snake-like robot navigating a cluttered gas turbine for the purposes of inspection. Heuristic search-based approaches are effective planning strategies for common manipulation domains. However, their performance on high-dimensional systems is heavily reliant on the effectiveness of the action space and the heuristics [...]
Vision-Based Tactile Sensor Design using Physics Based Rendering
Abstract: Tactile sensing has seen rapid adoption with the advent of vision-based tactile sensors. Vision-based tactile sensors provide high resolution, compact and inexpensive data to perform precise in-hand manipulation and human-robot interaction. However, the simulation of tactile sensors is still a challenge. Simulation is a critical tool in the development of robotic systems. In [...]
Teruko Yata Memorial Lecture
Leveraging Language and Video Demonstrations for Learning Robot Manipulation Skills and Enabling Closed-Loop Task Planning
Abstract: Humans have gradually developed language, mastered complex motor skills, and created and utilized sophisticated tools. The act of conceptualization is fundamental to these abilities because it allows humans to mentally represent, summarize and abstract diverse knowledge and skills. By means of [...]
Details to Follow . . .
Robotics Institute Staff Offices 12PM Early Dismissal
Dear RI Faculty and Staff, In observance of the coming holiday, institute staff offices will close Friday, April 15th at noon. They will reopen at 8:30 on Monday, April 18th. Happy Holiday! Thank you – Debbie Z. Deborah H. Zalewski, Senior Associate Business Manager | The Robotics Institute - Carnegie Mellon University | Newell-Simon [...]
RI Hiring Meeting
A faculty hiring meeting to discuss candidates for a faculty position.
Carnegie Mellon University
Unified Simulation, Perception, and Generation of Human Behavior
Abstract: Understanding and modeling human behavior is fundamental to almost any computer vision or robotics application that involves humans. In this thesis, we take a holistic approach to human behavior modeling and tackle its three essential aspects: simulation, perception, and generation. Throughout the thesis, we show how the three aspects are deeply connected and [...]
Kernel Density Decision Trees
Abstract: We propose kernel density decision trees (KDDTs), a novel fuzzy decision tree (FDT) formalism based on kernel density estimation that improves the robustness of decision trees and ensembles and offers additional utility. FDTs mitigate the sensitivity of decision trees to uncertainty by representing uncertainty through fuzzy partitions. However, compared to conventional, crisp decision trees, [...]
Energy-based Joint Pose Estimation for 3D Reconstruction
Abstract: In this talk, I will describe a data-driven method for inferring camera poses given a sparse collection of images of an arbitrary object. This task is a core component of classic geometric pipelines such as structure-from-motion (SFM), and also serves as a vital pre-processing requirement for contemporary neural approaches (e.g. NeRF) to object reconstruction. [...]
NeRF for Robotics
Abstract: In this talk I'll describe how recent advances in neural rendering and novel view synthesis - namely NeRF - can be leveraged by robotic agents to improve performance in manipulation tasks. Specifically, I'll argue that NeRF can enable robotic policies to: (1) generalize to new viewpoints; (2) perceive specular and reflective surfaces in a [...]