PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Multi-Human 3D Reconstruction from Monocular Videos

NSH 4305

Abstract: We study the problem of multi-human 3D reconstruction from videos captured in the wild. Human movements are dynamic, and accurately reconstructing them in various settings is crucial for developing immersive social telepresence, assistive humanoid robots, and augmented reality systems. However, creating such a system requires addressing fundamental issues with previous works regarding the data [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Language-Conditioned Object Detection and Manipulation

NSH 4305

Abstract: Traditional object detection methods are often confined to predefined object vocabularies, limiting their versatility in real-world scenarios where robots need to understand and execute diverse household tasks. Additionally, the 2D and 3D perception communities have typically pursued separate approaches tailored to their respective domains. In this thesis, we present a language-conditioned object detector with [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

How I Learned to Love Blobs: The Power of Gaussian Representations in Differentiable Rendering and Optimization

NSH 3305

Abstract: In this thesis, we explore the use of Gaussian Representations in multiple application areas of computer vision and robotics. In particular, we design a ray-based differentiable renderer for 3D Gaussians that can be used to solve multiple classic computer vision problems in a unified manner. For example, we can reconstruct 3D shapes from color, [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Watch, Practice, Improve: Towards In-the-wild Manipulation

NSH 3305

Abstract: The longstanding dream of many roboticists is to see robots perform diverse tasks in diverse environments. To build such a robot that can operate anywhere, many methods train on robotic interaction data. While these approaches have led to significant advances, they rely on heavily engineered setups or high amounts of supervision, neither of which [...]

VASC Seminar
Aayush Bansal
Startup

Generating Beautiful Pixels

Newell-Simon Hall 3305

Abstract: In this talk, I will present three experiments that use low-level image statistics to generate high-resolution detailed outputs. In the first experiment, I will use 2D pixels to efficiently mine hard examples for better learning. Simply biasing ray sampling towards hard ray examples enables learning of neural fields with more accurate high-frequency detail in less [...]

VASC Seminar
Viraj Prabhu
CS PhD Student
Georgia Institute of Technology

Towards Reliable Computer Vision Systems

Newell-Simon Hall 3305

Abstract: The real world has infinite visual variation – across viewpoints, time, space, and curation. As deep visual models become ubiquitous in high-stakes applications, their ability to generalize across such variation becomes increasingly important. In this talk, I will present opportunities to improve such generalization at different stages of the ML lifecycle: first, I will [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards Photorealistic Dynamic Capture and Animation of Human Hair and Head

NSH 4305

Abstract: Realistic human avatars play a key role in immersive virtual telepresence. To reach a high level of realism, a human avatar needs to faithfully reflect human appearance. A human avatar should also be drivable and express natural motions. Existing works have made significant progress in building drivable realistic face avatars, but they rarely include [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Modeling Dynamic Clothing for Data-Driven Photorealistic Avatars

NSH 3305

Abstract: In this thesis, we aim to build photorealistic animatable avatars of humans wearing complex clothing in a data-driven manner. Such avatars will be a critical technology to enable future applications such as immersive telepresence in Virtual Reality (VR) and Augmented Reality (AR). Existing full-body avatars that jointly model geometry and view-dependent texture using Variational [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Manipulation Among Movable Objects for Pick-and-Place Tasks in Cluttered 3D Workspaces

NSH 1305

Abstract: In cluttered real-world workspaces, simple pick-and-place tasks for robot manipulators can be quite challenging to solve. Often there is no collision-free trajectory that allows the robot to grasp and extract a desired object from the scene. This requires motion planning algorithms to reason about rearranging some of the “movable” clutter in the scene so [...]

RI Seminar
Paul Debevec
Chief Research Officer
Eyeline Studios

Transforming Hollywood Visual Effects with Graphics and Vision

3305 Newell-Simon Hall

Abstract: Paul will describe his path to developing visual effects technology used in hundreds of movies, including The Matrix, Spider-Man 2, Benjamin Button, Avatar, Maleficent, Furious 7, and Blade Runner: 2049. These techniques include image-based modeling and rendering, high dynamic range imaging, image-based lighting, and high-resolution facial scanning for photoreal digital actors. Paul will also [...]

VASC Seminar
Bharath Hariharan
Assistant Professor
Cornell University

Vision without labels

3305 Newell-Simon Hall

Abstract: Deep learning has revolutionized all aspects of computer vision, but its successes have come from supervised learning at scale: large models trained on ever larger labeled datasets. However, this reliance on labels makes these systems fragile when it comes to new scenarios or new tasks where labels are unavailable. This is in stark contrast to [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.

RI Seminar
Shuran Song
Assistant Professor
Robotics and Embodied AI Lab, Stanford University

Learning Meets Gravity: Robots that Learn to Embrace Dynamics from Data

1305 Newell Simon Hall

Abstract: Despite the incredible capabilities (speed and repeatability) of our hardware today, many robot manipulators are deliberately programmed to avoid dynamics – moving slow enough so they can adhere to quasi-static assumptions of the world. In contrast, people frequently (and subconsciously) make use of dynamic phenomena to manipulate everyday objects – from unfurling blankets, to [...]

VASC Seminar
Yong Jae Lee
Associate Professor
Department of Computer Sciences, University of Wisconsin-Madison

Large Multimodal (Vision-Language) Models for Image Generation and Understanding

Newell-Simon Hall 3305

Abstract: Large Language Models and Large Vision Models, also known as Foundation Models, have led to unprecedented advances in language understanding, visual understanding, and AI. In particular, many computer vision problems including image classification, object detection, and image generation have benefited from the capabilities of such models trained on internet-scale text and visual data. In [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.

RI Seminar
Fei Miao
Associate Professor
Department of Computer Science & Engineering, University of Connecticut

Learning and Control for Safety, Efficiency, and Resiliency of Embodied AI

1305 Newell Simon Hall

Abstract: The rapid evolution of ubiquitous sensing, communication, and computation technologies has revolutionized cyber-physical systems (CPS) across various domains like robotics, smart grids, aerospace, and smart cities. Integrating learning into dynamic systems control presents significant Embodied AI opportunities. However, current decision-making frameworks lack comprehensive understanding of the tridirectional relationship among communication, learning and control, [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Generalizable Dexterity with Reinforcement Learning

GHC 4405

Abstract: Dexterity, the ability to perform complex interactions with the physical world, is at the core of robotics. However, existing research in robot manipulation has been focused on tasks that involve limited dexterity, such as pick-and-place. The motor skills of the robots are often quasi-static, have a predefined or limited sequence of contact events, and [...]

VASC Seminar
Mohamed Elhoseiny
Assistant Professor
Computer Science, KAUST

Imaginative Vision Language Models: Towards human-level imaginative AI skills transforming species discovery, content creation, self-driving cars, and emotional health

3305 Newell-Simon Hall

Abstract: Most existing AI learning methods can be categorized into supervised, semi-supervised, and unsupervised methods. These approaches rely on defining empirical risks or losses on the provided labeled and/or unlabeled data. Beyond extracting learning signals from labeled/unlabeled training data, we will reflect in this talk on a class of methods that can learn beyond the vocabulary [...]

VASC Seminar
Kenneth Marino
Research Scientist
Google DeepMind

World Knowledge in the Time of Large Models

Newell-Simon Hall 3305

Abstract: This talk will discuss the massive shift that has come about in the vision and ML community as a result of large pre-trained language models and language-and-vision models such as Flamingo, GPT-4, and other models. We begin by looking at the work on knowledge-based systems in CV and robotics before the large model [...]

RI Seminar
Marc Deisenroth
DeepMind Chair of Machine Learning and Artificial Intelligence
University College London

Data-Efficient Learning for Robotics and Reinforcement Learning

1305 Newell Simon Hall

Abstract: Data efficiency, i.e., learning from small datasets, is of practical importance in many real-world applications and decision-making systems. Data efficiency can be achieved in multiple ways, such as probabilistic modeling, where models and predictions are equipped with meaningful uncertainty estimates, transfer learning, or the incorporation of valuable prior knowledge. In this talk, I will [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.

VASC Seminar
Shunsuke Saito
Research Scientist
Meta Reality Labs Research

Digital Human Modeling with Light

Newell-Simon Hall 3305

Abstract: Leveraging light in various ways, we can observe and model physical phenomena or states which may not be possible to observe otherwise. In this talk, I will introduce our recent exploration on digital human modeling with different types of light. First, I will present our recent work on the modeling of relightable human heads, [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Preference Based Optimization of Multi-Objective Robot Performance

NSH 4305

Abstract: Robotic systems often require that tradeoffs be made--for example, between performance and robustness, power and longevity, or efficiency and safety. While roboticists can design cost functions with hand-picked weights for different metrics, it is not always a straightforward task, particularly when some aspects of performance are not easily quantified. This can occur especially when [...]

VASC Seminar
Jonathon Luiten
Postdoctoral Fellow
RWTH Aachen and Carnegie Mellon University

Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis

Newell-Simon Hall 3305

Abstract: We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians which are optimized to reconstruct input images via differentiable rendering. To model [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Ensuring safety for uncertain high-dimensional robotic systems

GHC 8102

Abstract: Two major obstacles for safe control and planning are (1) scaling to high-dimensional systems and (2) handling uncertain systems. This is problematic because such systems are ubiquitous in practice: e.g. drones with unknown drag, manipulators carrying unknown packages. In this proposal, we aim to address both challenges. At the control level, we have synthesized [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Trustworthy Learning using Uncertain Interpretation of Data

GHC 8102

Abstract: Non-parametric models are popular in real-world applications of machine learning. However, many modern ML methods that ensure that models are pragmatic, safe, robust, fair, and otherwise trustworthy in increasingly critical applications assume parametric, differentiable models. We show that, by interpreting data as locally uncertain, we can achieve many of these without being limited to [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Allocation, Planning, and Control in Off-road Automated Convoy Operations

GHC 4405

Abstract: The lack of structure in off-road terrains makes off-road operations of automated platforms difficult. The difficulty arises from uncertainty in the optimality and safety of the actions (e.g., planning and control) taken by the automated platform. When multiple automated platforms are required to act in a coordinated manner (e.g., a convoy) in complex cluttered [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Robot Learning for Assistive Dressing

NSH 4305

Abstract: Robot-assisted dressing could benefit the lives of many people such as older adults and individuals with disabilities. In this talk, I will present two pieces of work that use robot learning for this assistive task. In the first half of the talk, I will present our work on developing a robot-assisted dressing system that [...]

Faculty Events
Senior Systems Scientist
Robotics Institute,
Carnegie Mellon University

RI Faculty Meeting: Multi-Robot Field Autonomy: A 5 Year Perspective

Newell-Simon Hall 4305

LIVE DEMO! Come see, hear and witness progress made in developing a heterogeneous (wheeled, legged, etc.) team of field deployable mobile robots.  Details will be shared on the history of development of multi-robot autonomy at CMU throughout the previous DARPA Subterranean Challenge, DARPA RACER program, and current ARL projects.  There will be an ongoing live and interactive [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.

RI Seminar
Dr. Robert Ambrose
J. Mike Walker '66 Chair Professor
Mechanical Engineering, Texas A&M University

Robots at the Johnson Space Center and Future Plans

1305 Newell Simon Hall

Abstract: The seminar will review a series of robotic systems built at the Johnson Space Center over the last 20 years. These will include wearable robots (exoskeletons, powered gloves and jetpacks), manipulation systems (ISS cranes down to human scale) and lunar mobility systems (human surface mobility and robotic rovers). As all robotics presentations should, this [...]

VASC Seminar
Arun Ross
Professor
Michigan State University

Biometrics in a Deep Learning World

Newell-Simon Hall 3305

Abstract: Biometrics is the science of recognizing individuals based on their physical and behavioral attributes such as fingerprints, face, iris, voice and gait. The past decade has witnessed tremendous progress in this field, including the deployment of biometric solutions in diverse applications such as border security, national ID cards, amusement parks, access control, and smartphones. [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards Robotic Tree Manipulation: Leveraging Graph Representations

GHC 4405

Abstract: There is growing interest in automating agricultural tasks that require intricate and precise interaction with specialty crops, such as trees and vines. However, developing robotic solutions for crop manipulation remains a difficult challenge due to complexities involved in modeling their deformable behavior. In this study, we present a framework for learning the deformation behavior [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Tracking Any“Thing” in Videos

NSH 3001

Abstract: Being able to track anything is one of the fundamental steps to parse and understand a video. In this talk, I will present two pieces of work that tackle this problem at different spatial granularities. In the first half of the talk, I will discuss tracking any video pixel or particle through time in [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Exploring Diverse Interaction Types for Human in the Loop Robot Learning

NSH 4305

Abstract: Teaching sessions between humans and robots will need to be maximally informative for optimal robot learning and to ease the human’s teaching burden. However, the bulk of prior work considers one or two modalities through which a human can convey information to a robot—namely, kinesthetic demonstrations and preference queries. Moreover, people will teach robots [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning Generalizable Robot Skills for Dynamic and Interactive Tasks

GHC 4405

Abstract: Enabling robots to perform complex dynamic tasks such as picking up an object in one sweeping motion or pushing off a wall to quickly turn a corner is a challenging problem. The dynamic interactions implicit in these tasks are critical for successful task execution. Furthermore, given the interactive nature of such tasks, safety, in [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Customizing Large-scale Text-to-Image Models

NSH 4305

Abstract: Advancements in large-scale generative models represent a watershed moment. These models can generate a wide variety of objects and scenes with different styles and compositions. However, these models are trained on a fixed snapshot of available data and often contain copyrighted or private images. This assumption makes them lacking in two aspects – (a) [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Building Robot Hands and Teaching Dexterity

NSH 4305

Abstract: Our shared dream is to have humanoid robots with hands complete tasks similar to those that humans do. While there are a few robot hands available today, the popular opinion is that they are difficult to use, expensive, and hard to obtain, which precludes their ubiquitous usage. We argue that this is not an inherent problem [...]

VASC Seminar
Andrea Tagliasacchi
Associate Professor
Simon Fraser University

Neural World Models

Newell-Simon Hall 4305

Abstract: Computer vision researchers have pushed the limits of performance in perception tasks involving natural images to near saturation. With self-supervised inference driven by recent advancements in generative modeling, it can be debated that the era of large image models is coming to a close, ushering in an era focused on video. However, it's worth [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

How to Design Robotic Hands That Wield Tools

NSH 1305

Abstract: Tool manipulation is an essential human skill. It extends our manipulation capability beyond the capability of the biological hand, and is a defining feature of many important jobs centered on physical interaction with the real world. Yet, wielding a tool is drastically different from generally grasping an object. The prime examples are pens and [...]

RI Seminar
Chien-Ming Huang
John C. Malone Assistant Professor
Department of Computer Science, Johns Hopkins University

Becoming Teammates: Designing Assistive, Collaborative Machines

1305 Newell Simon Hall

Abstract: The growing power in computing and AI promises a near-term future of human-machine teamwork. In this talk, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and [...]

Special Events

Robotics Institute Winter Party

Newell-Simon Hall Perlis Atrium

Please join us for some fun, food, beverages and conversation! All RI faculty, staff, students and visitors are invited to the Robotics Institute Winter Party! We apologize but due to space limitations in the Atrium we regretfully cannot include family or other non-RI guests.

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning Local Heuristics in Heuristic Search

NSH 3305

Abstract: Motion planning is a fundamental problem in robotics; how can we move robots efficiently and safely? Motion planning can be solved using several paradigms with their own strengths and weaknesses. This talk dives into Heuristic Graph Search and its application to motion planning by converting it to a problem of finding a start-goal path [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Low-Cost Multimodal Sensing and Dexterity for Deformable Object Manipulation

GHC 6115

Abstract: To integrate robots seamlessly into daily life, they must be able to handle a variety of tasks in diverse environments, like assisting in hospitals or cooking in kitchens. Many of the items in these environments are deformable such as bedding in hospitals or vegetables in kitchens, and a certain level of dexterity is necessary [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Joint 2D and 3D Semi-Supervised Object Detection

NSH 4305

Abstract: While numerous 3D detection works leverage the complementary relationship between RGB images and point clouds, developments in the broader framework of semi-supervised object recognition remain uninfluenced by multi-modal fusion. Current methods develop independent pipelines for 2D and 3D semi-supervised learning despite the availability of paired image and point cloud frames. Observing that the distinct [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

New Methods for Satellite Control

NSH 1109

Abstract: Since 2003, the number of satellites launched into orbit has grown from 100 per year to over 2000 per year. Over that same timeframe, incredible advances have been made in control systems for terrestrial robotics and autonomy. Despite the increased quantity of satellites in orbit and the advances made in terrestrial control systems, satellite [...]

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

[MSR Thesis Talk] Development and Testing of a Software Stack for an Autonomous Racing Vehicle

3305 Newell-Simon Hall

Abstract: Autonomous racing aims to replicate the human racecar driver with software and sensors. As in traditional motorsports, Autonomous Racing Vehicles (ARVs) are pushed to their dynamic limits in multi-agent scenarios at high (>= 100mph) speeds. This Operational Design Domain (ODD) presents unique challenges across the autonomy stack. The Indy Autonomous Challenge (IAC) is an [...]

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

[MSR Thesis Talk] Kitchen Robot Case Studies: Learning Manipulation Tasks from Human Video Demonstrations

GHC 8102

Abstract: The vision of integrating a robot into the kitchen, capable of acting as a chef, remains a sought-after goal in robotics. Current robotic systems, mostly programmed for specific tasks, fall short in versatility and adaptability to a diverse culinary environment. While significant progress has been made in robotic learning, with advancements in behavior cloning, [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards Agile Robotics: Creating Push-Off Skills for Dynamic Interactions

GHC 8102

Abstract: Dynamic interactions play a fundamental role in human capabilities, enabling us to achieve a wide range of tasks such as moving heavy objects, manipulating our surroundings, and changing directions rapidly and safely. In contrast, most conventional robotic systems lack this level of agility and cannot perform dynamic interactions, limiting their potential in practical applications. [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning Safe Human-Robot Interactions for a Seamlessly Shared Airspace

NSH 3305

Abstract: The growing need for fully autonomous aerial operations in shared spaces necessitates the development of reliable agents capable of navigating safely and seamlessly alongside uncertain human agents. In response, we advocate endowing autonomous agents with the ability to predict human actions, comprehend and ground abstract rules in the action space, and embrace the uncertainty [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Generative Evolutionary Search with Diffusion Models for Trajectory Optimization

NSH 4305

Abstract: Diffusion models excel at modeling complex and multimodal trajectory distributions for decision-making and control. Reward-gradient guided denoising has been recently proposed to generate trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model. Reward-gradient guided denoising requires a differentiable reward function fitted to both [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

TartanCalib: Iterative Wide-Angle Lens Calibration

GHC 8115

Abstract: Mobile vision systems greatly benefit from the large field-of-view enabled by wide-angle lenses. Accurate and robust intrinsic calibration is a critical prerequisite for leveraging this property. Calibrating wide-angle lenses with current state-of-the-art techniques yields poor results due to extreme distortion at the edge. In this work, we present TartanCalib, an accurate and robust method [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Sample-Efficient Reinforcement Learning with applications in Nuclear Fusion

NSH 4305

Abstract: In many practical applications of reinforcement learning (RL), it is expensive to observe state transitions from the environment. In the problem of plasma control for nuclear fusion, the motivating example of this thesis, determining the next state for a given state-action pair requires querying an expensive transition function which can lead to many hours [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

[MSR Thesis Talk] Neural Implicit Representations for Medical Ultrasound Volumes and 3D Anatomy-specific Reconstructions

GHC 4405

Abstract: Most Robotic Ultrasound Systems (RUSs) equipped with ultrasound-interpreting algorithms rely on building 3D reconstructions of the entire scanned region or specific anatomies. These 3D reconstructions are typically created via methods that compound or stack 2D tomographic ultrasound images using known poses of the ultrasound transducer with the latter requiring 2D or 3D segmentation. While fast, this class [...]

PhD Thesis Defense
Extern
Robotics Institute,
Carnegie Mellon University

Social Navigation with Pedestrian Groups

NSH 4305

Abstract: Autonomous navigation in human crowds (i.e., social navigation) presents several challenges: The robot often needs to rely on its noisy sensors to identify and localize pedestrians in human crowds; the robot needs to plan efficient paths to reach its goals; the robot needs to do so in a safe and socially appropriate manner. Recent [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Zero-Shot Video Question Answering with Procedural Programs

GHC 6121

Abstract: We propose to answer zero-shot questions about videos by generating short procedural programs that derive a final answer from solving a sequence of visual subtasks. We present Procedural Video Querying (ProViQ), which uses a large language model to generate such programs from an input question and an API of visual modules in the prompt, [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

[MSR Thesis Talk] Enhancing RHex Robot Performance with Innovative Bioplastic Legs Responsive to Humidity

GHC 4405

Abstract: Designing and developing robots that can effectively navigate real-world environments poses a significant challenge. To overcome this, many robotic systems draw inspiration from the adaptive behaviors of animals, which have evolved to thrive in diverse surroundings. Amphibious animals, for instance, seamlessly transition between walking and swimming, optimizing their locomotion efficiency based on environmental cues. [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Informative Path Planning Toward Autonomous Real-World Applications

GHC 8102

Abstract: Gathering information from the physical world plays a crucial role in many applications—whether it be scientific research, environmental monitoring, search and rescue, defense, or disaster response. The utilization of robots for information gathering allows for the leveraging of intelligent algorithms to efficiently collect data, providing critical insights and facilitating informed decision-making. These autonomous robots [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Alignment for Vision-Language Foundation Model

NSH 3305

Abstract: Recent advancements in vision-language foundation models, exemplified by GPT4-Vision and DALL-E 3, have significantly transformed both research and practical applications, ranging from professional assistance to content creation. However, aligning them precisely with specific user goals presents a notable challenge. This thesis introduces innovative strategies for improving this alignment. I will first introduce our novel [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Efficient Sensor Coverage in Complex Environments

Abstract: This thesis develops sensor coverage algorithms for mobile robots that are scalable to large and complex environments. The core challenge is computing the shortest paths that can direct one or more robots to sweep onboard sensors over all accessible surfaces within an environment. This problem resembles the watchman route problem that is known to [...]

VASC Seminar
Ce Zheng
Ph.D. candidate at Center for Research in Computer Vision
University of Central Florida

Reconstructing 3D Humans from Visual Data

Newell-Simon Hall 3305

Abstract: Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Improving Kalman Filter-based Multi-Object Tracking in Occlusion and Non-linear Motion

Newell-Simon Hall 4305

Abstract: Modern methods solve multi-object tracking from two perspectives: motion modeling and appearance matching. As a classic paradigm, motion-based tracking by Kalman filters suffers from complicated motion patterns and the problem becomes more difficult when we only have noisy bounding boxes. To improve Kalman filter-based multi-object tracking in scenarios with complex motion, occlusion, and crossover, [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Design Iteration of Dexterous Compliant Robotic Manipulators

GHC 6501

Abstract: The goal of personal robotics is to have robots in homes performing everyday tasks efficiently to improve our quality of life. Towards this end, manipulators are needed which are low cost, safe around humans, and approach human-level dexterity. However, existing off-the-shelf manipulators are expensive both in cost and manufacturing time, difficult to repair, and [...]