Student Talks
Towards a Robot Generalist through In-Context Learning and Abstractions
Abstract: The goal of this thesis is to discover AI processes that enhance cross-domain and cross-task generalization in intelligent robot agents. Unlike the dominant approach in contemporary robot learning, which pursues generalization primarily through scaling laws (increasing data and model size), we focus on identifying the best abstractions and representations in both perception and policy [...]
Vision-based Human Motion Modeling and Analysis
Abstract: Modern computer vision has achieved remarkable success in tasks such as detecting, segmenting, and estimating the pose of humans in images and videos, reaching or even surpassing human-level performance. However, they still face significant challenges in predicting and analyzing future human motion. This thesis explores how vision-based solutions can enhance the fidelity and accuracy [...]
Recent Progress in Graph-Search Methods for Multi-Robot-Arm Motion Planning
Abstract: An exciting frontier in robotic manipulation is the use of multiple arms at once. However, planning concurrent motions is a challenging task using current methods. A major obstacle is the high-dimensional state space of this planning problem, which renders many traditional motion planning algorithms impractical. This opens the door for alternatives to the common [...]
Physical Process-Informed Mapping for Robotic Exploration
Abstract: Mobile robots used for information gathering tasks rely on dense, predictive mapping of large-scale regions to determine where to take measurements. Current approaches to mapping commonly rely on Gaussian process regression to spatially correlate data, extrapolate from sparse samples, and estimate uncertainty. However, these approaches do not incorporate meaningful information about physical processes that [...]
Moving Lights and Cameras for Better 3D Perception of Indoor Scenes
Abstract: Decades of research on computer vision have highlighted the importance of active sensing -- where an agent controls the parameters of the sensors to improve perception. Research on active perception in the context of robotic manipulation has demonstrated many novel and robust sensing strategies involving a multitude of sensors like RGB and RGBD cameras [...]
Learning to create 3D content
Abstract: With the popularity of Virtual Reality (VR), Augmented Reality (AR), and other 3D applications, developing methods that let everyday users capture and create their own 3D content has become increasingly essential. Current 3D creation pipelines often require either tedious manual effort or specialized setups with densely captured views. Additionally, many resulting 3D models are [...]
Trustworthy Learning using Uncertain Interpretation of Data
Abstract: Motivated by the potential of Artificial Intelligence (AI) in high-cost and safety-critical applications, and recently also by the increasing presence of AI in our everyday lives, Trustworthy AI has grown in prominence as a broad area of research encompassing topics such as interpretability, robustness, verifiable safety, fairness, privacy, accountability, and more. This has created [...]
VoxDet: Voxel Learning for Novel Instance Detection
Abstract: Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneer 3D geometry-aware framework that fully utilizes the strong 3D voxel [...]
Voxel Learning for Novel Instance Detection
Abstract: Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneer 3D geometry-aware framework that fully utilizes the strong 3D voxel [...]
Sensorimotor-Aligned Design for Pareto-Efficient Haptic Immersion in Extended Reality
Abstract: A new category of computing devices is emerging: augmented and virtual reality headsets, collectively referred to as extended reality (XR). These devices can alter, augment, or even replace our reality. While these headsets have made impressive strides in audio-visual immersion over the past half-century, XR interactions remain almost completely absent of appropriately expressive tactile [...]