[MSR Thesis Talk] Development and Testing of a Software Stack for an Autonomous Racing Vehicle
Abstract: Autonomous racing aims to replicate the human racecar driver with software and sensors. As in traditional motorsports, Autonomous Racing Vehicles (ARVs) are pushed to their dynamic limits in multi-agent scenarios at high (≥ 100 mph) speeds. This Operational Design Domain (ODD) presents unique challenges across the autonomy stack. The Indy Autonomous Challenge (IAC) is an [...]
[MSR Thesis Talk] Kitchen Robot Case Studies: Learning Manipulation Tasks from Human Video Demonstrations
Abstract: The vision of integrating a robot into the kitchen, capable of acting as a chef, remains a sought-after goal in robotics. Current robotic systems, mostly programmed for specific tasks, fall short in versatility and adaptability to a diverse culinary environment. While significant progress has been made in robotic learning, with advancements in behavior cloning, [...]
Towards Agile Robotics: Creating Push-Off Skills for Dynamic Interactions
Abstract: Dynamic interactions play a fundamental role in human capabilities, enabling us to achieve a wide range of tasks such as moving heavy objects, manipulating our surroundings, and changing directions rapidly and safely. In contrast, most conventional robotic systems lack this level of agility and cannot perform dynamic interactions, limiting their potential in practical applications. [...]
Learning Safe Human-Robot Interactions for a Seamlessly Shared Airspace
Abstract: The growing need for fully autonomous aerial operations in shared spaces necessitates the development of reliable agents capable of navigating safely and seamlessly alongside uncertain human agents. In response, we advocate endowing autonomous agents with the ability to predict human actions, comprehend and ground abstract rules in the action space, and embrace the uncertainty [...]
Generative Evolutionary Search with Diffusion Models for Trajectory Optimization
Abstract: Diffusion models excel at modeling complex and multimodal trajectory distributions for decision-making and control. Reward-gradient guided denoising has been recently proposed to generate trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model. Reward-gradient guided denoising requires a differentiable reward function fitted to both [...]
TartanCalib: Iterative Wide-Angle Lens Calibration
Abstract: Mobile vision systems greatly benefit from the large field-of-view enabled by wide-angle lenses. Accurate and robust intrinsic calibration is a critical prerequisite for leveraging this property. Calibrating wide-angle lenses with current state-of-the-art techniques yields poor results due to extreme distortion at the edge. In this work, we present TartanCalib, an accurate and robust method [...]
Sample-Efficient Reinforcement Learning with Applications in Nuclear Fusion
Abstract: In many practical applications of reinforcement learning (RL), it is expensive to observe state transitions from the environment. In the problem of plasma control for nuclear fusion, the motivating example of this thesis, determining the next state for a given state-action pair requires querying an expensive transition function, which can lead to many hours [...]
[MSR Thesis Talk] Neural Implicit Representations for Medical Ultrasound Volumes and 3D Anatomy-specific Reconstructions
Abstract: Most Robotic Ultrasound Systems (RUSs) equipped with ultrasound-interpreting algorithms rely on building 3D reconstructions of the entire scanned region or specific anatomies. These 3D reconstructions are typically created via methods that compound or stack 2D tomographic ultrasound images using known poses of the ultrasound transducer with the latter requiring 2D or 3D segmentation. While fast, this class [...]
Social Navigation with Pedestrian Groups
Abstract: Autonomous navigation in human crowds (i.e., social navigation) presents several challenges: The robot often needs to rely on its noisy sensors to identify and localize pedestrians in human crowds; the robot needs to plan efficient paths to reach its goals; the robot needs to do so in a safe and socially appropriate manner. Recent [...]
Zero-Shot Video Question Answering with Procedural Programs
Abstract: We propose to answer zero-shot questions about videos by generating short procedural programs that derive a final answer from solving a sequence of visual subtasks. We present Procedural Video Querying (ProViQ), which uses a large language model to generate such programs from an input question and an API of visual modules in the prompt, [...]