Student Talks
Zero-Shot Video Question Answering with Procedural Programs
Abstract: We propose to answer zero-shot questions about videos by generating short procedural programs that derive a final answer from solving a sequence of visual subtasks. We present Procedural Video Querying (ProViQ), which uses a large language model to generate such programs from an input question and an API of visual modules in the prompt, [...]
[MSR Thesis Talk] Enhancing RHex Robot Performance with Innovative Bioplastic Legs Responsive to Humidity
Abstract: Designing and developing robots that can effectively navigate real-world environments poses a significant challenge. To overcome this, many robotic systems draw inspiration from the adaptive behaviors of animals, which have evolved to thrive in diverse surroundings. Amphibious animals, for instance, seamlessly transition between walking and swimming, optimizing their locomotion efficiency based on environmental cues. [...]
Informative Path Planning Toward Autonomous Real-World Applications
Abstract: Gathering information from the physical world plays a crucial role in many applications—whether it be scientific research, environmental monitoring, search and rescue, defense, or disaster response. The utilization of robots for information gathering allows for the leveraging of intelligent algorithms to efficiently collect data, providing critical insights and facilitating informed decision-making. These autonomous robots [...]
Alignment for Vision-Language Foundation Model
Abstract: Recent advancements in vision-language foundation models, exemplified by GPT4-Vision and DALL-E 3, have significantly transformed both research and practical applications, ranging from professional assistance to content creation. However, aligning them precisely with specific user goals presents a notable challenge. This thesis introduces innovative strategies for improving this alignment. I will first introduce our novel [...]
Efficient Sensor Coverage in Complex Environments
Abstract: This thesis develops sensor coverage algorithms for mobile robots that are scalable to large and complex environments. The core challenge is computing the shortest paths that can direct one or more robots to sweep onboard sensors over all accessible surfaces within an environment. This problem resembles the watchman route problem that is known to [...]
Improving Kalman Filter-based Multi-Object Tracking in Occlusion and Non-linear Motion
Abstract: Modern methods solve multi-object tracking from two perspectives: motion modeling and appearance matching. As a classic paradigm, motion-based tracking by Kalman filters suffers from complicated motion patterns and the problem becomes more difficult when we only have noisy bounding boxes. To improve Kalman filter-based multi-object tracking in scenarios with complex motion, occlusion, and crossover, [...]
Improving Kalman Filter-based Multi-Object Tracking in Occlusion and Non-linear Motion
Abstract: Modern methods solve multi-object tracking from two perspectives: motion modeling and appearance matching. As a classic paradigm, motion-based tracking by Kalman filters suffers from complicated motion patterns and the problem becomes more difficult when we only have noisy bounding boxes. To improve Kalman filter-based multi-object tracking in scenarios with complex motion, occlusion, and crossover, [...]
Design Iteration of Dexterous Compliant Robotic Manipulators
Abstract: The goal of personal robotics is to have robots in homes performing everyday tasks efficiently to improve our quality of life. Towards this end, manipulators are needed which are low cost, safe around humans, and approach human-level dexterity. However, existing off-the-shelf manipulators are expensive both in cost and manufacturing time, difficult to repair, and [...]
Continual Learning of Compositional Skills for Robust Robot Manipulation
Abstract: Real world robots need to continuously learn new manipulation tasks in a lifelong learning manner. These new tasks often share many sub-structures e.g. sub-tasks, controllers, preconditions, with previously learned tasks. To utilize these shared sub-structures, we explore a compositional and object-centric approach to learn manipulation tasks. The first part of this thesis focuses on [...]
Watch, Practice, Improve: Towards In-the-wild Manipulation
Abstract: The longstanding dream of many roboticists is to see robots perform diverse tasks in diverse environments. To build such a robot that can operate anywhere, many methods train on robotic interaction data. While these approaches have led to significant advances, they rely on heavily engineered setups or high amounts of supervision, neither of which [...]