Faculty Candidate Talk: Jason Ma
Title: Internet Supervision for Robot Learning
Abstract: The availability of internet-scale data has led to impressive large-scale AI models in various domains, such as vision and language. For learning robot skills, despite recent efforts in crowd-sourcing robot data, robot-specific datasets remain orders of magnitude smaller. Rather than focusing on scaling robot data, my research takes the alternative path of directly [...]
Experience-Based Action Advising for Multi-Agent Teaming
Abstract: We study how to improve coordination efficiency for multi-agent teams with heterogeneously experienced agents. In such a setting, experienced agents can transfer their knowledge to less experienced agents to accelerate their learning, while leveraging the students' initial expertise to inform what knowledge to transfer. Inspired by this idea, this work specifically assumes one teacher [...]
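As background for the teacher-student setup described above, here is a minimal sketch of budget-limited action advising. The uncertainty heuristic, the advice budget, and all names and thresholds below are illustrative assumptions, not the method presented in the talk.

```python
import numpy as np

# Illustrative action-advising sketch: a teacher with a learned Q-table
# overrides the student's action when (a) the advice budget is not yet
# exhausted and (b) the student has little experience in the current state.

def advise_or_act(state, student_q, teacher_q, visit_counts,
                  budget, experience_threshold=5):
    """Return (action, remaining_budget) for one decision step."""
    student_action = int(np.argmax(student_q[state]))
    # "Experience" heuristic: how often the student has visited this state.
    if budget > 0 and visit_counts[state] < experience_threshold:
        advised_action = int(np.argmax(teacher_q[state]))
        return advised_action, budget - 1      # spend one unit of advice
    return student_action, budget              # student acts on its own

# Toy usage: 4 states, 2 actions.
rng = np.random.default_rng(0)
student_q = rng.normal(size=(4, 2))
teacher_q = rng.normal(size=(4, 2))
visits = np.array([0, 10, 2, 7])

budget = 3
for s in range(4):
    a, budget = advise_or_act(s, student_q, teacher_q, visits, budget)
    print(f"state {s}: action {a}, budget left {budget}")
```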
Towards Controllable Sampling and Diverse Score Distillation in Diffusion Models
Abstract: Denoising diffusion models have emerged as a powerful paradigm for generative modeling, which has been widely used for perception, generation, and action. These models can be utilized through sampling or score distillation; however, existing methods lack controllability in sampling and suffer from limited diversity in score distillation. In this thesis, we propose two complementary mechanisms to enhance the [...]
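To make the second usage mode the abstract contrasts more concrete, below is a minimal sketch of a score-distillation-style update, in which a frozen denoiser's noise prediction is turned into a gradient on an optimizable parameter set. The tiny MLP denoiser, the toy noise schedule, and the identity "rendering" are stand-ins for illustration, not the thesis's method.

```python
import torch

# Sketch of a score-distillation update step (illustrative assumptions only).
torch.manual_seed(0)
denoiser = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.SiLU(),
                               torch.nn.Linear(64, 16)).requires_grad_(False)

theta = torch.zeros(16, requires_grad=True)       # parameters being optimized
opt = torch.optim.Adam([theta], lr=1e-2)
alphas_bar = torch.linspace(0.999, 0.01, 1000)    # toy noise schedule

for step in range(100):
    x = theta                                     # "rendering" is identity here
    t = torch.randint(0, 1000, ())                # random diffusion timestep
    a = alphas_bar[t]
    eps = torch.randn_like(x)
    x_t = a.sqrt() * x + (1 - a).sqrt() * eps     # forward-diffuse the sample
    eps_hat = denoiser(x_t)                       # frozen model predicts the noise
    grad = (eps_hat - eps).detach()               # score-distillation gradient
    opt.zero_grad()
    x.backward(gradient=grad)                     # chain rule through the renderer
    opt.step()
```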
RESCUE Rollers: A Platform for Collaborative, Multi-robot Exploration in Search and Rescue
Abstract: The use of robotic platforms for search and rescue remains a significant challenge for many roboticists. While human and animal first responders play critical roles, their effectiveness can be limited by biological constraints. Robotic systems offer the potential to overcome these limitations, especially in environments inaccessible to humans and animals due to size or [...]
Scaling, Automating and Adapting Sim-to-real Policy Learning
Abstract: Building a generalist robot capable of performing diverse tasks in unstructured environments remains a longstanding challenge. A recent trend in robot learning aims to address this by scaling up demonstration datasets for imitation learning. However, most large-scale robotics datasets are collected in the real-world, often via manual teleoperation. This process is labor-intensive, slow, hardware-dependent, [...]
Generative 3D Garment Modeling with Sparse Visual Cues
Abstract: As digital apparel becomes increasingly vital to virtual environments and personalized experiences, there is a growing need for intuitive tools that enable non-experts to create and interact with 3D garments. To broaden accessibility, these tools must function effectively with minimal input - raising the key question: How can we achieve high-quality 3D garment modeling [...]
Towards Pragmatic Time Series Intelligence
Abstract: This thesis aims to democratize time series intelligence by making advanced modeling capabilities accessible to users without specialized machine learning knowledge. We pursue this goal through three complementary contributions that build foundation models, improve our understanding of them, and address challenges emerging in their practical use. We start by introducing MOMENT, the first family [...]
Underwater 3D Visual Perception and Generation
Abstract: With modern robotic technologies, seafloor imagery has become more accessible to researchers and the public. This thesis leverages deep learning and 3D vision techniques to deliver valuable information from seafloor image observations collected by robotic platforms. Despite the widespread use of deep learning and 3D vision algorithms across various fields, underwater imaging presents unique [...]
Autonomous Exploration and Navigation, Full Autonomy System, and Beyond
Abstract: In this talk, I will present work on autonomous exploration and introduce our full autonomy system. The work started several years ago with lidar-based state estimation. Building upon the state estimation module, the autonomy system now contains multiple fundamental modules, e.g., collision avoidance, terrain traversability analysis, and waypoint following. At the high level of [...]
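To illustrate how such fundamental modules typically fit together, here is a schematic of a single planning cycle in which state estimation feeds terrain analysis, collision checking, and waypoint following. The module names follow the abstract, but the logic is a toy stand-in, not the speaker's actual system.

```python
import math
from dataclasses import dataclass

@dataclass
class State:
    x: float
    y: float
    yaw: float

def estimate_state(prev: State, odom: tuple) -> State:
    """Stand-in for lidar-based state estimation: integrate an odometry delta."""
    dx, dy, dyaw = odom
    return State(prev.x + dx, prev.y + dy, prev.yaw + dyaw)

def traversable(local_slope_deg: float, max_slope_deg: float = 25.0) -> bool:
    """Stand-in terrain traversability check based on local slope."""
    return local_slope_deg <= max_slope_deg

def collision_free(heading: float, obstacles_robot_frame: list,
                   avoid_radius: float = 1.0) -> bool:
    """Stand-in collision check: reject headings pointing at nearby obstacles."""
    for ox, oy in obstacles_robot_frame:
        if (math.hypot(ox, oy) < avoid_radius
                and abs(math.atan2(oy, ox) - heading) < 0.3):
            return False
    return True

def follow_waypoint(state: State, waypoint: tuple) -> float:
    """Stand-in waypoint follower: desired heading toward the waypoint."""
    return math.atan2(waypoint[1] - state.y, waypoint[0] - state.x)

# One planning cycle.
state = estimate_state(State(0, 0, 0), odom=(0.1, 0.0, 0.0))
heading = follow_waypoint(state, waypoint=(5.0, 2.0))
if traversable(local_slope_deg=12.0) and collision_free(heading, [(2.0, 2.0)]):
    print(f"command heading {heading:.2f} rad")
else:
    print("replan: candidate heading rejected")
```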
3D Understanding for Zero-Shot Task-Oriented Grasping via Grounding Symbolic Representations
Abstract: Task-oriented grasping requires robots to reason not only about object geometry, but also about the function and semantics of object parts in context. While large language models (LLMs) offer powerful commonsense knowledge, they lack grounding in physical geometry. This talk explores how symbolic object representations can bridge that gap, enabling LLMs to guide grasp [...]
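To make the idea of grounding symbolic part representations concrete, here is a minimal sketch in which an object's parts are serialized symbolically and a language model is asked which part to grasp for a given task. The part schema, the prompt format, and the rule-based stand-in for the LLM call are assumptions for illustration, not the pipeline presented in the talk.

```python
import json

# Illustrative sketch: symbolic part representations as the bridge between
# LLM commonsense and object geometry for task-oriented grasp selection.
object_parts = [
    {"name": "handle", "center": [0.00, 0.05, 0.12], "affordance": "grasp"},
    {"name": "blade",  "center": [0.00, 0.05, 0.30], "affordance": "cut"},
]

def build_prompt(task: str, parts: list) -> str:
    """Serialize the symbolic object representation into a query for an LLM."""
    return (f"Task: {task}\n"
            f"Object parts: {json.dumps(parts)}\n"
            "Which part should the robot grasp? Answer with the part name.")

def llm_choose_part(prompt: str) -> str:
    """Stand-in for an LLM call: pick the part whose affordance is 'grasp'."""
    parts = json.loads(prompt.split("Object parts: ")[1].split("\n")[0])
    for part in parts:
        if part["affordance"] == "grasp":
            return part["name"]
    return parts[0]["name"]

prompt = build_prompt("hand the knife to a person", object_parts)
chosen = llm_choose_part(prompt)
# Ground the chosen symbol back to geometry (here, the part's center point).
grasp_center = next(p["center"] for p in object_parts if p["name"] == chosen)
print(f"grasp part '{chosen}' at {grasp_center}")
```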