Teaching a Robot to Perform Surgery: From 3D Image Understanding to Deformable Manipulation
Abstract: Robot manipulation of rigid household objects and environments has made massive strides in the past few years due to achievements in the computer vision and reinforcement learning communities. One area that has progressed more slowly is the manipulation of deformable objects. For example, surgical robots are used today via teleoperation from a [...]
Zeros for Data Science
Abstract: The world around us is neither totally regular nor completely random. Our reliance, and that of robots, on spatiotemporal patterns in daily life cannot be overstated, given that most of us can function (perceive, recognize, navigate) effectively in chaotic and previously unseen physical, social and digital worlds. Data science has been promoted and practiced [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Emotion perception: progress, challenges, and use cases
Abstract: One of the challenges Human-Centric AI systems face is understanding human behavior and emotions in light of the context in which they take place. For example, current computer vision approaches for recognizing human emotions usually focus on facial movements and often ignore the context in which those movements take place. In this presentation, I will [...]
[MSR Thesis Talk] SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM
Abstract: Dense simultaneous localization and mapping (SLAM) is crucial for numerous robotic and augmented reality applications. However, current methods are often hampered by the non-volumetric or implicit way they represent a scene. This talk introduces SplaTAM, an approach that leverages explicit volumetric representations, i.e., 3D Gaussians, to enable high-fidelity reconstruction from a single unposed RGB-D [...]
Language: You’ve probably heard of it, read it, written it, gestured it, mimed it… Why can’t robots?
Abstract: Language is how meaning is conveyed between humans, and now the basis of foundation models. By implication, it's the most important modality for all of AGI and will replace the entire robotics control stack as the most important thing for all of us to work on.
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Foundation Models for Robotic Manipulation: Opportunities and Challenges
Abstract: Foundation models, such as GPT-4 Vision, have marked significant achievements in the fields of natural language and vision, demonstrating exceptional abilities to adapt to new tasks and scenarios. However, physical interaction—such as cooking, cleaning, or caregiving—remains a frontier where foundation models and robotic systems have yet to achieve the desired level of adaptability and [...]
Learning with Less
Abstract: The performance of an AI is nearly always associated with the amount of data you have at your disposal. Self-supervised machine learning can help – mitigating tedious human supervision – but the need for massive training datasets in modern AI seems unquenchable. Sometimes it is not the amount of data, but the mismatch of [...]