Motion planning for manipulation under pose uncertainty using contacts
Abstract: Numerous manipulation tasks, such as plug insertion and pipe assembly, demand an extremely high level of precision in pose estimation. Even minor errors, on the order of 2 mm, can lead to task failure. While robots often rely on vision for object detection and localization, achieving consistent, high-precision localization using visual methods is not always [...]
Robust Off-road Wheel Odometry with Slip Estimation
Abstract: Wheel odometry is not often used in state estimation for off-road vehicles due to frequent wheel slippage, varying wheel radii, and the 3D motion of the vehicle not fitting with the 2D nature of integrated wheel odometry. This paper proposes a novel 3D preintegration of wheel encoder measurements on manifold. Our method additionally estimates [...]
Composable Optimization for Robotic Motion Planning and Control
Abstract: Contact interactions are pervasive in real-world robotics tasks like manipulation and walking. However, the non-smooth dynamics associated with impacts and friction remain challenging to model, and motion planning and control algorithms that can fluently and efficiently reason about contact remain elusive. In this talk, I will share recent work from my research group that takes an “optimization-first” [...]
Optimal Modular Robot Design for Mobile Manipulation in Agriculture
Abstract: Although agriculture is a highly mechanized industry, numerous sectors like horticulture and floriculture heavily depend on manual labor because they require safe handling of plants and produce that can only be left to humans. However, many research and commercial robots have succeeded in several challenging dexterous manipulation tasks like harvesting, pruning, and plant health [...]
Aligning Robot Task and Interaction Policies to Human Values
Abstract: The value alignment problem considers how robots can learn to behave in accordance with human values. Today, robot learning paradigms enable humans to provide data (e.g., preference labels or demonstrations), which the robot uses to update its behavior (e.g., reward model or policy) to be closer to the human’s values. However, the current paradigm [...]
Learned Imaging Systems
Abstract: Computational imaging systems are based on the joint design of optics and associated image reconstruction algorithms. Of particular interest in recent years has been the development of end-to-end learned “Deep Optics” systems that use differentiable optical simulation in combination with backpropagation to simultaneously learn optical design and deep network post-processing for applications such as hyperspectral [...]
Accelerating Robot Task Learning with Large Pretrained Models and Internet Data
Abstract: Large pre-trained models and internet data sources are key to general and efficient robot task learning. However, learning contact-rich behaviors, semantic task constraints, and robust task planning from internet data sources remains an open challenge. This proposal seeks to make progress towards a general robot task learning system leveraging pre-trained models and internet data. [...]
A Modularized Approach to Vision-based Tactile Sensor Design Using Physics-based Rendering
Abstract: Touch is an essential sensing modality for making autonomous robots more dexterous and allowing them to work collaboratively with humans. In particular, the advent of vision-based tactile sensors has resulted in efforts to design them for different robotic manipulation tasks. However, this design task remains a challenging problem. This is for two reasons: first, [...]
Towards Universal Place Recognition
Abstract: Place Recognition is essential for achieving robust robot localization. However, current state-of-the-art systems remain environment/domain-specific and fragile. By leveraging insights from vision foundation models, we present AnyLoc, a universal VPR solution that performs across diverse environments without retraining or fine-tuning, significantly outperforming supervised baselines. We further introduce MultiLoc, and enable [...]
Enhancing Model Performance and Interpretability with Causal Inference as a Feature Selection Algorithm
Abstract: Causal inference focuses on uncovering cause-effect relationships from data, diverging from conventional machine learning which primarily relies on correlation analysis. By identifying these causal relationships, causal inference improves feature selection for predictive models, leading to predictions that are more accurate, interpretable, and robust. This approach proves especially effective with interventional data, such as randomized [...]
ARPA-H and America’s Health: Pursuing High-Risk/High-Reward Research to Improve Health Outcomes for All
Dr. Andy Kilianski will provide an overview of ARPA-H, a new U.S. government funding agency pursuing R&D for health challenges. He will review the unique niche occupied by ARPA-H within the Department of Health and Human Services and how ARPA-H is already partnering with academia and industry to transform health outcomes across the country. Discussion [...]
GNSS-denied Ground Vehicle Localization for Off-road Environments with Bird’s-eye-view Synthesis
Abstract: Global localization is essential for the smooth navigation of autonomous vehicles. To obtain accurate vehicle states, on-board localization systems typically rely on Global Navigation Satellite System (GNSS) modules for consistent and reliable global positioning. However, GNSS signals can be obstructed by natural or artificial barriers, leading to temporary system failures and degraded state estimation. On the [...]
Scaling up Robot Skill Learning with Generative Simulation
Abstract: Generalist robots need to learn a wide variety of skills to perform diverse tasks across multiple environments. Current robot training pipelines rely on humans to either provide kinesthetic demonstrations or program simulation environments with manually-designed reward functions for reinforcement learning. Such human involvement is an important bottleneck towards scaling up robot learning across diverse [...]
Simulation as a Tool for Conspicuity Measurement
Abstract: The use of unmanned aerial vehicles (UAVs) for time-critical tasks is becoming increasingly popular. Operators are expected to use information from these swarms to make real-time and informed decisions. Consequently, detecting and recognizing targets from video is pivotal to the success of these systems. At greater altitudes or with more vehicles, this [...]
VP4D: View Planning for 3D and 4D Scene Understanding
Abstract: View planning plays a critical role by gathering views that optimize scene reconstruction. Such reconstruction has played an important part in virtual production and computer animation, where a 3D map of the film set and motion capture of actors lead to an immersive experience. Current methods use uncertainty estimation in neural rendering of view [...]
Unlocking Generalization for Robotics via Modularity and Scale
Abstract: How can we build generalist robot systems? Looking at fields such as vision and language, the common theme has been large scale end-to-end learning with massive, curated datasets. In robotics, on the other hand, scale alone may not be enough due to the significant multimodality of robotics tasks, lack of easily accessible data and [...]
Automating Annotation Pipelines by leveraging Multi-Modal Data
Abstract: The era of vision-language models (VLMs) trained on large web-scale datasets challenges conventional formulations of “open-world” perception. In this work, we revisit the task of few-shot object detection (FSOD) in the context of recent foundational VLMs. First, we point out that zero-shot VLMs such as GroundingDINO significantly outperform state-of-the-art few-shot detectors (48 vs. 33 AP) [...]
Leveraging Affordances for Accelerating Online RL
Abstract: The inability to explore environments efficiently makes online RL sample-inefficient. Most existing works tackle this problem in a setting devoid of prior information. However, additional affordances may often be cheaply available at the time of training. These affordances include small quantities of demo data, simulators that can reset to arbitrary states and domain specific [...]
Dynamic Route Guidance in Vehicle Networks by Simulating Future Traffic Patterns
Abstract: Roadway congestion leads to wasted time and money and environmental damage. One possible solution is adding more roadway capacity, but this can be impractical especially in urban environments and still may not make up for a poorly-calibrated traffic signal schedule. As such, it is becoming increasingly important to use existing road networks more efficiently. [...]
Safe, Robust and Adaptive Model Learning for Agile Robots: Autonomous Racing
Abstract: In recent years there has been rapid development in agile robots capable of operating at their limits in dynamic environments. Autonomous racing, and recent developments in it spurred by competitions such as the Indy Autonomous Challenge, A2RL, and F1Tenth, have shown how modern autonomous control algorithms are capable of operating racecars at [...]
Improving Lego Assembly with Vibro-Tactile Feedback
Abstract: Robotic manipulation is an important area of research to improve the level of efficiency and autonomy in manufacturing processes. Due to the high precision and repeatability of industrial robot arms, robotic manufacturing tasks are dominated by simple pick, place, and peg insertion actions performed in a highly structured environment. Lego blocks are an excellent [...]
Robots Crossing Boundaries
Abstract: Over the last 50 years, autonomous robots have made the leap from being novel research contributions in university labs to becoming the fundamental technology upon which companies are built. While they traditionally have belonged to the engineering and computer science disciplines, robots have now crossed into other areas of study and research - making impacts in oceanography, geology, archaeology, biomechanics and biology. [...]
DeltaWalker: A Soft, Linearly Actuated Delta Quadruped Robot
Abstract: Quadruped robots offer a versatile solution for navigating complex terrain, making them valuable for applications such as industrial automation or search and rescue. Although quadrupeds are more complex than bipeds, they are easier to balance and control and require fewer joints to actuate compared to hexapods. Traditional quadruped designs, however, often feature complex leg [...]
Propagative Distance Optimization for Constrained Inverse Kinematics
Abstract: This work investigates a constrained inverse kinematic (IK) problem that seeks a feasible configuration of an articulated robot under various constraints such as joint limits and obstacle collision avoidance. Due to the high-dimensionality and complex constraints, this problem is often solved numerically via iterative local optimization. Classic local optimization methods take joint angles as [...]
Advancing Legged Robot Agility: from Video Imitation to GPU Acceleration
Abstract: Achieving human and animal-level agility has been a long-standing goal in robotics research. Recent advancements in numerical optimization and machine learning have pushed legged systems to greater capabilities than ever before, enabling back flips, parkour, and manipulation of heavy objects. Despite these exciting developments, this thesis identifies two key limitations of current legged robot [...]
Model Predictive Control on Resource-Constrained Robots
Abstract: Model predictive control (MPC) is a powerful tool for controlling highly dynamic robotic systems subject to complex constraints. However, it is computationally expensive and often requires a large memory footprint. Larger robotic systems are capable of carrying and powering sophisticated computational hardware onboard. On the other hand, smaller robots typically have faster dynamics that [...]
Enhancing Bipedal Locomotion With Reaction Wheels
Abstract: Legged robot hardware has become more accessible in the last ten years. However, there is still a dearth of low-cost hardware platforms that are open-source and easy to build. With recent developments in accessible manufacturing methods, such as 3D printing, it has become possible to design and manufacture parts without relying on precision machining. [...]
Building Micron: The Next Handheld Manipulator for Microsurgery
Abstract: Robotic assistance is used today in a variety of surgeries as a means of precise, dexterous, and minimally-invasive manipulation. However, practical use in microsurgical environments such as vitreoretinal surgery remains a challenge for the most common mechanically-grounded robotic platforms. Microsurgery requires micron-level accuracy and the ability to manipulate with interaction forces in millinewtons. Vitreoretinal [...]
Towards Estimation, Modeling, and Control of Mixed Material Flows on Variable-Speed Conveyor Belt Systems with Applications in Recycling
Abstract: Whether it is in sorting defects from grain in an agricultural setting, ore from tailings in a mine, or letters in a postal system, the sorting of bulk material has long been a crucial aspect of human industry. Today, in the face of dwindling natural resource deposits and accelerating climate change, a particularly important [...]
Expressive Attentional Communication Learning using Graph Neural Networks
Abstract: Multi-agent reinforcement learning presents unique hurdles beyond those of single-agent reinforcement learning, such as non-stationarity, which make learning effective decentralized cooperative policies from an agent's local state extremely challenging. Effective communication to share information and coordinate is vital for agents to work together and solve cooperative tasks, as the ubiquitous evidence of communication in [...]
Estimating Object Importance and Modeling Driver’s Situational Awareness for Intelligent Driving
Abstract: The ability to identify important objects in a complex and dynamic driving environment can help assistive driving systems alert drivers. These assistance systems also require a model of the drivers' situational awareness (SA) (what aspects of the scene they are already aware of) to avoid unnecessary alerts. This thesis builds towards such intelligent driving [...]
Carnegie Mellon University
AI for Human Mobility
Abstract: This talk will describe a series of AI and robotics projects aimed at helping people independently move through cities and buildings. Projects include a deployed personalized transit information app, guide robots for people who are blind, and an integrated AI system that assists blind users with guidance and exploration. Specific findings will be presented [...]
Learning for Perception and Strategy: Adaptive Omnidirectional Stereo Vision and Tactical Reinforcement Learning
Abstract: Multi-view stereo omnidirectional distance estimation usually needs to build a cost volume with many hypothetical distance candidates. Building the cost volume is often computationally heavy given the limited resources of a mobile robot. We propose a new geometry-informed method for selecting distance candidates, which enables the use of a very small number [...]
Online-Adaptive Self-Supervised Learning with Visual Foundation Models for Autonomous Off-Road Driving
Abstract: Autonomous robot navigation in off-road environments currently presents a number of challenges. The lack of structure makes it difficult to handcraft geometry-based heuristics that are robust to the diverse set of scenarios the robot might encounter. Many of the learned methods that work well in urban scenarios require massive amounts of hand-labeled data, but [...]
Multimodal Representations for Adaptable Robot Policies in Human-Inhabited Spaces
Abstract: Human beings sense and express themselves through multiple modalities. To capture multimodal ways of human communication, I want to build adaptable robot policies that infer task pragmatics from video and language prompts, reason about sounds and other sensors, take actions, and learn mannerisms of interacting with people and objects. Existing solutions for robot policies [...]
Interleaving Discrete Search and Continuous Optimization for Kinodynamic Motion Planning
Abstract: Motion planning for dynamically complex robotic tasks requires explicit reasoning within constraints on velocity, acceleration, force/torque, and kinematics such as avoiding obstacles. To meet these constraints, planning algorithms must simultaneously make high-level discrete decisions and low-level continuous decisions. For example, pushing a heavy object involves making discrete decisions about contact locations and continuous decisions [...]
Goal-Expressive Movement for Social Navigation: Where and When to Behave Legibly
Abstract: Robots often need to communicate their navigation goals to assist observers in anticipating the robot's future actions. Enabling observers to infer where a robot is going from its movements is particularly important as robots begin to share workplaces, sidewalks, and social spaces with humans. We can use legible motion, or movements that use intentional [...]
Eye Gaze for Intelligent Driving
Abstract: Intelligent vehicles have been proposed as one path to increasing traffic safety and reducing on-road crashes. Driving “intelligence” today takes many forms, ranging from simple blind spot occupancy or forward collision warnings to distance-aware cruise and all the way to full driving autonomy in certain situations. Primarily, these methods are outward-facing and operate on [...]
AI-CARING
AI-CARING is an NSF-sponsored institute, led by Georgia Tech, whose mission is to investigate, develop and evaluate AI technologies to help older adults live independently. The Institute focuses on providing reminders to the older adults and alerting caregivers when necessary, assisting older adults with tasks such as meal preparation, motivating them to exercise, providing conversational [...]
Learning to Perceive and Predict Everyday Interactions
Abstract: This thesis aims to build computer systems to understand everyday hand-object interactions in the physical world – both perceiving ongoing interactions in 3D space and predicting possible interactions. This ability is crucial for applications such as virtual reality, robotic manipulations, and augmented reality. The problem is inherently ill-posed due to the challenges of one-to-many [...]
Sensorized Soft Material Systems with Integrated Electronics and Computing
Abstract: The integration of soft and multifunctional materials in emerging technologies is becoming more widespread due to their ability to enhance or improve functionality in ways not possible using typical rigid alternatives. This trend is evident in various fields. For example, wearable technologies are increasingly designed using soft materials to improve modulus compatibility with biological [...]
Deep Learning for Tactile Sensing: Development to Deployment
Abstract: The role of sensing is widely acknowledged for robots interacting with the physical environment. However, few contemporary sensors have gained widespread use among roboticists. This thesis proposes a framework for incorporating sensors into a robot learning paradigm, from development to deployment, through the lens of ReSkin -- a versatile and scalable magnetic tactile sensor. [...]
Learning and Translating Temporal Abstractions of Behaviour across Humans and Robots
Abstract: Humans are remarkably adept at learning to perform tasks by imitating other people demonstrating these tasks. Key to this is our ability to reason abstractly about the high-level strategy of the task at hand (such as the recipe of cooking a dish) and the behaviours needed to solve this task (such as the behaviour [...]
Towards Underwater 3D Visual Perception
Abstract: With modern robotic technologies, seafloor imagery has become more accessible to both researchers and the public. This thesis leverages deep learning and 3D vision techniques to deliver valuable information from seafloor image observations. Despite the widespread use of deep learning and 3D vision algorithms across various fields, underwater imaging presents unique challenges, such as [...]
Assistive value alignment using in-situ naturalistic human behaviors
Abstract: As collaborative robots are increasingly deployed in personal environments, such as the home, it is critical they take actions to complete tasks consistent with personal preferences. Determining personal preferences for completing household chores, however, is challenging. Many household chores, such as setting a table or loading a dishwasher, are sequential and open-vocabulary, creating a [...]
Ice Cream Social
Join RISO at the Ice Cream Social in the robolounge, 5-7 on Wednesday, September 4th. Free entry.
Sampling and Signal-Processing for High-Dimensional Visual Appearance in Computer Graphics and Vision
Abstract: Many problems in computer graphics and vision, such as acquiring images of a scene to enable synthesis of novel views from many directions for virtual reality, computing realistic images by integrating lighting from many different incident directions across a range of scene pixels and viewing angles, or acquiring and modeling the appearance of realistic materials [...]
Teaching Robots to Drive: Scalable Policy Improvement via Human Feedback
Abstract: A long-standing problem in autonomous driving is grappling with the long-tail of rare scenarios for which little or no data is available. Although learning-based methods scale with data, it is unclear that simply ramping up data collection will eventually make this problem go away. Approaches which rely on simulation or world modeling offer some [...]
Exploration for Continually Improving Robots
Abstract: Data-driven learning is a powerful paradigm for enabling robots to learn skills. Current prominent approaches involve collecting large datasets of robot behavior via teleoperation or simulation, to then train policies. For these policies to generalize to diverse tasks and scenes, there is a large burden placed on constructing a rich initial dataset, which is [...]
Unlocking Magic: Personalization of Diffusion Models for Novel Applications
Abstract: Since the recent advent of text-to-image diffusion models for high-quality realistic image generation, a plethora of creative applications have suddenly become within reach. I will present my work at Google where I have attempted to unlock magical applications by proposing simple techniques that act on these large text-to-image diffusion models. Particularly, a large class of [...]
Domesticating Soft Robotics Research and Development with Accessible Biomaterials
Abstract: Current trends in robotics design and engineering are typically focused on high value applications where high performance, precision, and robustness take precedence over cost, accessibility, and environmental impact. In this paradigm, the capability landscape of robotics is largely shaped by access to capital and the promise of economic return. This thesis explores an alternative [...]
Understanding and acting in the 4D world
Abstract: As humans, we are constantly interacting with and observing a three-dimensional dynamic world; where objects around us change state as they move or are moved, and we, ourselves, move for navigation and exploration. Such an interaction between a dynamic environment and a dynamic ego-agent is complex to model as an ego-agent's perception of the [...]
Using mechanical intelligence to create adaptable robots
Abstract: Currently deployed robots are primarily rigid machines that perform repetitive, controlled tasks in highly constrained or open environments such as factory floors, warehouses, or fields. There is an increasing demand for more adaptable, mobile, and flexible robots that can manipulate or move through unstructured and dynamic environments. My vision is to create robots that [...]
Instant Visual 3D Worlds Through Split-Lohmann Displays
Abstract: Split-Lohmann displays provide a novel approach to creating instant visual 3D worlds that support realistic eye accommodation. Unlike commercially available VR headsets that show content at a fixed depth, the proposed display can optically place each pixel region to a different depth, instantly creating eye-tracking-free 3D worlds without using time-multiplexing. This enables real-time streaming [...]
Remote Rendering and 3D Streaming for Resource-Constrained XR Devices
Abstract: An overview of the motivation and challenges for remote rendering and real-time 3D video streaming on XR headsets. Bio: Edward is a third year PhD student in the ECE department interested in computer systems for VR/AR devices. Homepage: https://users.ece.cmu.edu/~elu2/ Sponsored in part by: Meta Reality Labs Pittsburgh
Vectorizing Raster Signals for Spatial Intelligence
Abstract: This seminar will focus on how vectorized representations can be generated from raster signals to enhance spatial intelligence. I will discuss the core methodology behind this transformation, with a focus on applications in AR/VR and robotics. The seminar will also briefly cover follow-up work that explores rigging and re-animating objects from casual single videos [...]
Learning Universal Humanoid Control
Abstract: Since infancy, humans acquire motor skills, behavioral priors, and objectives by learning from their caregivers. Similarly, as we create humanoids in our own image, we aspire for them to learn from us and develop universal physical and cognitive capabilities that are comparable to, or even surpass, our own. In this thesis, we explore how [...]
Generative Robotics: Self-Supervised Learning for Human-Robot Collaborative Creation
Abstract: While Generative AI has shown breakthroughs in recent years in generating new digital contents such as images or 3D models from high-level goal inputs like text, Robotics technologies have not, instead focusing on low-level goal inputs. We propose Generative Robotics, as a new field of robotics which combines the high-level goal input abilities of [...]
3D Video Models through Point Tracking, Reconstructing and Forecasting
Abstract: 3D scene understanding from 2D video is essential for enabling advanced applications such as autonomous driving, robotics, virtual reality, and augmented reality. These fields rely on accurate 3D spatial awareness and dynamic interaction modeling to navigate complex environments, manipulate objects, and provide immersive experiences. Unlike 2D, 3D training data is much less abundant, which [...]
What Makes Learning to Control Easy or Hard?
Abstract: Designing autonomous systems that are simultaneously high-performing, adaptive, and provably safe remains an open problem. In this talk, we will argue that in order to meet this goal, new theoretical and algorithmic tools are needed that blend the stability, robustness, and safety guarantees of robust control with the flexibility, adaptability, and performance of machine [...]
Towards a Robot Generalist through In-Context Learning and Abstractions
Abstract: The goal of this thesis is to discover AI processes that enhance cross-domain and cross-task generalization in intelligent robot agents. Unlike the dominant approach in contemporary robot learning, which pursues generalization primarily through scaling laws (increasing data and model size), we focus on identifying the best abstractions and representations in both perception and policy [...]
Vision-based Human Motion Modeling and Analysis
Abstract: Modern computer vision has achieved remarkable success in tasks such as detecting, segmenting, and estimating the pose of humans in images and videos, reaching or even surpassing human-level performance. However, it still faces significant challenges in predicting and analyzing future human motion. This thesis explores how vision-based solutions can enhance the fidelity and accuracy [...]
Stochastic Graphics Primitives
Abstract: For decades computer graphics has successfully leveraged stochasticity to enable both expressive volumetric representations of participating media like clouds and efficient Monte Carlo rendering of large scale, complex scenes. In this talk, we’ll explore how these complementary forms of stochasticity (representational and algorithmic) may be applied more generally across computer graphics and vision. In [...]
Recent Progress in Graph-Search Methods for Multi-Robot-Arm Motion Planning
Abstract: An exciting frontier in robotic manipulation is the use of multiple arms at once. However, planning concurrent motions is a challenging task using current methods. A major obstacle is the high-dimensional state space of this planning problem, which renders many traditional motion planning algorithms impractical. This opens the door for alternatives to the common [...]
Physical Process-Informed Mapping for Robotic Exploration
Abstract: Mobile robots used for information gathering tasks rely on dense, predictive mapping of large-scale regions to determine where to take measurements. Current approaches to mapping commonly rely on Gaussian process regression to spatially correlate data, extrapolate from sparse samples, and estimate uncertainty. However, these approaches do not incorporate meaningful information about physical processes that [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Agenda was sent via a calendar invite.
Can Robots Based on Musculoskeletal Designs Better Interact With the World?
Abstract: Living robots represent a new frontier in engineering materials for robotic systems, incorporating biological living cells and synthetic materials into their design. These bio-hybrid robots are dynamic and intelligent, potentially harnessing living matter’s capabilities, such as growth, regeneration, morphing, biodegradation, and environmental adaptation. Such attributes position bio-hybrid devices as a transformative force in robotics [...]
Soft Wearable Haptic Devices for Ubiquitous Communication
Abstract: Haptic devices allow touch-based information transfer between humans and intelligent systems, enabling communication in a salient but private manner that frees other sensory channels. For such devices to become ubiquitous, their physical and computational aspects must be intuitive and unobtrusive. The amount of information that can be transmitted through touch is limited in large [...]
Reconstructing Everything
Abstract: The presentation will be about a long-running, perhaps quixotic effort to reconstruct all of the world's structures in 3D from Internet photos, why this is challenging, and why this effort might be useful in the era of generative AI. Bio: Noah Snavely is a Professor in the Computer Science Department at Cornell University [...]
Using Robotics, Imaging and AI to Tackle Apple Fruit Production: Crop Harvest and Fire Blight Disease, The Two Major Bottlenecks for U.S. Apple Producers
Abstract: Temperate tree fruit production is a significant agricultural sector in the United States, encompassing a variety of fruits like apples, pears, cherries, peaches and plums. The U.S. is the second-largest producer of apples in the world, after China. Annual U.S. production is 10-11 billion pounds of apples. However, apple production is complicated [...]
Moving Lights and Cameras for Better 3D Perception of Indoor Scenes
Abstract: Decades of research on computer vision have highlighted the importance of active sensing -- where an agent controls the parameters of the sensors to improve perception. Research on active perception in the context of robotic manipulation has demonstrated many novel and robust sensing strategies involving a multitude of sensors like RGB and RGBD cameras [...]
Building Generalist Robots with Agility via Learning and Control: Humanoids and Beyond
Abstract: Recent breathtaking advances in AI and robotics have brought us closer to building general-purpose robots in the real world, e.g., humanoids capable of performing a wide range of human tasks in complex environments. Two key challenges in realizing such general-purpose robots are: (1) achieving "breadth" in task/environment diversity, i.e., the generalist aspect, and (2) [...]
High-Fidelity Neural Radiance Fields
Abstract: I will present three recent projects that focus on high-fidelity neural radiance fields for walkable VR spaces: VR-NeRF (SIGGRAPH Asia 2023) is an end-to-end system for the high-fidelity capture, model reconstruction, and real-time rendering of walkable spaces in virtual reality using neural radiance fields. To this end, we designed and built a custom multi-camera rig to [...]
Building Scalable Visual Intelligence: From Representation to Understanding and Generation
Abstract: In this talk, we will dive into our recent work on vision-centric generative AI, focusing on how it helps with understanding and creating visual content like images and videos. We'll cover the latest advances, including multimodal large language models for visual understanding and diffusion transformers for visual generation. We'll explore how these two areas [...]
Learning to create 3D content
Abstract: With the popularity of Virtual Reality (VR), Augmented Reality (AR), and other 3D applications, developing methods that let everyday users capture and create their own 3D content has become increasingly essential. Current 3D creation pipelines often require either tedious manual effort or specialized setups with densely captured views. Additionally, many resulting 3D models are [...]
Trustworthy Learning using Uncertain Interpretation of Data
Abstract: Motivated by the potential of Artificial Intelligence (AI) in high-cost and safety-critical applications, and recently also by the increasing presence of AI in our everyday lives, Trustworthy AI has grown in prominence as a broad area of research encompassing topics such as interpretability, robustness, verifiable safety, fairness, privacy, accountability, and more. This has created [...]
Robots That Know When They Don’t Know
Abstract: Foundation models from machine learning have enabled rapid advances in perception, planning, and natural language understanding for robots. However, current systems lack any rigorous assurances when required to generalize to novel scenarios. For example, perception systems can fail to identify or localize unfamiliar objects, and large language model (LLM)-based planners can hallucinate outputs that [...]
Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis
Abstract: This talk will present our approach for reconstructing objects from sparse-view images captured in unconstrained environments. In the absence of ground-truth camera poses, we will demonstrate how to utilize estimates from off-the-shelf systems and address two key challenges: refining noisy camera poses in sparse views and effectively handling outlier poses. Bio: Qitao is a second-year [...]
EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras
Abstract: In augmented and virtual reality (AR/VR) experiences, a user’s arms and hands can provide a convenient and tactile surface for touch input. Prior work has shown on-body input to have significant speed, accuracy, and ergonomic benefits over in-air interfaces, which are common today. In this work, we demonstrate high accuracy, bare hands (i.e., no special [...]
Auptimize: Optimal Placement of Spatial Audio Cues for Extended Reality
Abstract: Spatial audio in Extended Reality (XR) provides users with better awareness of where virtual elements are placed, and efficiently guides them to events such as notifications, system alerts from different windows, or approaching avatars. Humans, however, are inaccurate in localizing sound cues, especially with multiple sources due to limitations in human auditory perception such as [...]
VoxDet: Voxel Learning for Novel Instance Detection
Abstract: Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneer 3D geometry-aware framework that fully utilizes the strong 3D voxel [...]
Voxel Learning for Novel Instance Detection
Abstract: Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneer 3D geometry-aware framework that fully utilizes the strong 3D voxel [...]
Sensorimotor-Aligned Design for Pareto-Efficient Haptic Immersion in Extended Reality
Abstract: A new category of computing devices is emerging: augmented and virtual reality headsets, collectively referred to as extended reality (XR). These devices can alter, augment, or even replace our reality. While these headsets have made impressive strides in audio-visual immersion over the past half-century, XR interactions remain almost completely absent of appropriately expressive tactile [...]
Evaluating and Improving Vision-Language Models Beyond Scaling Laws
Abstract: In this talk, we present our work on advancing Vision-Language Models (VLMs) beyond scaling laws through improved evaluation and (post-)training strategies. Our contributions include VQAScore, a state-of-the-art alignment metric for text-to-visual generation. We show how VQAScore improves visual generation under real-world user prompts in GenAI-Bench. Additionally, we explore training methods that leverage the language [...]
Whisker-Inspired Sensors for Unstructured Environments
Abstract: Robots lack the perception abilities of animals, which is one reason they cannot achieve complex control in outdoor unstructured environments with the same ease as animals. One cause of the perception gap is the constraints researchers place on the environments in which they test new sensors so algorithms can correctly interpret data from [...]
Strategy and Skill Learning for Physics-based Table Tennis Animation
Abstract: Recent advancements in physics-based character animation leverage deep learning to generate agile and natural motion, enabling characters to execute movements such as backflips, boxing, and tennis. However, reproducing the selection and use of diverse motor skills in dynamic environments to solve complex tasks, as humans do, still remains a challenge. We present a strategy [...]
Abstraction Barriers for Embodied Algorithms
Abstract: Designing robotic systems to reliably modify their environment typically requires expert engineers and several design iterations. This talk will cover abstraction barriers that can be used to make the process of building such systems easier and the results more predictable. By focusing on approximate mathematical representations that model the process dynamics, these representations can [...]
Getting Optimization layers to play well with Deep Networks: Numerical methods and Architectures
Abstract: Many real-world challenges, from robotic control to resource management, can be effectively formulated as optimization problems. Recent advancements have focused on incorporating these optimization problems as layers within deep learning pipelines, enabling the explicit inclusion of auxiliary constraints or cost functions, which is crucial for applications such as enforcing physical laws, ensuring safety constraints, [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Agenda was sent via a calendar invite.
RI Seminar with Axel Krieger
A retrospective, 40 Years of Field Robotics
Abstract: Chuck has been building and deploying robots in the field for the past 40 years. In this retrospective he will touch on the robots, people and experiences that have been part of the journey. From the early days in the 1980s with the Three Mile Island nuclear robots and the first outdoor autonomy robots [...]
RI Seminar with Jeffrey Ichnowski
RI Faculty Business Meeting
Meeting for RI Faculty. Agenda was sent via a calendar invite.
Robotics Institute Winter Party
All Robotics Institute Faculty, Staff, Students, and Visitors are invited to attend this event. Please join us for food, beverages, and casual conversation with colleagues. A calendar invite including details will be sent closer to the event.
Robotics Institute Picnic
Please mark your calendars and plan to join us for the 2025 Robotics Institute Picnic! More information and RSVP e-vite to follow as we get closer to the event.