Data-Driven Learning Towards Perceptual Organization
Abstract: Computer vision has advanced rapidly with deep learning, achieving above-human performance on some classification benchmarks. At the core of the state-of-the-art approaches to image classification, object detection, and semantic/instance segmentation is sliding-window classification, engineered for computational efficiency. Such piecemeal analysis of visual perception often has trouble getting details right and fails miserably [...]
Learning to Drive
Abstract: Why is our understanding of sensorimotor control behind our understanding of perception? I will talk about structural differences between perception and control, and how these differences can be mitigated to help advance sensorimotor control systems. Judicious use of simulation can play an important role and I will describe some simulation tools that we have [...]
Monocular Depth Reconstruction using Geometry and Deep Networks
In this thesis, we explore methods for building dense depth maps from monocular video. First, we introduce our multi-view stereo pipeline, which uses photometric bundle adjustment to obtain accurate depth in textured regions from small-motion video. Second, we improve depth estimation in low-texture regions by fusing deep convolutional network predictions. We categorize the [...]
Carnegie Mellon University
Liquid Metal-Microelectronics Integration for a Sensorized Soft Robot Skin
Abstract: Progress in the emerging field of soft robotics depends on the integration of sensors that are capable of sensing, power regulation, and signal processing. Commercially available microelectronics are well suited for these needs, as well as small enough to preserve the natural mechanics of a host system. Here, we present a method for integrating [...]
Carnegie Mellon University
Learning Depth from Monocular Videos using Direct Methods
The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community. Unsupervised learning strategies are particularly appealing because they can use much larger and more varied monocular video datasets during learning without the need for ground-truth depth or stereo. In previous works, separate pose and [...]
Carnegie Mellon University
Probabilistic Approaches for Pose Estimation
Abstract: Virtually all robotics and computer vision applications require some form of pose estimation, such as registration, structure from motion, or sensor calibration. This problem is challenging because it is highly nonlinear and nonconvex. A fundamental contribution of this thesis is the development of fast and accurate pose estimation by formulating in a parameter space [...]
Carnegie Mellon University
Learning-based Lane Following and Changing Behaviors for Autonomous Vehicle
This thesis explores learning-based methods for generating human-like lane following and lane changing behaviors in on-road autonomous driving. Our main contributions are: 1) deriving an efficient vision-based end-to-end learning system for on-road driving; 2) proposing a novel attention-based learning architecture with a sub-action space to obtain lane changing behavior using a deep reinforcement learning algorithm; [...]
Carnegie Mellon University
Real-to-Virtual Domain Unification for End-to-End Autonomous Driving
Abstract: In the spectrum of vision-based autonomous driving, vanilla end-to-end models are not interpretable and are suboptimal in performance, while mediated perception models require additional intermediate representations such as segmentation masks or detection bounding boxes, whose annotation can be prohibitively expensive as we move to a larger scale. More critically, all prior works fail to deal with the notorious [...]
Carnegie Mellon University
Reconstruction of dynamic vehicles from multiple unsynchronized cameras
Despite significant research in the area, reconstruction of multiple dynamic rigid objects (e.g., vehicles) observed from wide-baseline, uncalibrated, and unsynchronized cameras remains hard. On one hand, feature tracking works well within each view, but correspondences are hard to establish across multiple cameras with limited overlap in their fields of view or in the presence of occlusions. On the other [...]
Carnegie Mellon University
Algorithms for Timing and Sequencing Behaviors in Robotic Swarms
Abstract: Robotic swarms are multi-robot systems whose global behavior emerges from local interactions between individual robots and spatially proximal neighboring robots. Each robot can be programmed with several local control laws that can be activated depending on an operator's choice of global swarm behavior (e.g., flocking, aggregation, formation control, area coverage). In contrast to other [...]