MSR Speaking Qualifier

Qichen Fu
Robotics Institute, Carnegie Mellon University
MSR Thesis Talk: Qichen Fu

Date: Tuesday, July 19, 2022

Time: 9:00 AM – 10:00 AM ET

Location: Newell-Simon Hall (NSH) 3305

Title: Detect Active Object in a Sequential Voting Process

Abstract:

A key component of understanding hand-object interactions is the ability to identify the active object: the object being manipulated by the human hand. To localize the active object accurately, a method must reason over the information encoded by each image pixel, such as whether it belongs to the hand, the object, or the background. To leverage each pixel as evidence for the bounding box of the active object, we propose a pixel-wise voting function. The voting function takes an initial bounding box as input and produces an improved bounding box of the active object as output. It is designed so that each pixel inside the input bounding box votes for an improved bounding box, and the box with the majority vote is selected as the output. We call the collection of bounding boxes generated inside the voting function the Relational Box Field, because it characterizes a field of bounding boxes defined in relation to the current bounding box. Although the voting function improves the bounding box of the active object, one round of voting is typically not enough to localize it accurately. We therefore apply the voting function repeatedly to improve the location of the bounding box step by step. However, repeatedly applying a one-step predictor (i.e., auto-regressive processing with our voting function) is known to cause a data distribution shift, so we mitigate this issue using reinforcement learning (RL). We adopt standard RL to learn the voting function parameters and show that it provides a meaningful improvement over a standard supervised learning approach. We perform experiments on two large-scale datasets, 100DOH and MECCANO, improving AP50 performance by 8% and 30%, respectively, over the state of the art.
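
The voting loop described in the abstract can be illustrated with a small toy sketch. This is not the speaker's implementation: the names `vote_once`, `refine`, and `toy_vote_fn` are hypothetical, and `toy_vote_fn` is a hand-written stand-in for the learned per-pixel predictor (it simply makes most pixels vote for a box one step closer to a fixed target). The sketch only shows the control flow: each pixel votes for a box, the majority box wins, and the process repeats until the box stops changing. It does not model the RL training or the distribution shift mentioned above.

```python
from collections import Counter

import numpy as np


def vote_once(box, vote_fn):
    """One voting round: collect one vote per pixel and return the majority box."""
    votes = vote_fn(box)  # array of shape (num_pixels, 4)
    counts = Counter(map(tuple, votes.tolist()))
    return counts.most_common(1)[0][0]


def refine(box, vote_fn, max_rounds=20):
    """Apply the voting function repeatedly until the box no longer changes."""
    for _ in range(max_rounds):
        new_box = vote_once(box, vote_fn)
        if new_box == box:
            break
        box = new_box
    return box


def toy_vote_fn(box, target=(4, 4, 12, 12)):
    """Hypothetical stand-in for a learned per-pixel predictor (assumption).

    Roughly two thirds of the pixels inside `box` vote for a box moved one
    pixel toward `target` on each coordinate; the rest vote for `box` itself.
    """
    x0, y0, x1, y1 = box
    n = max((x1 - x0) * (y1 - y0), 1)  # number of pixels inside the box
    step = tuple(b + int(np.sign(t - b)) for b, t in zip(box, target))
    votes = np.tile(np.array(box), (n, 1))
    votes[: (2 * n) // 3 + 1] = step
    return votes


if __name__ == "__main__":
    print(refine((0, 0, 6, 6), toy_vote_fn))
```

In this toy setting a single round moves the box only slightly, which mirrors the abstract's point that one round of voting is usually not enough and that the function has to be applied sequentially.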

Committee:

Prof. Kris Kitani (advisor)

Prof. Abhinav Gupta

Yufei Ye

 

Zoom: https://cmu.zoom.us/j/96006292530?pwd=RTgwNGdOZys0M0VTUi8yelVqQnVHQT09

Meeting ID: 960 0629 2530

Passcode: 999054