“What happens if . . . ” Learning to Predict the Effect of Forces in Images

Roozbeh Mottaghi, Mohammad Rastegari, Abhinav Gupta, and Ali Farhadi

Conference Paper, Proceedings of (ECCV) European Conference on Computer Vision, pp. 269 - 285, October, 2016

Abstract

What happens if one pushes a cup sitting on a table toward the edge of the table? How about pushing a desk against a wall? In this paper, we study the problem of understanding the movements of objects as a result of applying external forces to them. For a given force vector applied to a specific location in an image, our goal is to predict long-term sequential movements caused by that force. Doing so entails reasoning about scene geometry, objects, their attributes, and the physical rules that govern the movements of objects. We design a deep neural network model that learns long-term sequential dependencies of object movements while taking into account the geometry and appearance of the scene by combining Convolutional and Recurrent Neural Networks. Training our model requires a large-scale dataset of object movements caused by external forces. To build a dataset of forces in scenes, we reconstructed all images in SUN RGB-D dataset in a physics simulator to estimate the physical movements of objects caused by external forces applied to them. Our Forces in Scenes (ForScene) dataset contains 10,335 images in which a variety of external forces are applied to different types of objects resulting in more than 65,000 object movements represented in 3D. Our experimental evaluations show that the challenging task of predicting long-term movements of objects as their reaction to external forces is possible from a single image.

BibTeX

@conference{Mottaghi-2016-113336,
author = {Roozbeh Mottaghi and Mohammad Rastegari and Abhinav Gupta and Ali Farhadi},
title = {“What happens if . . . ” Learning to Predict the Effect of Forces in Images},
booktitle = {Proceedings of (ECCV) European Conference on Computer Vision},
year = {2016},
month = {October},
pages = {269 - 285},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.