
Scene-Space Encoding within the Functional Scene-Selective Network

Elissa Aminoff, Mariya Toneva, Abhinav Gupta, and Michael Tarr
Journal Article, Journal of Vision, Vol. 15, No. 12, September, 2015

Abstract

High-level visual neuroscience has often focused on how different visual categories are encoded in the brain. For example, we know how the brain responds when viewing scenes as compared to faces or other objects: three regions are consistently engaged, namely the parahippocampal/lingual region (PPA), the retrosplenial complex (RSC), and the occipital place area/transverse occipital sulcus (TOS). Here we explore the fine-grained responses of these three regions when viewing 100 different scenes. We asked: (1) Can neural signals differentiate the 100 exemplars? (2) Are the PPA, RSC, and TOS strongly activated by the same exemplars and, more generally, are the “scene-spaces” representing how scenes are encoded in these regions similar? In an fMRI study of 100 scenes, we found that the scenes eliciting the greatest BOLD signal were largely the same across the PPA, RSC, and TOS. Remarkably, the orderings of scenes from strongest to weakest were highly correlated across all three regions (r = .82) but only moderately correlated with non-scene-selective brain regions (r = .30). The high similarity across scene-selective regions suggests that a reliable and distinguishable feature space encodes visual scenes. To better understand this potential feature space, we compared the neural scene-space to scene-spaces defined either by several different computer vision models or by behavioral measures of scene similarity. Computer vision models that rely on more complex, mid- to high-level visual features best accounted for the pattern of BOLD signal in scene-selective regions and, interestingly, the better-performing models exceeded the performance of our behavioral measures. These results suggest a division of labor in which the representations within the PPA and TOS focus on visual statistical regularities within scenes, whereas the representations within the RSC focus on a more high-level representation of scene category. Moreover, the data suggest that the PPA mediates between the processing of the TOS and that of the RSC.
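
To make the analysis logic concrete, below is a minimal illustrative sketch (not the authors' code) of the two comparisons described in the abstract: correlating per-scene response orderings across scene-selective ROIs, and comparing a neural scene-space to a scene-space derived from a computer vision model via representational dissimilarity. All variable names, data shapes, and the random placeholder data are hypothetical stand-ins for the real per-scene ROI responses and model features.

# Illustrative sketch only; placeholder data stands in for real per-scene responses.
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n_scenes = 100

# Hypothetical mean BOLD response per scene in each scene-selective ROI
# (in practice, averaged over voxels and stimulus repetitions).
roi_responses = {
    "PPA": rng.normal(size=n_scenes),
    "RSC": rng.normal(size=n_scenes),
    "TOS": rng.normal(size=n_scenes),
}

# (1) How similar are the strongest-to-weakest scene orderings across ROIs?
for a, b in combinations(roi_responses, 2):
    rho, _ = spearmanr(roi_responses[a], roi_responses[b])
    print(f"ordering correlation {a} vs {b}: rho = {rho:.2f}")

# (2) Compare a neural scene-space to a model-defined scene-space using
#     representational dissimilarity matrices (RDMs) over the 100 scenes.
neural_patterns = rng.normal(size=(n_scenes, 200))  # scenes x voxels (hypothetical)
model_features = rng.normal(size=(n_scenes, 512))   # scenes x model features (hypothetical)

neural_rdm = pdist(neural_patterns, metric="correlation")
model_rdm = pdist(model_features, metric="correlation")
rho, _ = spearmanr(neural_rdm, model_rdm)
print(f"neural vs. model scene-space correlation: rho = {rho:.2f}")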

Notes
Meeting abstract presented at the 15th Annual Meeting of the Vision Sciences Society (VSS), 2015

BibTeX

@article{Aminoff-2015-121575,
  author = {Elissa Aminoff and Mariya Toneva and Abhinav Gupta and Michael Tarr},
  title = {Scene-Space Encoding within the Functional Scene-Selective Network},
  journal = {Journal of Vision},
  year = {2015},
  month = {September},
  volume = {15},
  number = {12},
}