From Saturation to Zero-Shot Visual Relationship Detection Using Local Context
Abstract
Visual relationship detection has been motivated by the "insufficiency of objects to describe rich visual knowledge". However, we find that training and testing on current popular datasets may not support such statements; most approaches can be outperformed by a naive image-agnostic baseline that fuses language and spatial features. We visualize the errors of numerous existing detectors and discover that most of them are caused by the coexistence and penalization of antagonistic predicates that could describe the same interaction. Such annotations hurt the dataset's causality, and models tend to overfit the dataset biases, so accuracy saturates at artificially low levels. We construct a simple architecture and explore the effect of using language on generalization. We then introduce adaptive local-context-aware classifiers that are built on-the-fly based on the objects' categories. To improve context awareness, we mine and learn predicate synonyms, i.e. different predicates that could equivalently hold, and apply a distillation-like loss that forces synonyms to have similar classifiers and scores. The latter also serves as a regularizer that mitigates the dominance of the most frequent classes, enabling zero-shot generalization. We evaluate predicate accuracy on existing and novel test scenarios and demonstrate state-of-the-art results over prior biased baselines.
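The abstract describes the method only at a high level; the following is a minimal illustrative sketch, not the authors' released code. It shows one plausible reading of the two ideas, under assumed names and shapes: a head (here called LocalContextPredicateHead) that generates a predicate classifier on-the-fly from the subject/object category embeddings, and a distillation-like consistency term (here synonym_consistency_loss) that pulls the scores of mined predicate synonyms together; the feature dimensions, hypernetwork design, and synonym pairs are all hypothetical.

# Minimal PyTorch sketch (assumptions: shapes, names, and the specific
# hypernetwork form are illustrative, not the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalContextPredicateHead(nn.Module):
    def __init__(self, num_obj_classes, num_predicates, embed_dim=64, feat_dim=256):
        super().__init__()
        self.obj_embed = nn.Embedding(num_obj_classes, embed_dim)
        # Maps the (subject, object) category pair to a per-sample
        # predicate classifier of shape [num_predicates, feat_dim].
        self.weight_gen = nn.Linear(2 * embed_dim, num_predicates * feat_dim)
        self.num_predicates = num_predicates
        self.feat_dim = feat_dim

    def forward(self, pair_feats, subj_labels, obj_labels):
        # pair_feats: [B, feat_dim] fused language/spatial (and visual) features.
        ctx = torch.cat([self.obj_embed(subj_labels), self.obj_embed(obj_labels)], dim=-1)
        W = self.weight_gen(ctx).view(-1, self.num_predicates, self.feat_dim)
        # Classifier built on-the-fly from the objects' categories ("local context").
        return torch.einsum('bd,bpd->bp', pair_feats, W)  # [B, num_predicates]

def synonym_consistency_loss(logits, synonym_pairs):
    # Distillation-like term: predicates mined as synonyms should score similarly.
    # The index pairs below are placeholders, not mined synonyms from the paper.
    log_p = F.log_softmax(logits, dim=-1)
    loss = 0.0
    for i, j in synonym_pairs:
        loss = loss + F.mse_loss(log_p[:, i], log_p[:, j])
    return loss / max(len(synonym_pairs), 1)

# Toy usage with random tensors.
head = LocalContextPredicateHead(num_obj_classes=150, num_predicates=50)
feats = torch.randn(8, 256)
subj, obj = torch.randint(0, 150, (8,)), torch.randint(0, 150, (8,))
logits = head(feats, subj, obj)
loss = F.cross_entropy(logits, torch.randint(0, 50, (8,))) \
       + 0.1 * synonym_consistency_loss(logits, [(3, 7), (12, 30)])

Because the consistency term spreads probability mass across synonym groups rather than letting a single frequent predicate absorb it, it also acts as the regularizer mentioned above, which is what enables the zero-shot behavior the abstract claims.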
BibTeX
@conference{Gkanatsios-2020-124913,
author = {Nikolaos Gkanatsios and Vassilis Pitsikalis and Petros Maragos},
title = {From Saturation to Zero-Shot Visual Relationship Detection Using Local Context},
booktitle = {Proceedings of the 31st British Machine Vision Virtual Conference (BMVC '20)},
year = {2020},
month = {September},
}