Partition Min-Hash for Partial Duplicate Image Discovery
Conference Paper, Proceedings of (ECCV) European Conference on Computer Vision, pp. 648 - 662, September, 2010
Abstract
In this paper, we propose Partition min-Hash (PmH), a novel hashing scheme for discovering partial duplicate images in a large database. Unlike the standard min-Hash algorithm, which assumes a bag-of-words image representation, our approach exploits the fact that duplicate regions among images are often localized. Through theoretical analysis, simulation, and empirical study, we show that PmH outperforms standard min-Hash in both precision and recall while being orders of magnitude faster. When combined with the state-of-the-art Geometric min-Hash algorithm, our approach speeds up hashing by a factor of 10 without losing precision or recall. Given a fixed time budget, our method achieves much higher recall than the state of the art.
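The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general idea of hashing localized regions instead of the whole bag of words. It rests on several assumptions that are not taken from the paper: hash functions are modeled as random permutation tables, the image is split into a non-overlapping square grid, and two images are treated as a candidate pair when any one partition sketch matches. All function names (make_hash_tables, min_hash, partition_min_hash, share_sketch) are hypothetical, and this is not the authors' implementation.

```python
import random


def make_hash_tables(num_hashes, vocab_size, seed=0):
    """Model each hash function as a random permutation of the visual vocabulary."""
    rng = random.Random(seed)
    tables = []
    for _ in range(num_hashes):
        perm = list(range(vocab_size))
        rng.shuffle(perm)
        tables.append(perm)
    return tables


def min_hash(words, tables):
    """Standard min-Hash sketch: one minimum per hash function over a word set."""
    return tuple(min(table[w] for w in words) for table in tables)


def partition_min_hash(features, tables, grid, width, height):
    """Compute one sketch per grid cell (assumed non-overlapping partitions).

    features is a list of (x, y, visual_word) tuples.
    Returns a dict mapping (cell_x, cell_y) to that cell's sketch.
    """
    cells = {}
    for x, y, word in features:
        cx = min(int(x * grid / width), grid - 1)
        cy = min(int(y * grid / height), grid - 1)
        cells.setdefault((cx, cy), set()).add(word)
    return {cell: min_hash(words, tables) for cell, words in cells.items()}


def share_sketch(sketches_a, sketches_b):
    """Candidate-pair test: does any partition of A collide with any partition of B?"""
    return bool(set(sketches_a.values()) & set(sketches_b.values()))


if __name__ == "__main__":
    tables = make_hash_tables(num_hashes=2, vocab_size=50, seed=1)
    # Both images contain the same localized region (words 1, 2, 3)
    # surrounded by unrelated clutter in other cells.
    image_a = [(10, 10, 1), (20, 15, 2), (30, 40, 3),
               (80, 80, 10), (60, 20, 11), (20, 70, 12)]
    image_b = [(60, 60, 1), (70, 65, 2), (90, 90, 3),
               (10, 10, 30), (80, 20, 31), (20, 80, 32)]
    pmh_a = partition_min_hash(image_a, tables, grid=2, width=100, height=100)
    pmh_b = partition_min_hash(image_b, tables, grid=2, width=100, height=100)
    print(share_sketch(pmh_a, pmh_b))
```

In this toy setup the shared region falls entirely inside one cell of each image, so the two cell sketches are identical regardless of the clutter elsewhere, whereas a whole-image sketch can be changed by the clutter. The element-wise minimum over all cell sketches also reproduces the whole-image sketch, which is why per-partition hashing need not evaluate the hash functions more often than standard min-Hash does.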
BibTeX
@conference{Lee-2010-10538,
author = {David Changsoo Lee and Qifa Ke and Michael Isard},
title = {Partition Min-Hash for Partial Duplicate Image Discovery},
booktitle = {Proceedings of (ECCV) European Conference on Computer Vision},
year = {2010},
month = {September},
pages = {648--662},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.