Computer Vision without Features - Robotics Institute Carnegie Mellon University

VASC Seminar

Simon Lucey, Senior Research Scientist, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
Tuesday, July 20
3:00 pm to 4:00 pm
Computer Vision without Features

Event Location: NSH 1305
Bio: Simon Lucey is currently a Senior Research Scientist at Australia’s Commonwealth Scientific and Industrial Research Organisation (CSIRO) and a current Future Fellowship recipient from the Australian Research Council. He is also an Assistant Research Professor (currently on leave) at the Robotics Institute at Carnegie Mellon University, where he has been a faculty member since late 2005. Dr. Lucey is also an Associate Professor (adjunct) at the University of Sydney and at the Queensland University of Technology (QUT). He received his Ph.D. in 2003 on the topic of audio-visual speaker and speech recognition from the Queensland University of Technology (QUT), Australia. He has over 50 publications in international conferences, journals and book chapters, and his research has been covered in the popular press, including news.com.au, iTWire and ACM. He recently co-chaired the Auditory-Visual Speech Processing (AVSP) Conference 2009, and has organized a number of workshops and symposia in the fields of vision and learning. He is currently the local organizing chair for the IEEE International Conference on Computer Vision (ICCV) 2013 in Sydney, Australia.

Abstract: Features are so central to nearly all aspects of modern computer vision that it is easy to take them for granted. Common feature representations in vision include filter responses (e.g., Gabor, edge, Haar, etc.) and histogram features (e.g., HOG, LBP, SIFT, etc.). In this talk we will look at the role of features in computer vision tasks, and ask whether, for some tasks, they are necessary at all.

Specifically, we will be looking at filter-derived features (i.e., banks of linear filters) employed as a preprocessing step before optimizing some learning goal in vision, such as classification or alignment. Often the choice of these filters involves both: (i) large computational and memory requirements due to the increased feature size, and (ii) heuristic assumptions about which filters work best for specific applications (e.g., Gabor filters, edge filters, Haar filters, etc.). A central concept of our work is that if our learning goal can be expressed as an L2 norm, and our feature extraction step is linear, then the sequential feature extraction and optimization steps can be subsumed within a single learning goal. This alternative view of linear feature extraction with respect to an L2 learning goal has a number of advantages.
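The identity underlying this claim can be sketched in a few lines of NumPy (an illustrative toy, not code from the talk; the matrix F and the dimensions are made up for the example): for any linear feature map x ↦ Fx, the L2 distance between the extracted features of two signals equals a quadratic form (a − b)ᵀ(FᵀF)(a − b) on the raw signals, so the feature-extraction step folds directly into the L2 objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: d-dimensional signals and a bank of k linear
# filters stacked as the rows of F (any linear feature map behaves
# the same way).
d, k = 8, 3
F = rng.standard_normal((k, d))          # feature extraction: x -> F @ x
a = rng.standard_normal(d)
b = rng.standard_normal(d)

# Sequential view: extract features first, then take the L2 distance.
seq = np.sum((F @ a - F @ b) ** 2)

# Subsumed view: fold F into the objective as the quadratic form F^T F,
# i.e., an L2 goal under a (possibly learned) Mahalanobis-style metric.
M = F.T @ F
sub = (a - b) @ M @ (a - b)

print(np.isclose(seq, sub))
```

The same algebra is what lets the final part of the talk recast filter selection as learning the metric M directly.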
– For classification using linear SVMs, much of the computational overhead now disappears. From a theoretical perspective, the feature extraction step can alternatively be viewed as manipulating the margin of the SVM.
– For alignment, we demonstrate that the well-known, L2-norm based Lucas & Kanade (LK) algorithm can take advantage of this insight, resulting in an extremely efficient and robust extension which we refer to as the “Fourier Lucas & Kanade” (FLK) algorithm.
– Finally, we demonstrate how the task of learning “the best filters/features” can be re-interpreted as a distance metric learning problem. This latter part of our work removes the heuristics and guesswork involved in selecting a particular class of filters for a specific learning goal.
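The Fourier trick behind FLK can be illustrated with a toy NumPy sketch (assuming 1-D signals and circular convolution, which are simplifications for the example, not the talk’s implementation): by Parseval’s theorem, the summed L2 distance between the responses of an entire filter bank collapses to a single diagonally weighted L2 distance in the frequency domain.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
a = rng.standard_normal(n)
b = rng.standard_normal(n)

# Hypothetical bank of spatial filters (same length as the signal,
# applied by circular convolution for simplicity).
filters = [rng.standard_normal(n) for _ in range(4)]

def circ_conv(h, x):
    """Circular convolution via the convolution theorem."""
    return np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)))

# Spatial-domain view: filter both signals with every filter in the
# bank, then sum the squared L2 distances of the responses.
spatial = sum(np.sum((circ_conv(h, a) - circ_conv(h, b)) ** 2)
              for h in filters)

# Fourier-domain view: by Parseval's theorem the same quantity is a
# single weighted L2 distance in frequency space, with the whole
# filter bank collapsed into one diagonal weighting S.
A, B = np.fft.fft(a), np.fft.fft(b)
S = sum(np.abs(np.fft.fft(h)) ** 2 for h in filters)
fourier = np.sum(S * np.abs(A - B) ** 2) / n   # 1/n from NumPy's FFT convention

print(np.isclose(spatial, fourier))
```

Because the bank reduces to the fixed diagonal weighting S, the filtered L2 objective costs no more per iteration than the unfiltered one, which is the source of FLK’s efficiency.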