Improving visual noise insensitivity in small vocabulary audio-visual speech recognition applications

S. Lucey, S. Sridharan, and V. Chandran

Conference Paper, Proceedings of 6th International Symposium on Signal Processing and its Applications (ISSPA '01), Vol. 2, pp. 434-437, August 2001

Abstract

Visual noise insensitivity is important to audio-visual speech recognition (AVSR). Visual noise can take a number of forms, such as varying frame rate, occlusion, and variability in lighting or speaker. The use of a high-dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for adaptive fusion. Preliminary results are presented demonstrating that our confidence measure performs above the catastrophic fusion boundary irrespective of the type of visual noise presented to it. Our experiments were restricted to small-vocabulary applications.

BibTeX

@conference{Lucey-2001-121093,
author = {S. Lucey and S. Sridharan and V. Chandran},
title = {Improving visual noise insensitivity in small vocabulary audio-visual speech recognition applications},
booktitle = {Proceedings of 6th International Symposium on Signal Processing and its Applications (ISSPA '01)},
year = {2001},
month = {August},
volume = {2},
pages = {434--437},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.