Multimodal Interfaces for Multimedia Information Agents

Robotics Institute, Carnegie Mellon University

Alex Waibel, Bernhard Suhm, Minh Tue Vo, and Jie Yang
Conference Paper, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), Vol. 1, pp. 167-170, April 1997

Abstract

When humans communicate, they take advantage of a rich spectrum of cues. Some are verbal and acoustic; others are non-verbal and non-acoustic. Signal processing technology has devoted much attention to the recognition of speech as a single human communication signal, but most other complementary communication cues remain unexplored and unused in human-computer interaction. In this paper we show that the addition of non-acoustic or non-verbal cues can significantly enhance the robustness, flexibility, naturalness, and performance of human-computer interaction. We demonstrate computer agents that use speech, gesture, handwriting, pointing, and spelling jointly for more robust, natural, and flexible human-computer interaction in the various tasks of an information worker: information creation, access, manipulation, and dissemination.
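The abstract's central idea, that complementary modalities can reinforce or correct one another, can be illustrated with a minimal late-fusion sketch. This is not the paper's actual method; it simply shows one way scored candidate interpretations from several input channels (here, hypothetical `speech` and `gesture` n-best lists) might be combined so an ambiguous utterance is resolved by a gesture.

```python
# Illustrative late-fusion sketch (not the method from the paper): each
# modality produces scored candidate interpretations, and the agent sums
# weighted scores so modalities can disambiguate one another.

def fuse(hypotheses_by_modality, weights):
    """Combine per-modality candidate scores into a joint best guess.

    hypotheses_by_modality: dict of modality name -> {candidate: score}
    weights: dict of modality name -> relative weight
    """
    combined = {}
    for modality, candidates in hypotheses_by_modality.items():
        w = weights.get(modality, 0.0)
        for candidate, score in candidates.items():
            combined[candidate] = combined.get(candidate, 0.0) + w * score
    # The highest combined score wins.
    return max(combined, key=combined.get)

# Hypothetical example: speech alone slightly prefers the wrong command,
# but a pointing gesture toward the trash icon resolves the ambiguity.
speech = {"delete file": 0.48, "repeat file": 0.52}
gesture = {"delete file": 0.90}
best = fuse({"speech": speech, "gesture": gesture},
            {"speech": 0.5, "gesture": 0.5})
print(best)  # -> delete file
```

In this toy run, "repeat file" wins on speech scores alone (0.52 vs. 0.48), but the gesture evidence tips the combined score to "delete file" (0.69 vs. 0.26), mirroring the robustness gain the abstract describes.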

BibTeX

@conference{waibel-1997-16442,
author = {Alex Waibel and Bernhard Suhm and Minh Tue Vo and Jie Yang},
title = {Multimodal Interfaces for Multimedia Information Agents},
booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97)},
year = {1997},
month = {April},
volume = {1},
pages = {167--170},
}