Multimodal Interfaces for Multimedia Information Agents
Abstract
When humans communicate, they take advantage of a rich spectrum of cues. Some are verbal and acoustic; others are non-verbal and non-acoustic. Signal processing technology has devoted much attention to the recognition of speech as a single human communication signal, but most complementary communication cues remain unexplored and unused in human-computer interaction. In this paper we show that adding non-acoustic or non-verbal cues can significantly enhance the robustness, flexibility, naturalness, and performance of human-computer interaction. We demonstrate computer agents that use speech, gesture, handwriting, pointing, and spelling jointly for more robust, natural, and flexible human-computer interaction in the various tasks of an information worker: information creation, access, manipulation, and dissemination.
BibTeX
@conference{-1997-16442,
  author    = {Alex Waibel and Bernhard Suhm and Minh Tue Vo and Jie Yang},
  title     = {Multimodal Interfaces for Multimedia Information Agents},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97)},
  year      = {1997},
  month     = {April},
  volume    = {1},
  pages     = {167--170},
}