Effective Structural Adaptation of LVCSR Systems to Unseen Domains Using Hierarchical Connectionist Acoustic Models
Abstract
We present an approach to efficiently and effectively downsize and adapt the structure of large vocabulary conversational speech recognition (LVCSR) systems to unseen domains, requiring only small amounts of transcribed adaptation data. Our approach aims at bringing todays mostly task dependent systems closer to the aspired goal of domain independence. To achieve this, we rely on the ACID/HNN framework [2, 3], a hierarchical connectionist modeling paradigm which allows to dynamically adapt a tree structured modeling hierarchy to differing specifity of phonetic context in new domains. Experimental validation of the proposed approach has been carried out by adapting size and structure of ACID/HNN based acoustic models trained on Switchboard to two quite different, unseen domains, Wall Street Journal and an English Spontaneous Scheduling Task. In both cases, our approach yields considerably downsized acoustic models with performance improvements of up to 18 % over the unadapted baseline models.
BibTeX
@conference{Fritsch-1998-14817,author = {Juergen Fritsch and Michael Finke and Alex Waibel},
title = {Effective Structural Adaptation of LVCSR Systems to Unseen Domains Using Hierarchical Connectionist Acoustic Models},
booktitle = {Proceedings of 5th International Conference on Spoken Language Processing (ICSLP '98)},
year = {1998},
month = {December},
}